ai

Run Evaluation

Run model and agent evaluations against test cases and rubrics.

Provider                            Cost    Latency   Reliability   Trust    Risk   Permissions
Anthropic (planned MCP, Verified)   $1.43   5s        96%           94/100   low    READ_DATA
{
  "capability": "run-evaluation",
  "example_agent_query": "Find providers for run-evaluation",
  "providers": [
    "anthropic"
  ]
}
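
The record above can be read by an agent to decide which providers to call. The following is a minimal sketch of that lookup, assuming the record is available as a JSON string; the helper name `providers_for` is hypothetical and not part of any published API.

```python
import json

# Capability record copied from the listing above.
RECORD = """
{
  "capability": "run-evaluation",
  "example_agent_query": "Find providers for run-evaluation",
  "providers": ["anthropic"]
}
"""


def providers_for(record_text: str, capability: str) -> list[str]:
    """Return the provider IDs if the record describes `capability`, else []."""
    record = json.loads(record_text)
    if record.get("capability") != capability:
        return []
    # Copy the list so callers cannot mutate the parsed record.
    return list(record.get("providers", []))


if __name__ == "__main__":
    print(providers_for(RECORD, "run-evaluation"))
```

Returning an empty list for a non-matching capability lets the caller treat "no providers" and "wrong record" the same way, which keeps the calling code simple.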