Evaluations
EvaluationsResource
Methods
Create Evaluation
List Evaluations
Get Evaluation
The identity that created the entity.
The tags associated with the entity.
Number of task errors across all items in this evaluation.
Metadata key-value pairs for the evaluation.
Progress of the evaluation's underlying async job.
Reason for the evaluation's current status.
Tasks executed during evaluation. Populated only when the optional task view is requested.
Archive Evaluation
Update or Restore Evaluation
Get schema information for evaluation item data, including field names, types, and occurrence counts.
Include archived items in schema analysis.
The ID of the evaluation.
List of all discovered fields, ordered alphabetically by field_name.
Total number of evaluation items.
Whether the schema was computed from a sample of items (for large evaluations).
Number of items sampled for schema inference, if applicable.
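The schema response described above can be pictured with a small local sketch. This is not the service's implementation: the function name `infer_schema`, the `sample_size` parameter, and the output keys other than `field_name` are assumptions chosen for illustration. It counts how often each field occurs, records the Python type names seen, and orders fields alphabetically by `field_name`.

```python
from collections import Counter, defaultdict
from typing import Any, Dict, List, Optional


def infer_schema(
    items: List[Dict[str, Any]], sample_size: Optional[int] = None
) -> Dict[str, Any]:
    """Summarize field names, observed types, and occurrence counts.

    Hypothetical helper: the real endpoint's sampling rules are not
    documented here, so this simply inspects the first `sample_size` items.
    """
    is_sampled = sample_size is not None and len(items) > sample_size
    inspected = items[:sample_size] if is_sampled else items

    counts = Counter()          # field name -> number of items containing it
    types = defaultdict(set)    # field name -> set of observed type names
    for item in inspected:
        for name, value in item.items():
            counts[name] += 1
            types[name].add(type(value).__name__)

    # Fields are ordered alphabetically by field_name, as the reference states.
    fields = [
        {
            "field_name": name,
            "types": sorted(types[name]),
            "occurrence_count": counts[name],
        }
        for name in sorted(counts)
    ]
    return {
        "fields": fields,
        "total_items": len(items),
        "is_sampled": is_sampled,
        "sample_size": len(inspected) if is_sampled else None,
    }
```

Calling `infer_schema(items, sample_size=1000)` on a large list would mark the result as sampled and report how many items were inspected, mirroring the two sampling fields listed above.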
Filter evaluations using metadata and other criteria. Supports up to 10 filters with AND logic.
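To make the "up to 10 filters with AND logic" rule concrete, here is a minimal client-side sketch. The names `MAX_FILTERS`, `metadata_equals`, and `apply_filters` are hypothetical and do not come from the SDK; the real filter syntax is defined by the API, not by this example. Only the limit of 10 and the AND combination are taken from the description.

```python
from typing import Any, Callable, Dict, List

MAX_FILTERS = 10  # limit stated in the filter description

# A filter is modeled here as a predicate over one evaluation record.
Filter = Callable[[Dict[str, Any]], bool]


def metadata_equals(key: str, expected: Any) -> Filter:
    """Build a predicate that matches when metadata[key] == expected."""
    def predicate(evaluation: Dict[str, Any]) -> bool:
        return evaluation.get("metadata", {}).get(key) == expected
    return predicate


def apply_filters(
    evaluations: List[Dict[str, Any]], filters: List[Filter]
) -> List[Dict[str, Any]]:
    """Keep evaluations that satisfy every filter (AND logic)."""
    if len(filters) > MAX_FILTERS:
        raise ValueError(
            f"at most {MAX_FILTERS} filters are supported, got {len(filters)}"
        )
    return [e for e in evaluations if all(f(e) for f in filters)]
```

With AND logic, adding a filter can only narrow the result set; an empty filter list returns every evaluation unchanged.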
Get taxonomy JSON for contributor evaluation question tasks.
Domain types
Schema information for an evaluation's item data structure
Tasks
EvaluationsResource.TasksResource
Methods
Add a new test criterion (LLM judge, contributor question, etc.) to an existing evaluation. Gated: the request is rejected if any contributor annotation task has been claimed or completed. Kicks off the evaluation workflow so the new task runs against existing items.
Replace a single test criterion's configuration, identified by its alias. Gated: the request is rejected if any contributor annotation task for the evaluation has been claimed or completed.
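The gating rule shared by both task methods can be sketched as a guard that runs before any change is applied. Everything below is an illustrative model, not the service's code: the `Evaluation` dataclass, `TaskEditRejected`, and the status strings other than "claimed" and "completed" are assumptions, and the step that kicks off the evaluation workflow is omitted.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Statuses that block edits, per the descriptions above.
BLOCKING_STATUSES = {"claimed", "completed"}


class TaskEditRejected(Exception):
    """Raised when contributor annotation work has already started."""


@dataclass
class Evaluation:
    # alias -> task configuration (hypothetical in-memory model)
    tasks: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    # one status string per contributor annotation task
    annotation_statuses: List[str] = field(default_factory=list)


def ensure_editable(evaluation: Evaluation) -> None:
    """Reject edits once any annotation task is claimed or completed."""
    if any(s in BLOCKING_STATUSES for s in evaluation.annotation_statuses):
        raise TaskEditRejected(
            "a contributor annotation task has been claimed or completed"
        )


def add_task(evaluation: Evaluation, alias: str, config: Dict[str, Any]) -> None:
    """Add a new test criterion under a fresh alias."""
    ensure_editable(evaluation)
    if alias in evaluation.tasks:
        raise ValueError(f"alias already exists: {alias}")
    evaluation.tasks[alias] = config


def replace_task(evaluation: Evaluation, alias: str, config: Dict[str, Any]) -> None:
    """Replace the configuration of the criterion identified by alias."""
    ensure_editable(evaluation)
    if alias not in evaluation.tasks:
        raise KeyError(alias)
    evaluation.tasks[alias] = config
```

Running the guard first means a rejected request leaves the existing task list untouched, which matches the "rejected" wording in both descriptions.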