Evaluation Dashboards

evaluation_dashboards

Methods

Create Evaluation Dashboard ->
post/v5/evaluation-dashboards

Create a new evaluation dashboard for an evaluation or evaluation group

List Evaluation Dashboards -> CursorPage<>
get/v5/evaluation-dashboards

List dashboards filtered by evaluation_id, evaluation_group_id, tags, creators, or search

Query parameters
created_by_ids: Array<string>
Optional

Filter by creator user IDs

ending_before: string
Optional
evaluation_group_id: string
Optional
evaluation_id: string
Optional
include_archived: boolean
Optional
limit: number
Optional
(maximum: 10000, minimum: 1, default: 100)
search: string
Optional

Search in name and tags

sort_by: string
Optional
sort_order:
Optional
starting_after: string
Optional
tags: Array<string>
Optional

Filter by tags (case-insensitive)

Response fields
has_more: boolean

Whether there are more items left to be fetched.

items: Array<>
total: number

The total number of items that match the query. This is greater than or equal to the number of items returned.

limit: number
Optional
(default: 100)

The maximum number of items to return.

object: "list"
Optional
(default: "list")
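
The cursor fields above (limit, starting_after, has_more, items) suggest a paging loop along the following lines. This is a minimal sketch, not the SDK's own pager: fetch_page is a hypothetical callable standing in for the GET request to /v5/evaluation-dashboards, make_fake_fetch is an in-memory stand-in used only to exercise the loop, and passing the id of the last returned item as starting_after is an assumed convention that this page does not state.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Page = Dict[str, Any]


def iter_dashboards(
    fetch_page: Callable[[Dict[str, Any]], Page],
    limit: int = 100,
    **filters: Any,
) -> Iterator[Dict[str, Any]]:
    """Yield dashboards page by page until has_more is false."""
    # Bounds taken from the documented limit parameter.
    if not 1 <= limit <= 10000:
        raise ValueError("limit must be between 1 and 10000")
    cursor: Optional[str] = None
    while True:
        params: Dict[str, Any] = {"limit": limit, **filters}
        if cursor is not None:
            params["starting_after"] = cursor
        page = fetch_page(params)
        items: List[Dict[str, Any]] = page.get("items", [])
        yield from items
        if not page.get("has_more") or not items:
            break
        # Assumption: the cursor is the id of the last item on the page.
        cursor = items[-1]["id"]


def make_fake_fetch(total: int) -> Callable[[Dict[str, Any]], Page]:
    """Hypothetical in-memory stand-in for the list endpoint."""
    data = [{"id": f"dash_{i}"} for i in range(total)]

    def fetch(params: Dict[str, Any]) -> Page:
        start = 0
        after = params.get("starting_after")
        if after is not None:
            start = next(i for i, d in enumerate(data) if d["id"] == after) + 1
        chunk = data[start:start + params["limit"]]
        return {
            "items": chunk,
            "has_more": start + len(chunk) < len(data),
            "total": len(data),
            "limit": params["limit"],
            "object": "list",
        }

    return fetch


if __name__ == "__main__":
    for dashboard in iter_dashboards(make_fake_fetch(5), limit=2):
        print(dashboard["id"])
```

In real use, fetch_page would wrap whatever HTTP client or SDK call is available, and filters such as tags or evaluation_id would be passed as keyword arguments.
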
Get Evaluation Dashboard ->
get/v5/evaluation-dashboards/{dashboard_id}

Get a single evaluation dashboard by ID

Patch Evaluation Dashboard ->
patch/v5/evaluation-dashboards/{dashboard_id}

Partially update dashboard metadata (name, description, widget_order)

Delete Evaluation Dashboard ->
delete/v5/evaluation-dashboards/{dashboard_id}

Soft-delete an evaluation dashboard

Domain types

EvaluationDashboard = { id, account_id, created_at, 13 more... }

evaluation_dashboards.widgets

Methods

Add Widget To Dashboard ->
post/v5/evaluation-dashboards/{dashboard_id}/widgets

Create a new widget, add it to the dashboard, and compute its results

Update Dashboard Widget ->
patch/v5/evaluation-dashboards/{dashboard_id}/widgets/{widget_id}

Update a widget and compute its results. If the widget is only used by this dashboard, it is updated in place. If shared across multiple dashboards, a copy is created.

Remove Widget From Dashboard ->
delete/v5/evaluation-dashboards/{dashboard_id}/widgets/{widget_id}

Remove a widget from the dashboard (does not delete the widget)
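
The update and remove semantics described above can be pictured with a small in-memory model. This is only an illustration of the described behavior, not the service's implementation: WidgetStore and all of its methods are hypothetical, and the model assumes that when a shared widget is updated, the new copy replaces the original on the dashboard being edited.

```python
import itertools
from typing import Any, Dict, Set


class WidgetStore:
    """Hypothetical in-memory model of widgets shared between dashboards."""

    def __init__(self) -> None:
        self.widgets: Dict[str, Dict[str, Any]] = {}
        self.dashboards: Dict[str, Set[str]] = {}
        self._counter = itertools.count(1)

    def _new_id(self) -> str:
        return f"widget_{next(self._counter)}"

    def add(self, dashboard_id: str, config: Dict[str, Any]) -> str:
        """Create a widget and attach it to a dashboard."""
        widget_id = self._new_id()
        self.widgets[widget_id] = dict(config)
        self.dashboards.setdefault(dashboard_id, set()).add(widget_id)
        return widget_id

    def attach(self, dashboard_id: str, widget_id: str) -> None:
        """Share an existing widget with another dashboard."""
        self.dashboards.setdefault(dashboard_id, set()).add(widget_id)

    def update(self, dashboard_id: str, widget_id: str,
               changes: Dict[str, Any]) -> str:
        """Update in place if unshared, otherwise update a copy."""
        if widget_id not in self.dashboards.get(dashboard_id, set()):
            raise KeyError(f"{widget_id} is not on {dashboard_id}")
        shared = any(
            widget_id in ids
            for other, ids in self.dashboards.items()
            if other != dashboard_id
        )
        if not shared:
            self.widgets[widget_id].update(changes)
            return widget_id
        # Shared widget: other dashboards keep the original untouched.
        new_id = self._new_id()
        self.widgets[new_id] = {**self.widgets[widget_id], **changes}
        # Assumption: the copy takes the original's place on this dashboard.
        self.dashboards[dashboard_id].discard(widget_id)
        self.dashboards[dashboard_id].add(new_id)
        return new_id

    def remove(self, dashboard_id: str, widget_id: str) -> None:
        """Detach the widget from the dashboard; the widget itself is kept."""
        self.dashboards[dashboard_id].discard(widget_id)
```

In this model the widget ID returned by update may differ from the one passed in, so a caller would keep the returned ID rather than assume the original is still attached.
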

Domain types

EvaluationDashboardWidget = { id, account_id, created_at, 6 more... }
EvaluationDashboardWidgetResult = { id, account_id, computation_status, 10 more... }
EvaluationDashboardWidgetResultResponse = { id, computation_status, widget_id, 3 more... }

Computed result for a widget - used in widget creation response

EvaluationDashboardWidgetWithResult = { id, account_id, created_at, 6 more... }

Response model for widget creation - includes widget and computed result

EvaluationWidgetTypeEnum = "bar" | "histogram" | "donut" | 6 more...

Widget types for dashboard visualizations

Filter = { conditions, logicalOperators }

Filter clause with conditions connected by logical operators.

Conditions are evaluated left to right with no operator precedence and no nesting or parentheses. For example, condition1 AND condition2 OR condition3 evaluates as ((condition1 AND condition2) OR condition3).

Example:
{
  "conditions": [
    {"column": "score", "operator": ">", "value": 0.5},
    {"column": "category", "operator": "=", "value": "test"}
  ],
  "logicalOperators": ["AND"]
}
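
The left-to-right rule can be made concrete with a small client-side evaluator. This is a sketch for reasoning about filters, not the server's implementation. The helper name matches is hypothetical; only the ">" and "=" operators appear on this page, so the other comparators are assumptions, as are the rules that an empty condition list matches everything and that logicalOperators has exactly one entry fewer than conditions.

```python
import operator
from typing import Any, Callable, Dict

# ">" and "=" come from the documented example; the rest are assumed.
COMPARATORS: Dict[str, Callable[[Any, Any], bool]] = {
    ">": operator.gt,
    "=": operator.eq,
    "<": operator.lt,   # assumed
    ">=": operator.ge,  # assumed
    "<=": operator.le,  # assumed
    "!=": operator.ne,  # assumed
}


def matches(row: Dict[str, Any], flt: Dict[str, Any]) -> bool:
    """Evaluate a Filter against one row, strictly left to right."""
    conditions = flt["conditions"]
    connectors = flt.get("logicalOperators", [])
    if not conditions:
        return True  # assumption: an empty filter matches every row
    if len(connectors) != len(conditions) - 1:
        raise ValueError("expected one logical operator between each pair")

    def check(cond: Dict[str, Any]) -> bool:
        compare = COMPARATORS[cond["operator"]]
        return compare(row[cond["column"]], cond["value"])

    result = check(conditions[0])
    # Fold each connector into the running result, so AND does not
    # bind tighter than OR.
    for connector, cond in zip(connectors, conditions[1:]):
        current = check(cond)
        if connector == "AND":
            result = result and current
        elif connector == "OR":
            result = result or current
        else:
            raise ValueError(f"unknown logical operator: {connector}")
    return result


if __name__ == "__main__":
    flt = {
        "conditions": [
            {"column": "a", "operator": "=", "value": 1},
            {"column": "b", "operator": "=", "value": 1},
            {"column": "c", "operator": "=", "value": 1},
        ],
        "logicalOperators": ["OR", "AND"],
    }
    # ((a OR b) AND c) is False here; SQL-style precedence would give True.
    print(matches({"a": 1, "b": 0, "c": 0}, flt))
```

The final call shows where this rule differs from SQL: with a true, b false, and c false, left-to-right evaluation yields false, whereas a OR (b AND c) would yield true.
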

MetricQuery = { select, evaluation_ids, filter }

Query that returns a single metric value (used for metric widgets).

Used for widget type: metric. Enforces exactly 1 aggregation in select. Returns: {"type": "metric", "data": ...}

Example SQL equivalent: SELECT AVG(score) as average_score FROM evaluation_items

SelectItem = { expression, alias }

Column in SELECT clause

SeriesQuery = { select, evaluation_ids, filter, 3 more... }

Query that returns a series of records (used for table/bar/histogram/donut/scatter widgets).

Used for widget types: table, bar, histogram, donut, scatter. Returns: {"type": "series", "data": [...]}

Example SQL equivalent:
SELECT category, AVG(score) as avg_score, COUNT(*) as count
FROM evaluation_items
WHERE score > 0.5 AND category = 'test'
GROUP BY category
ORDER BY avg_score DESC
LIMIT 100
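
To see what records the SQL equivalent above produces, it can be run against a throwaway SQLite table. This is a sketch for building intuition only: the two-column evaluation_items table, the sample rows, and the run_example helper are made up for illustration, and the list of dictionaries it returns is not claimed to match the exact shape of the API's "data" payload.

```python
import sqlite3
from typing import Any, Dict, List, Tuple

# The SQL equivalent from the SeriesQuery description, unchanged in meaning.
SQL = """
SELECT category, AVG(score) AS avg_score, COUNT(*) AS count
FROM evaluation_items
WHERE score > 0.5 AND category = 'test'
GROUP BY category
ORDER BY avg_score DESC
LIMIT 100
"""


def run_example(rows: List[Tuple[str, float]]) -> List[Dict[str, Any]]:
    """Load (category, score) rows into memory and run the example query."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # lets each result row convert to a dict
    try:
        # Illustrative schema; the real evaluation_items columns are not
        # described on this page.
        conn.execute(
            "CREATE TABLE evaluation_items (category TEXT, score REAL)"
        )
        conn.executemany("INSERT INTO evaluation_items VALUES (?, ?)", rows)
        return [dict(row) for row in conn.execute(SQL)]
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [("test", 0.75), ("test", 0.25), ("test", 1.0), ("prod", 0.9)]
    print(run_example(sample))
    # → [{'category': 'test', 'avg_score': 0.875, 'count': 2}]
```

Only the two "test" rows with score above 0.5 survive the WHERE clause, so the single output record averages 0.75 and 1.0.
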