Evaluation Dashboards
evaluation_dashboards
Methods
Create a new evaluation dashboard for an evaluation or evaluation group
List dashboards filtered by evaluation_id, evaluation_group_id, tags, creators, or search
Filter by creator user IDs
Search in name and tags
Filter by tags (case-insensitive)
Whether there are more items left to be fetched.
The total number of items that match the query. This is greater than or equal to the number of items returned.
The maximum number of items to return.
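The three pagination fields above suggest a fetch loop that continues until the server reports nothing left. The following is a minimal sketch, not the SDK's API: the field names `items`, `has_more`, `total`, and `limit`, the offset-based paging, and the `fetch_page` callable are all assumptions made for illustration, and the endpoint is replaced by an in-memory stand-in.

```python
from typing import Any, Callable, Dict, Iterator

Page = Dict[str, Any]


def iter_all_items(fetch_page: Callable[[int, int], Page], limit: int = 2) -> Iterator[Any]:
    """Yield every item by requesting pages until has_more is False.

    fetch_page(limit, offset) is a hypothetical callable standing in for
    the list endpoint; offset-based paging is an assumption.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items = page["items"]
        yield from items
        offset += len(items)
        # Stop when nothing is left, or on an empty page to avoid looping forever.
        if not page["has_more"] or not items:
            break


# In-memory stand-in for the list endpoint (hypothetical data).
_DASHBOARDS = [{"id": f"dash_{i}"} for i in range(5)]


def fake_fetch_page(limit: int, offset: int) -> Page:
    chunk = _DASHBOARDS[offset:offset + limit]
    return {
        "items": chunk,
        "has_more": offset + len(chunk) < len(_DASHBOARDS),
        "total": len(_DASHBOARDS),  # always >= len(chunk)
        "limit": limit,
    }


all_ids = [d["id"] for d in iter_all_items(fake_fetch_page, limit=2)]
print(all_ids)
```

The loop relies on the "has more" flag rather than comparing counts against the total, since the total can only be trusted to be greater than or equal to the number of items returned.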
Get a single evaluation dashboard by ID
Partially update dashboard metadata (name, description, widget_order)
Soft delete an evaluation dashboard
Domain types
Widgets
evaluation_dashboards.widgets
Methods
Create a new widget, add it to the dashboard, and compute its results
Update a widget and compute its results. If the widget is only used by this dashboard, it is updated in place. If shared across multiple dashboards, a copy is created.
Remove a widget from the dashboard (does not delete the widget)
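The update and remove semantics described above can be modeled in memory. This is a minimal sketch under stated assumptions, not the service's implementation: `Widget`, `Store`, `update_widget`, and `remove_widget` are hypothetical names, and the choice to repoint the dashboard at the new copy is an assumption, since the source only says that a copy is created.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Widget:
    id: str
    title: str


@dataclass
class Store:
    widgets: Dict[str, Widget] = field(default_factory=dict)
    # dashboard id -> ordered widget ids
    dashboards: Dict[str, List[str]] = field(default_factory=dict)


def update_widget(store: Store, dashboard_id: str, widget_id: str, title: str) -> Widget:
    """Update in place if only this dashboard uses the widget, else copy."""
    users = [d for d, ids in store.dashboards.items() if widget_id in ids]
    if users == [dashboard_id]:
        store.widgets[widget_id].title = title
        return store.widgets[widget_id]
    # Shared widget: leave the original untouched and create a copy.
    copy = Widget(id=f"{widget_id}-copy-{len(store.widgets)}", title=title)
    store.widgets[copy.id] = copy
    ids = store.dashboards[dashboard_id]
    ids[ids.index(widget_id)] = copy.id  # assumption: dashboard now uses the copy
    return copy


def remove_widget(store: Store, dashboard_id: str, widget_id: str) -> None:
    """Detach the widget from the dashboard without deleting the widget."""
    store.dashboards[dashboard_id].remove(widget_id)


store = Store(
    widgets={"w1": Widget("w1", "Accuracy"), "w2": Widget("w2", "Latency")},
    dashboards={"d1": ["w1", "w2"], "d2": ["w1"]},
)
shared = update_widget(store, "d1", "w1", "Accuracy (v2)")   # w1 is shared: copied
private = update_widget(store, "d1", "w2", "Latency (p95)")  # w2 only on d1: in place
remove_widget(store, "d1", "w2")                             # w2 still exists in the store
```

In this model, dashboard `d2` keeps seeing the original `w1`, which is the point of copying a shared widget instead of mutating it.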
Domain types
Computed result for a widget, used in the widget creation response
Response model for widget creation, which includes the widget and its computed result
Widget types for dashboard visualizations
Filter clause with conditions connected by logical operators.
Conditions are evaluated left to right with no operator precedence; nesting and parentheses are not supported. Example: condition1 AND condition2 OR condition3 evaluates as ((condition1 AND condition2) OR condition3).
Example: { "conditions": [ {"column": "score", "operator": ">", "value": 0.5}, {"column": "category", "operator": "=", "value": "test"} ], "logicalOperators": ["AND"] }
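The left-to-right rule can be sketched as a simple fold over the conditions. This is an illustrative evaluator, not the service's code: the function name `matches` is hypothetical, and only `>` and `=` appear in the source, so the other comparison operators in the table are assumptions.

```python
import operator
from typing import Any, Dict

# ">" and "=" come from the example above; the rest are assumed.
OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(row: Dict[str, Any], clause: Dict[str, Any]) -> bool:
    """Evaluate a filter clause against one row, strictly left to right."""
    conditions = clause["conditions"]
    logical = clause.get("logicalOperators", [])
    if len(logical) != max(len(conditions) - 1, 0):
        raise ValueError("need exactly one logical operator between each pair of conditions")
    if not conditions:
        return True

    def check(cond: Dict[str, Any]) -> bool:
        return OPS[cond["operator"]](row[cond["column"]], cond["value"])

    result = check(conditions[0])
    for op, cond in zip(logical, conditions[1:]):
        value = check(cond)
        if op == "AND":
            result = result and value
        elif op == "OR":
            result = result or value
        else:
            raise ValueError(f"unknown logical operator: {op}")
    return result


clause = {
    "conditions": [
        {"column": "score", "operator": ">", "value": 0.5},
        {"column": "category", "operator": "=", "value": "test"},
    ],
    "logicalOperators": ["AND"],
}
print(matches({"score": 0.9, "category": "test"}, clause))  # True

# A OR B AND C is read as ((A OR B) AND C), not A OR (B AND C).
no_precedence = {
    "conditions": [
        {"column": "score", "operator": ">", "value": 0.5},       # True
        {"column": "category", "operator": "=", "value": "test"},  # False
        {"column": "score", "operator": ">", "value": 0.95},      # False
    ],
    "logicalOperators": ["OR", "AND"],
}
print(matches({"score": 0.9, "category": "prod"}, no_precedence))  # False
```

The second clause shows where this differs from SQL: with conventional precedence, AND would bind tighter and the row would match.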
Query that returns a single metric value (used for metric widgets).
Used for widget type: metric. Requires exactly one aggregation in the select clause. Returns: {"type": "metric", "data": ...}
Example SQL equivalent: SELECT AVG(score) as average_score FROM evaluation_items
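To make the SQL equivalent concrete, the query can be run against an in-memory SQLite table. This is only a sketch: the `evaluation_items` schema and the sample rows are invented for illustration, and SQLite stands in for whatever engine the service uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows, for illustration only.
conn.execute("CREATE TABLE evaluation_items (category TEXT, score REAL)")
conn.executemany(
    "INSERT INTO evaluation_items (category, score) VALUES (?, ?)",
    [("test", 0.25), ("test", 0.5), ("prod", 0.75), ("prod", 1.0)],
)

# Exactly one aggregation in the select: the result is one row with one value.
row = conn.execute(
    "SELECT AVG(score) AS average_score FROM evaluation_items"
).fetchone()
average_score = row[0]
print(average_score)
conn.close()
```

Because the select holds a single aggregation and no grouping, the result collapses to one scalar, which is what a metric widget displays.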
Column in SELECT clause
Query that returns a series of records (used for table/bar/histogram/donut/scatter widgets).
Used for widget types: table, bar, histogram, donut, scatter. Returns: {"type": "series", "data": [...]}
Example SQL equivalent: SELECT category, AVG(score) as avg_score, COUNT(*) as count FROM evaluation_items WHERE score > 0.5 AND category = 'test' GROUP BY category ORDER BY avg_score DESC LIMIT 100
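The series query can be tried the same way against an in-memory SQLite table. As before, this is a sketch under assumptions: the `evaluation_items` schema and the sample rows are hypothetical, and SQLite is a stand-in engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows, for illustration only.
conn.execute("CREATE TABLE evaluation_items (category TEXT, score REAL)")
conn.executemany(
    "INSERT INTO evaluation_items (category, score) VALUES (?, ?)",
    [
        ("test", 0.75),
        ("test", 1.0),
        ("test", 0.25),  # dropped by score > 0.5
        ("prod", 0.75),  # dropped by category = 'test'
    ],
)

series = conn.execute(
    """
    SELECT category, AVG(score) AS avg_score, COUNT(*) AS count
    FROM evaluation_items
    WHERE score > 0.5 AND category = 'test'
    GROUP BY category
    ORDER BY avg_score DESC
    LIMIT 100
    """
).fetchall()
print(series)
conn.close()
```

Each returned tuple is one record of the series: the grouping column followed by the aggregated values, which a table, bar, or donut widget can render row by row.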