Skip to content

[OPIK-5385]: create eval suites improvement;#6054

Merged
aadereiko merged 9 commits intomainfrom
sashaa/OPIK-5385/create-eval-suites-improvement
Apr 8, 2026
Merged

[OPIK-5385]: create eval suites improvement;#6054
aadereiko merged 9 commits intomainfrom
sashaa/OPIK-5385/create-eval-suites-improvement

Conversation

@aadereiko
Copy link
Copy Markdown
Collaborator

Details

image image image image image

Summary

  • Replaced the centered modal for creating evaluation suites with a resizable sidebar panel featuring a 3-step wizard flow
  • Extracted shared CSV upload dialog component to eliminate duplication across 3 files
  • Refactored AddEditEvaluationSuiteDialog from ~600 lines into a hook + UI component split

Multi-step sidebar wizard

The new AddEvaluationSuiteSidebar guides users through 3 steps:

  1. Name & Description — basic suite info
  2. Evaluation Type — "Regression test" (assertions-based) or "Evaluate against Metric"
  3. Test Data — CSV upload, SDK code snippet with copy button, and collapsible Advanced section (evaluation criteria + global assertions) for regression type

The edit flow still uses the existing AddEditEvaluationSuiteDialog via row actions.

Shared CsvUploadDialog

The CSV upload loading overlay was duplicated in v1/AddEditEvaluationSuiteDialog, v2/AddEditEvaluationSuiteDialog, and the new sidebar. Extracted into CsvUploadDialog and reused in v2 dialog + sidebar.

AddEditEvaluationSuiteDialog refactor

Split into:

  • useEvaluationSuiteForm.ts — state, mutations, validation, file handling
  • AddEditEvaluationSuiteDialog.tsx — pure UI (~250 lines)

ResizableSidePanel enhancement

Added closeButtonPosition prop ("left" | "right", defaults to "left") to support placing the close button on the right side of the header.

Files changed

  • shared/ResizableSidePanel/ResizableSidePanel.tsx — added closeButtonPosition prop
  • EvaluationSuitesPage/EvaluationSuitesPage.tsx — uses sidebar for create
  • AddEditEvaluationSuiteDialog/AddEditEvaluationSuiteDialog.tsx — refactored, uses hook + shared CSV dialog
  • AddEditEvaluationSuiteDialog/useEvaluationSuiteForm.tsnew, extracted form logic
  • AddEvaluationSuiteSidebar/AddEvaluationSuiteSidebar.tsxnew, multi-step wizard
  • CsvUploadDialog/CsvUploadDialog.tsxnew, shared upload modal

Change checklist

  • User facing
  • Documentation update

Issues

  • Resolves #
  • OPIK-5385

AI-WATERMARK

  • Tools: Claude Code
  • Model(s): Opus

Testing

  • manual flows
  • npm run lint
  • npm run typecheck

Documentation

@aadereiko aadereiko requested a review from a team as a code owner April 2, 2026 10:47
Comment on lines +407 to +415
<ConfirmDialog
open={confirmOpen}
setOpen={setConfirmOpen}
onCancel={submitHandler}
title="File can't be uploaded"
description="This file cannot be uploaded because it does not pass validation. If you continue, the evaluation suite will be created without any items. You can add items manually later, or go back and upload a valid file."
cancelText="Create empty evaluation suite"
confirmText="Go back"
/>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ConfirmDialog is duplicated in AddEditEvaluationSuiteDialog, should we extract a CsvUploadValidationConfirm component that takes typeLabel, onCancel, and setOpen?

Finding type: Code Dedup and Conventions | Severity: 🟢 Low


Want Baz to fix this for you? Activate Fixer

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd leave it. The ConfirmDialog usage is just 8 lines of props in each place, and the two instances have slightly different description text

@awkoy awkoy added the test-environment Deploy Opik adhoc environment label Apr 2, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 2, 2026

🔄 Test environment deployment process has started

Phase 1: Deploying base version 1.10.59-4793 (from main branch) if environment doesn't exist
Phase 2: Building new images from PR branch sashaa/OPIK-5385/create-eval-suites-improvement
Phase 3: Will deploy newly built version after build completes

You can monitor the progress here.

@CometActions
Copy link
Copy Markdown
Collaborator

Test environment is now available!

To configure additional Environment variables for your environment, run [Deploy Opik AdHoc Environment workflow] (https://github.com/comet-ml/comet-deployment/actions/workflows/deploy_opik_adhoc_env.yaml)

Access Information

The deployment has completed successfully and the version has been verified.

awkoy
awkoy previously approved these changes Apr 2, 2026
Copy link
Copy Markdown
Contributor

@awkoy awkoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@aadereiko aadereiko force-pushed the sashaa/OPIK-5385/create-eval-suites-improvement branch from b9a50b0 to 5ebdaaa Compare April 2, 2026 20:26
Comment on lines +282 to +287
(error: AxiosError, action: "create" | "update") => {
const statusCode = get(error, ["response", "status"]);
const errorMessage =
get(error, ["response", "data", "message"]) ||
get(error, ["response", "data", "errors", 0]) ||
get(error, ["message"]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we replace response.data.errors[0] with response.data.errors.join('; ') and fall back to message in handleMutationError?
response.data.errors[0] => response.data.errors.join('; ') + fallback to message

Finding type: Provide comprehensive error messages | Severity: 🟢 Low


Want Baz to fix this for you? Activate Fixer

Other fix methods

Fix in Cursor

Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
apps/opik-frontend/src/v2/pages-shared/datasets/AddEditEvaluationSuiteDialog/useEvaluationSuiteForm.ts
around lines 282 to 287, the handleMutationError function only reads the first error
string (response.data.errors[0]) and so the toast shows only one validation issue.
Change the error extraction so it first checks for response.data.errors being an array
and, if present, joins all strings with '; ' (e.g. errors.join('; ')); if that array is
not present, fall back to response.data.message, then to error.message, and finally to a
generic message. Use that aggregated string as the description in the toast and keep the
existing Conflict handling unchanged.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ok, we use it this way everywhere else in the platform

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks for explaining the existing pattern—I'll keep this convention in mind and save it to memory after the PR is merged.

Comment on lines +131 to +142
<Separator className="mb-4" />
<div className="mb-4">
<h3 className="comet-body-s-accented">Evaluation criteria</h3>
<p className="comet-body-xs text-light-slate">
Define the conditions required for the evaluation to pass
</p>
</div>
<div className="pt-1.5">
<AssertionsField
editableAssertions={assertions}
onChangeEditable={(index, value) => {
setAssertions((prev) => {
const next = [...prev];
next[index] = value;
return next;
});
}}
onRemoveEditable={(index) => {
setAssertions((prev) =>
prev.filter((_, i) => i !== index),
);
}}
onAdd={() => setAssertions((prev) => [...prev, ""])}
placeholder="e.g. Response should be factually accurate and cite sources"
/>
<div className="mb-4 flex gap-4">
<div className="flex flex-1 flex-col gap-1">
<Label
htmlFor="runsPerItem"
className="comet-body-s-accented"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Evaluation criteria and Global assertions blocks duplicate across AddEvaluationSuiteSidebar and EvaluationSuiteItemForm, should we extract a shared EvaluationCriteriaForm component accepting runsInput, thresholdInput, assertions, and callbacks?

Finding types: Code Dedup and Conventions centralize shared logic | Severity: 🟠 Medium


Want Baz to fix this for you? Activate Fixer

Other fix methods

Fix in Cursor

Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
apps/opik-frontend/src/v2/pages-shared/datasets/AddEditEvaluationSuiteDialog/AddEditEvaluationSuiteDialog.tsx
around lines 133-214, the "Evaluation criteria" / "Global assertions" block duplicates
UI and state-management logic that also appears in AddEvaluationSuiteSidebar.tsx (lines
~223-381) and EvaluationSuiteItemForm.tsx (lines ~165-260). Extract this repeated JSX
and its related validation/inputs into a single shared component (e.g.
EvaluationCriteriaForm) that accepts props for runsInput, thresholdInput, assertions,
setAssertions, isEdit, and any label/placeholder text and returns the same UI and
callbacks. Replace the duplicated blocks in all three files with the new component and
update imports; ensure existing behavior (including onChange handlers, add/remove
assertion logic, and any accessibility ids) and styling are preserved and tests compile.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ok, one of them is used in forms, the other is just simply used, so no need for a shared component;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, that makes sense given the different usage contexts—I'll keep this in mind and save it to memory once the PR is merged.

@CometActions
Copy link
Copy Markdown
Collaborator

🌙 Nightly cleanup: The test environment for this PR (pr-6054) has been cleaned up to free cluster resources. PVCs are preserved — re-deploy to restore the environment.

@CometActions CometActions removed the test-environment Deploy Opik adhoc environment label Apr 3, 2026
@CometActions
Copy link
Copy Markdown
Collaborator

🌙 Nightly cleanup: The test environment for this PR (pr-6054) has been cleaned up to free cluster resources. PVCs are preserved — re-deploy to restore the environment.

4 similar comments
@CometActions
Copy link
Copy Markdown
Collaborator

🌙 Nightly cleanup: The test environment for this PR (pr-6054) has been cleaned up to free cluster resources. PVCs are preserved — re-deploy to restore the environment.

@CometActions
Copy link
Copy Markdown
Collaborator

🌙 Nightly cleanup: The test environment for this PR (pr-6054) has been cleaned up to free cluster resources. PVCs are preserved — re-deploy to restore the environment.

@CometActions
Copy link
Copy Markdown
Collaborator

🌙 Nightly cleanup: The test environment for this PR (pr-6054) has been cleaned up to free cluster resources. PVCs are preserved — re-deploy to restore the environment.

@CometActions
Copy link
Copy Markdown
Collaborator

🌙 Nightly cleanup: The test environment for this PR (pr-6054) has been cleaned up to free cluster resources. PVCs are preserved — re-deploy to restore the environment.

@aadereiko aadereiko force-pushed the sashaa/OPIK-5385/create-eval-suites-improvement branch from 5ebdaaa to 0c9c15b Compare April 8, 2026 12:46
awkoy
awkoy previously approved these changes Apr 8, 2026
Copy link
Copy Markdown
Contributor

@awkoy awkoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Comment on lines +128 to +139
<div className="flex shrink-0 items-center justify-end gap-2 border-t px-6 py-4">
<Button
variant="outline"
size="sm"
onClick={() => {
if (isDirty) {
requestConfirm(onClose);
} else {
onClose();
}
}}
>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The relocated Save button is outside the form and can't access form.formState.isSubmitting, should we expose isSubmitting from EvaluationSuiteItemFormContainer and wire it into the footer button to disable/show a spinner per .agents/skills/opik-frontend/forms.md?

Finding type: AI Coding Guidelines | Severity: 🔴 High


Want Baz to fix this for you? Activate Fixer

Other fix methods

Fix in Cursor

Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
apps/opik-frontend/src/v1/pages/EvaluationSuiteItemsPage/EvaluationSuiteItemPanel/AddEvaluationSuiteItemPanel.tsx
around lines 128-149, the Save button was moved outside the form so it cannot observe
form.formState.isSubmitting and remains enabled during submission. Modify
EvaluationSuiteItemFormContainer to accept a new callback prop (e.g.,
onSubmittingChange: (isSubmitting: boolean) => void) and have the form container call
that callback whenever the internal form's isSubmitting changes. Then in this file, wire
that prop to capture isSubmitting into state and use it to disable the footer Save
button and show a loading/spinner state while submitting. Ensure the callback is called
on mount with the initial isSubmitting and on every submit lifecycle change so the
button reliably reflects the form state.

@aadereiko aadereiko merged commit 04681a7 into main Apr 8, 2026
10 checks passed
@aadereiko aadereiko deleted the sashaa/OPIK-5385/create-eval-suites-improvement branch April 8, 2026 14:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants