Skip to content

[OPIK-4828] [BE] Fix thread view shows no data message but then eventually loads#5954

Merged
thiagohora merged 14 commits intomainfrom
thiaghora/OPIK-4828-thread-view-no-data-flash
Apr 1, 2026
Merged

[OPIK-4828] [BE] Fix thread view shows no data message but then eventually loads#5954
thiagohora merged 14 commits intomainfrom
thiaghora/OPIK-4828-thread-view-no-data-flash

Conversation

@thiagohora
Copy link
Copy Markdown
Contributor

@thiagohora thiagohora commented Mar 30, 2026

Details

Backend

  • Remove FINAL from feedback_scores and authored_feedback_scores reads in all thread queries — authored_feedback_scores is append-only so FINAL is unnecessary; feedback_scores deduplication is handled downstream by ROW_NUMBER() window function
  • Optimise SELECT_TRACES_THREAD_BY_ID with two-phase CTE: narrow to matching trace IDs first (with FINAL), then read full rows without FINAL using LIMIT 1 BY for deduplication. Benchmarked ~5× faster on production for threads with large trace counts
  • Enable use_skip_indexes_if_final setting on queries that still use FINAL scans

Frontend

  • Remove keepPreviousData from both thread detail and traces queries in ThreadDetailsPanel — this was causing stale data to be shown when switching between threads, giving the impression of a "No Data" flash before the previous thread's content was replaced
  • Fix loading condition from && to || so the loader is shown while either the thread or its traces are still fetching, preventing a partial render with no content

Change checklist

  • User facing changes
  • Documentation update

Issues

https://comet-ml.atlassian.net/browse/OPIK-4828

Testing

Benchmarked on production ClickHouse (opik_prod) against a project with ~28k traces in a single thread:

Query Before After Speedup
SELECT_TRACES_THREAD_BY_ID ~10s ~2s ~5×
SELECT_TRACES_THREADS_BY_PROJECT_IDS ~2.1s ~2.1s neutral

The list query (SELECT_TRACES_THREADS_BY_PROJECT_IDS) shows no wall-clock improvement from removing FINAL on feedback tables (noise-level ~2%), which is expected — the bottleneck is trace aggregation. The FINAL removals reduce unnecessary merge overhead in the pipeline without correctness risk.

Frontend: manually verified that switching between threads no longer shows stale content or a "No Data" flash.

Documentation

No documentation changes required.

@thiagohora thiagohora requested a review from a team as a code owner March 30, 2026 12:04
@github-actions github-actions bot added java Pull requests that update Java code Frontend Backend typescript *.ts *.tsx labels Mar 30, 2026
@comet-ml comet-ml deleted a comment from github-actions bot Mar 30, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 30, 2026

Backend Tests - Integration Group 8

 34 files   34 suites   9m 29s ⏱️
436 tests 433 ✅ 3 💤 0 ❌
423 runs  420 ✅ 3 💤 0 ❌

Results for commit 097d696.

♻️ This comment has been updated with latest results.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 30, 2026

Backend Tests - Integration Group 6

256 tests   256 ✅  3m 49s ⏱️
 33 suites    0 💤
 33 files      0 ❌

Results for commit 2fe994b.

♻️ This comment has been updated with latest results.

@thiagohora thiagohora changed the title [OPIK-4828] [BE] perf: optimise thread DAO queries [OPIK-4828] [BE] Fix thread view shows no data message but then eventually loads Mar 30, 2026
@thiagohora thiagohora requested a review from awkoy March 30, 2026 12:48
Copy link
Copy Markdown
Member

@andrescrz andrescrz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's double check some comments before moving forward.

@thiagohora thiagohora requested a review from andrescrz March 30, 2026 16:37
…ThreadDetailsPanel

- Remove keepPreviousData from useThreadById and useTracesList to avoid
  showing the previous thread's data when switching between threads
- Change && to || in renderContent loading check so that a Loader is
  shown whenever either query is pending, not only when both are pending
  simultaneously (which caused No Data to appear if traces resolved first)
OPIK-4828: thread_id had no skip index, causing full partition scans
when filtering traces by thread in the thread view. The bloom filter
at 1% false positive rate allows ClickHouse to skip ~99% of irrelevant
granules for thread_id lookups.
Add SETTINGS use_skip_indexes_if_final = 1 so the bloom filter on
traces.thread_id is applied even when reading with FINAL, allowing
ClickHouse to skip irrelevant granules during thread view loading.
…ables in thread queries

Remove FINAL modifier from feedback_scores and authored_feedback_scores
reads across all thread DAO queries. authored_feedback_scores is
append-only so FINAL is unnecessary overhead; feedback_scores reads are
deduplicated via ROW_NUMBER() window function downstream so FINAL is
also redundant.

Also optimise SELECT_TRACES_THREAD_BY_ID with a two-phase CTE approach:
narrow to matching trace IDs first (with FINAL), then read full rows
without FINAL using LIMIT 1 BY for deduplication. Benchmarked ~5x
faster on production for threads with large trace counts.
@thiagohora thiagohora force-pushed the thiaghora/OPIK-4828-thread-view-no-data-flash branch from 9805cff to 097d696 Compare March 30, 2026 17:42
@github-actions github-actions bot added the tests Including test files, or tests related like configuration. label Mar 30, 2026
awkoy
awkoy previously approved these changes Mar 31, 2026
@github-actions
Copy link
Copy Markdown
Contributor

Backend Tests - Integration Group 15

 26 files  ±0   26 suites  ±0   3m 15s ⏱️ -12s
236 tests ±0  235 ✅ ±0  0 💤 ±0  0 ❌ ±0  1 🔥 ±0 
235 runs  ±0  235 ✅ ±0  0 💤 ±0  0 ❌ ±0 

For more details on these errors, see this check.

Results for commit 6fbbcef. ± Comparison against base commit a24607c.

Copy link
Copy Markdown
Member

@andrescrz andrescrz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one comment of something that I don't know if it was left behind intentionally or not.

created_at,
last_updated_at
FROM trace_threads final
FROM trace_threads FINAL
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason to not to change this to dedupe with ORDER BY + LIMIT BY like you changed above?

Copy link
Copy Markdown
Contributor Author

@thiagohora thiagohora Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't change this one because the tests showed that it was slower. I didn't understand exactly why, but there were times when LIMIT + ORDER was actually slower than FINAL

The other makes sense because it filters by thread_id, reducing the result set that needs sorting.

@thiagohora thiagohora merged commit 80c9e64 into main Apr 1, 2026
80 checks passed
@thiagohora thiagohora deleted the thiaghora/OPIK-4828-thread-view-no-data-flash branch April 1, 2026 09:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Backend Frontend java Pull requests that update Java code tests Including test files, or tests related like configuration. typescript *.ts *.tsx

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants