[NA] [BE] Fix high cost metrics calculation #5965
Merged
thiagohora merged 7 commits into main on Apr 1, 2026
Conversation
Span subqueries in `GET_COST`, `GET_COST_WITH_BREAKDOWN`, `GET_TOKEN_USAGE`, and `GET_TOKEN_USAGE_WITH_BREAKDOWN` were not scoping spans to the traces returned by `traces_filtered`. Adding `AND trace_id IN (SELECT id FROM traces_filtered)` ensures spans are only aggregated for traces that pass all applied filters (time range, name, metadata, feedback scores, etc.).

Benchmarked on production (1.9M spans, 7-day window):
- Granules read: 25,429 → 4,959 (5x reduction)
- `GET_TOKEN_USAGE` latency: ~2.0s → ~0.6s median (3.5x faster)
- `GET_COST` latency: ~1.6s → ~0.9s median (1.7x faster)
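A minimal sketch of the fix, assuming a simplified query shape (`traces_filtered` stands for the CTE from the real queries; the aggregate column and parameter syntax are illustrative, not the PR's exact code):

```sql
-- Before, the span subquery bounded spans only by span.id, a non-leading
-- ORDER BY column, so the primary key index could not prune granules.
-- After, spans are scoped to the filtered traces, letting ClickHouse prune
-- on trace_id.
SELECT
    trace_id,
    sum(total_estimated_cost) AS cost   -- hypothetical aggregate column
FROM spans
WHERE workspace_id = :workspace_id
  AND project_id = :project_id
  AND trace_id IN (SELECT id FROM traces_filtered)  -- the added predicate
GROUP BY trace_id
```

The same predicate also fixes correctness: spans whose IDs fall outside the trace time window are no longer silently dropped.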
Backend Tests - Integration Group 6: 273 tests, 273 ✅, 2m 32s ⏱️ Results for commit 0437a93. ♻️ This comment has been updated with latest results.
apps/opik-backend/src/main/java/com/comet/opik/domain/ProjectMetricsDAO.java
…ed_at index on authored_feedback_scores

- Add `AND trace_id IN (SELECT id FROM traces_filtered)` to span subqueries in `GET_COST`, `GET_COST_WITH_BREAKDOWN`, `GET_TOKEN_USAGE`, and `GET_TOKEN_USAGE_WITH_BREAKDOWN`. Previously, filtering by `span.id` (the 5th ORDER BY column) caused full-table scans; the fix reduces granules read from 25,429 to 4,959 (~5x) and query latency by ~2x.
- Replace inline `dateDiff` duration expressions with the MATERIALIZED `duration` column in `TRACE_FILTERED_PREFIX`, `SPAN_FILTERED_PREFIX`, and `GET_AVERAGE_DURATION`.
- Remove `FINAL` from `feedback_scores` and `authored_feedback_scores` reads in `TRACE_FILTERED_PREFIX`, `SPAN_FILTERED_PREFIX`, and `THREAD_FILTERED_PREFIX`; deduplication is already handled by the `ROW_NUMBER()` window function applied downstream.
- Scope `traces_final` in `THREAD_FILTERED_PREFIX` to only traces whose `thread_id` is in the selected time window (previously it loaded all threads in the project).
- Add a minmax skip index on `authored_feedback_scores.created_at` (migration 000076).
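The `FINAL` removal works because duplicates from the ReplacingMergeTree tables are already dropped by an explicit window function. A minimal sketch of that pattern, with assumed column names (`entity_id`, `last_updated_at`), not the DAO's exact SQL:

```sql
-- Keep only the latest row per (entity_id, name) instead of reading with
-- FINAL; duplicate versions are discarded by rn = 1, so ClickHouse skips
-- the merge-time deduplication that FINAL would force.
SELECT entity_id, name, value
FROM (
    SELECT
        entity_id,
        name,
        value,
        row_number() OVER (
            PARTITION BY entity_id, name
            ORDER BY last_updated_at DESC
        ) AS rn
    FROM feedback_scores
    WHERE workspace_id = :workspace_id
)
WHERE rn = 1
```

Doing the dedup once, explicitly, avoids paying for it twice (`FINAL` plus the window function) on every read.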
.../db-app-analytics/migrations/000078_add_minmax_index_authored_feedback_scores_created_at.sql
…eated_at.sql to 000078_add_minmax_index_authored_feedback_scores_created_at.sql
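For context, a ClickHouse minmax skip-index migration of this kind typically looks like the sketch below (the index name and granularity are assumptions, not the PR's exact DDL):

```sql
-- minmax skip index: stores min/max of created_at per index granule, so
-- time-bounded range filters can skip granules entirely outside the window.
ALTER TABLE authored_feedback_scores
    ADD INDEX authored_feedback_scores_created_at_idx created_at
    TYPE minmax GRANULARITY 1;

-- Backfill the index for existing parts; new inserts are indexed automatically.
ALTER TABLE authored_feedback_scores
    MATERIALIZE INDEX authored_feedback_scores_created_at_idx;
```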
ldaugusto (Contributor) reviewed on Apr 1, 2026 and left a comment:
Two things to confirm:
apps/opik-backend/src/main/java/com/comet/opik/domain/ProjectMetricsDAO.java
.../db-app-analytics/migrations/000078_add_minmax_index_authored_feedback_scores_created_at.sql
ldaugusto (Contributor) approved these changes on Apr 1, 2026 and left a comment:
As we are going by default with `use_skip_indexes_if_final=1`, it's good to go.
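For context, `use_skip_indexes_if_final` is the ClickHouse setting that allows data-skipping indexes to be applied to queries using the `FINAL` modifier (disabled by default in older versions, since skipping granules under `FINAL` can miss rows needed for deduplication). A quick way to check and exercise it:

```sql
-- Verify the effective value for the current session.
SELECT name, value
FROM system.settings
WHERE name = 'use_skip_indexes_if_final';

-- Or enable it explicitly for a single query.
SELECT count()
FROM authored_feedback_scores FINAL
WHERE created_at >= now() - INTERVAL 7 DAY
SETTINGS use_skip_indexes_if_final = 1;
```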
Details
Root cause
Span subqueries in `GET_COST`, `GET_COST_WITH_BREAKDOWN`, `GET_TOKEN_USAGE`, and `GET_TOKEN_USAGE_WITH_BREAKDOWN` were filtering spans by `span.id` (the 5th column in the `spans` ORDER BY). ClickHouse's primary key index cannot prune granules on non-leading columns, resulting in near full-table scans (~25,429 granules read per query). This also caused correctness issues: spans whose IDs fall outside the trace time window were silently excluded, missing ~12.8% of spans (209K out of 1.8M in production).

Changes
ProjectMetricsDAO.java

Span subquery scoping fix (`GET_COST`, `GET_COST_WITH_BREAKDOWN`, `GET_TOKEN_USAGE`, `GET_TOKEN_USAGE_WITH_BREAKDOWN`): Added `AND trace_id IN (SELECT id FROM traces_filtered)` to each span subquery. This allows ClickHouse to use `trace_id` (3rd column in ORDER BY) for granule pruning, reducing granules read from 25,429 → 4,959 (~5×) and query latency by ~2× in production.

Materialized
`duration` column: Replaced inline `if(end_time IS NOT NULL ... dateDiff('microsecond', ...) / 1000.0 ...) AS duration` expressions with the existing `duration` MATERIALIZED column in `TRACE_FILTERED_PREFIX`, `SPAN_FILTERED_PREFIX`, and `GET_AVERAGE_DURATION`. The MATERIALIZED column stores the pre-computed value and avoids recomputing it at query time across millions of rows.

Remove
`FINAL` from feedback score reads: Removed `FINAL` from `feedback_scores` and `authored_feedback_scores` in `TRACE_FILTERED_PREFIX`, `SPAN_FILTERED_PREFIX`, and `THREAD_FILTERED_PREFIX`. Deduplication is already handled downstream by the `ROW_NUMBER()` window function, so applying `FINAL` here forced redundant merge-time deduplication.

`THREAD_FILTERED_PREFIX` scoping: Moved `traces_final` after `trace_threads_final` and scoped it with `AND thread_id IN (SELECT thread_id FROM trace_threads_final)`. Previously `traces_final` loaded all traces in the project with a non-empty `thread_id`, ignoring the time window filter entirely.

Migration
`000076`: Added a `minmax` skip index on `authored_feedback_scores.created_at` to enable efficient time-bounded range filtering on that table.

Production benchmark (same workspace/project, 7-day window, 522K traces / 1.9M spans)
`EXPLAIN indexes` confirmed 25,429 → 4,959 granules read on the `spans` table for the cost/token queries.
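The granule numbers can be reproduced with ClickHouse's index introspection; a sketch, with the actual query abbreviated (`traces_filtered` stands for the CTE in the real query, and the aggregate column is illustrative):

```sql
-- EXPLAIN indexes = 1 reports, per part, how many granules the primary key
-- and skip indexes selected out of the total, e.g. "Granules: 4959/25429".
EXPLAIN indexes = 1
SELECT trace_id, sum(total_estimated_cost)
FROM spans
WHERE workspace_id = :workspace_id
  AND project_id = :project_id
  AND trace_id IN (SELECT id FROM traces_filtered)
GROUP BY trace_id;
```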
Change checklist
Issues
Testing
The previous `span.id` filter returned 1,629,299 rows; the new `trace_id` filter returns 1,838,544 rows (209K previously missing spans recovered).

Documentation
N/A