-
Notifications
You must be signed in to change notification settings - Fork 2.8k
[fix][clickhouse] Optimize Service and Operation Retrieval Queries #7808
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
FINAL Modifier For Performing Merge
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #7808 +/- ##
==========================================
+ Coverage 95.47% 95.53% +0.06%
==========================================
Files 307 307
Lines 15892 15911 +19
==========================================
+ Hits 15173 15201 +28
+ Misses 564 558 -6
+ Partials 155 152 -3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| @@ -224,28 +224,28 @@ LEFT JOIN trace_id_timestamps t ON s.trace_id = t.trace_id | |||
| WHERE 1=1` | |||
|
|
|||
| const SelectServices = ` | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| const SelectServices = ` | |
| // We use FINAL to ensure ClickHouse fully merges the data before returning the result. | |
| // See https://clickhouse.com/docs/sql-reference/statements/select/from#final-modifier | |
| const SelectServices = ` |
however, according to Gemini we should be using distinct here, which is faster than forcing the merge logic.
also, why are you using ReplacingMergeTree and not AggregatingMergeTree? The latter is what we need here, and the query should use GROUP BY serviceName
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yurishkuro I believe that SELECT DISTINCT will keep any unique row whereas using FINAL leverages the deduplication by the MergeTree engine (taking the latest version). In this case, they end up being the same but I thought it made more sense to leverage the engine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with AggregatingMergeTree - the effect in this case is the same. AggregatingMergeTree is more for "aggregation"-based workflows like sum, avg, etc whereas ReplacingMergeTree is for deduplication. In either case, we need to use the FINAL keyword to perform the merge. The snippet below shows what the queries return with AggregatingMergeTree.
Mahads-MacBook-Air.local :) select * from services
SELECT *
FROM services
Query id: 790fbbe2-d30a-46b4-b24d-350991fe64d2
┌─name──────────┐
1. │ frontend │
2. │ order-service │
3. │ user-service │
└───────────────┘
┌─name──────────┐
4. │ frontend │
5. │ order-service │
6. │ user-service │
└───────────────┘
6 rows in set. Elapsed: 0.005 sec.
Mahads-MacBook-Air.local :) select * from services final
SELECT *
FROM services
FINAL
Query id: 3fa57a0e-83e7-4127-9d6e-ca23069b30b4
┌─name──────────┐
1. │ frontend │
2. │ order-service │
3. │ user-service │
└───────────────┘
3 rows in set. Elapsed: 0.008 sec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
FINAL Modifier For Performing Merge


Description of the changes
GROUP BYclause to deduplicate at query time instead of DISTINCTHow was this change tested?
Checklist
jaeger:make lint testjaeger-ui:npm run lintandnpm run test