mahadzaryab1 (Collaborator) commented Dec 29, 2025

Which problem is this PR solving?

Description of the changes

  • The query service forwards all attributes in the FindTraces request as strings. So that ClickHouse can query the correct Nested column for each attribute, this PR adds the attribute_metadata table, which maps each attribute name to its type. The table is populated through a materialized view.
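
As a rough illustration of how this mapping might be consulted at query time, here is a minimal sketch. It assumes the reader first looks up the recorded type for a key and then filters on the matching Nested column. The `str_attributes.value` subcolumn name and the literal values `'http.method'` and `'GET'` are assumptions for illustration; only the `.key` subcolumns appear in this PR.

```sql
-- Step 1: look up which type(s) were recorded for the attribute key.
SELECT DISTINCT type
FROM attribute_metadata
WHERE attribute_key = 'http.method';

-- Step 2 (assuming step 1 returned 'str'): filter on the string Nested column.
-- str_attributes.value is an assumed subcolumn name, not confirmed by this PR.
SELECT count()
FROM spans
WHERE arrayExists(
    (k, v) -> k = 'http.method' AND v = 'GET',
    str_attributes.key,
    str_attributes.value
);
```

If the lookup returns more than one type for the same key, the reader would presumably need to check each corresponding column.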

How was this change tested?

Spun up a ClickHouse server on my local machine and ran the scripts to set up all the tables. Then, ran a script to populate the spans table. Querying the attribute_metadata table correctly returns a mapping of all the attributes.

SELECT *
FROM attribute_metadata

Query id: ff9e0fc7-e9c9-4c22-ae80-b727f9648a30

    ┌─attribute_key──────────┬─type───┐
 1. │ authenticated          │ bool   │
 2. │ browser.name           │ str    │
 3. │ browser.version        │ str    │
 4. │ cache_hit              │ bool   │
 5. │ cached_response        │ bool   │
 6. │ checkout_time          │ double │
 7. │ component              │ str    │
 8. │ container.config       │ bytes  │
 9. │ cpu_usage              │ double │
10. │ db.system              │ str    │
11. │ deployment.config      │ bytes  │
12. │ deployment.environment │ bool   │
13. │ error.type             │ str    │
14. │ error_context          │ map    │
15. │ error_details          │ slice  │
16. │ error_rate             │ double │
17. │ host.arch              │ double │
18. │ http.method            │ str    │
19. │ http.url               │ str    │
20. │ idempotent             │ bool   │
21. │ items_count            │ int    │
22. │ k8s.namespace          │ str    │
23. │ k8s.pod.name           │ str    │
24. │ latency                │ double │
25. │ memory_usage           │ double │
26. │ metadata               │ map    │
27. │ order_id               │ int    │
28. │ order_payload          │ bytes  │
29. │ payment_successful     │ bool   │
30. │ process.pid            │ int    │
31. │ request_body           │ bytes  │
32. │ request_size           │ int    │
33. │ response_snippet       │ bytes  │
34. │ response_time          │ double │
35. │ retry_attempted        │ bool   │
36. │ retry_count            │ int    │
37. │ service.version        │ str    │
38. │ tags                   │ slice  │
39. │ telemetry.sdk.version  │ double │
40. │ timeout_ms             │ int    │
41. │ user_id                │ int    │
    └────────────────────────┴────────┘

Checklist

Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
mahadzaryab1 requested a review from a team as a code owner December 29, 2025 01:06
mahadzaryab1 added the changelog:experimental label (Change to an experimental part of the code) Dec 29, 2025
codecov bot commented Dec 29, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.47%. Comparing base (9658822) to head (02c1676).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7798      +/-   ##
==========================================
+ Coverage   95.35%   95.47%   +0.11%     
==========================================
  Files         310      307       -3     
  Lines       16075    15892     -183     
==========================================
- Hits        15329    15173     -156     
+ Misses        578      564      -14     
+ Partials      168      155      -13     
Flag Coverage Δ
badger_v1 9.20% <0.00%> (+0.17%) ⬆️
badger_v2 1.93% <0.00%> (+0.03%) ⬆️
cassandra-4.x-v1-manual 13.61% <0.00%> (+0.25%) ⬆️
cassandra-4.x-v2-auto 1.92% <0.00%> (+0.03%) ⬆️
cassandra-4.x-v2-manual 1.92% <0.00%> (+0.03%) ⬆️
cassandra-5.x-v1-manual 13.61% <0.00%> (+0.25%) ⬆️
cassandra-5.x-v2-auto 1.92% <0.00%> (+0.03%) ⬆️
cassandra-5.x-v2-manual 1.92% <0.00%> (+0.03%) ⬆️
clickhouse 1.85% <0.00%> (+0.03%) ⬆️
elasticsearch-6.x-v1 17.58% <0.00%> (+0.33%) ⬆️
elasticsearch-7.x-v1 17.61% <0.00%> (+0.33%) ⬆️
elasticsearch-8.x-v1 17.76% <0.00%> (+0.33%) ⬆️
elasticsearch-8.x-v2 1.93% <0.00%> (+0.03%) ⬆️
elasticsearch-9.x-v2 1.93% <0.00%> (+0.03%) ⬆️
grpc_v1 8.86% <0.00%> (-0.05%) ⬇️
grpc_v2 1.93% <0.00%> (+0.03%) ⬆️
kafka-3.x-v2 1.93% <0.00%> (+0.03%) ⬆️
memory_v2 1.93% <0.00%> (+0.03%) ⬆️
opensearch-1.x-v1 17.65% <0.00%> (+0.33%) ⬆️
opensearch-2.x-v1 17.65% <0.00%> (+0.33%) ⬆️
opensearch-2.x-v2 1.93% <0.00%> (+0.03%) ⬆️
opensearch-3.x-v2 1.93% <0.00%> (+0.03%) ⬆️
query 1.93% <0.00%> (+0.03%) ⬆️
tailsampling-processor 0.56% <0.00%> (+<0.01%) ⬆️
unittests 94.10% <100.00%> (+0.18%) ⬆️

CREATE MATERIALIZED VIEW IF NOT EXISTS attribute_metadata_mv TO attribute_metadata AS
SELECT
attribute_key,
type

Member commented:

How about adding level: (resource | scope | span)?

I think the high-level plan for the UI is to ask the user to be explicit about which tags they are searching for by specifying a prefix like resource/{key}, so this information will be available in the reader and it can pinpoint the metadata more accurately.

We could also go all the way to capturing service name and span name, since strictly speaking an attribute X in different spans in different services does not have to mean the same thing or have the same type, and our query typically requires at least the service name. My only hesitation is whether that would introduce too much overhead in CH for maintaining the materialized view.

Author (Collaborator) replied:

Done! I just added the level for now.
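
For readers following along, a table with the extra column could look roughly like the sketch below. This is an assumption about the shape, not the merged schema: the column name `level`, its allowed values, the engine, and the sort key are all illustrative, and the type names follow the values seen in the query output above.

```sql
-- Hypothetical shape of attribute_metadata with a level column (illustrative only).
CREATE TABLE IF NOT EXISTS attribute_metadata (
    attribute_key String,
    type String,   -- e.g. 'bool', 'double', 'int', 'str', 'bytes', 'map', 'slice'
    level String   -- assumed values: 'resource', 'scope', 'span'
) ENGINE = ReplacingMergeTree
ORDER BY (attribute_key, type, level);
```

Including `level` in the sort key would let the same key be recorded separately per level, which matches the idea of a `resource/{key}` style prefix in the UI.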

IF NOT EXISTS attribute_metadata (
attribute_key String,
type String -- 'bool', 'double', 'int', 'string', 'bytes', 'map', 'slice'
) ENGINE = MergeTree

Member commented:

MergeTree is not going to deduplicate, correct? Is that what we want? Gemini suggests:

CREATE TABLE IF NOT EXISTS attribute_metadata (
    attribute_key String,
    type String
) ENGINE = ReplacingMergeTree() -- Background deduplication
ORDER BY (attribute_key, type);

Author (Collaborator) replied:

Thanks for the callout - we use ReplacingMergeTree for services and operations too! I actually learnt today that even with that engine, we need to use the FINAL keyword to perform the merge when querying (see https://clickhouse.com/docs/sql-reference/statements/select/from#final-modifier). We need to fix this for services and operations as well.
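
A minimal sketch of what that looks like in practice, assuming the table uses ReplacingMergeTree with `(attribute_key, type)` as its sort key; the key `'http.method'` is just an example value. Without FINAL, rows that have not yet been merged in the background can still appear as duplicates.

```sql
-- Deduplicated read: FINAL applies the ReplacingMergeTree merge logic at query time.
SELECT attribute_key, type
FROM attribute_metadata FINAL
WHERE attribute_key = 'http.method';

-- Alternative that avoids FINAL by collapsing duplicates in the query itself.
SELECT attribute_key, type
FROM attribute_metadata
WHERE attribute_key = 'http.method'
GROUP BY attribute_key, type;
```

FINAL adds work at read time, so for a small lookup table like this one either form is likely acceptable; the GROUP BY variant is one option if FINAL turns out to be too costly.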

WHERE
length(int_attributes.key) > 0
UNION ALL
SELECT

Member commented:

Another Gemini suggestion, to reduce the number of UNION steps:

    SELECT
        arrayJoin(arrayConcat(str_attributes.key, resource_str_attributes.key)) AS attribute_key,
        'str' AS type
    FROM spans

Do we not handle scope attributes?

Author (Collaborator) replied:

Done! And added scope attributes as well.
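
As a hedged sketch of the combined idea, the suggested `arrayConcat` form could be extended to cover scope attributes as below. The column name `scope_str_attributes.key` is an assumption modeled on the other column names in this thread, and this version does not record the level, so it is not necessarily what was merged.

```sql
-- One branch for string attributes across span, resource, and scope columns.
-- scope_str_attributes.key is an assumed column name.
SELECT
    arrayJoin(arrayConcat(
        str_attributes.key,
        resource_str_attributes.key,
        scope_str_attributes.key
    )) AS attribute_key,
    'str' AS type
FROM spans;
```

Concatenating across levels keeps the number of UNION ALL branches at one per type, but a design that also stores the level would need either separate branches per level or a way to tag each key before joining.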

github-actions bot commented Dec 30, 2025

Metrics Comparison Summary

Total changes across all snapshots: 73

Detailed changes per snapshot

summary_metrics_snapshot_elasticsearch

📊 Metrics Diff Summary

Total Changes: 73

  • 🆕 Added: 73 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics

🆕 Added Metrics

  • jaeger_storage_latency_seconds (18 variants)
View diff sample
+jaeger_storage_latency_seconds{le="+Inf",name="some_storage",operation="find_traces",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",result="err",role="tracestore"}
+jaeger_storage_latency_seconds{le="0",name="some_storage",operation="find_traces",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",result="err",role="tracestore"}
+jaeger_storage_latency_seconds{le="10",name="some_storage",operation="find_traces",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",result="err",role="tracestore"}
+jaeger_storage_latency_seconds{le="100",name="some_storage",operation="find_traces",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",result="err",role="tracestore"}
+jaeger_storage_latency_seconds{le="1000",name="some_storage",operation="find_traces",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",result="err",role="tracestore"}
+jaeger_storage_latency_seconds{le="10000",name="some_storage",operation="find_traces",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",result="err",role="tracestore"}
+jaeger_storage_latency_seconds{le="25",name="some_storage",operation="find_traces",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",result="err",role="tracestore"}
...
  • jaeger_storage_requests (1 variants)
View diff sample
+jaeger_storage_requests{name="some_storage",operation="find_traces",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",result="err",role="tracestore"}
  • rpc_server_duration_milliseconds (18 variants)
View diff sample
+rpc_server_duration_milliseconds{le="+Inf",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_duration_milliseconds{le="0",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_duration_milliseconds{le="10",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_duration_milliseconds{le="100",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_duration_milliseconds{le="1000",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_duration_milliseconds{le="10000",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_duration_milliseconds{le="25",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
...
  • rpc_server_requests_per_rpc (18 variants)
View diff sample
+rpc_server_requests_per_rpc{le="+Inf",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_requests_per_rpc{le="0",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_requests_per_rpc{le="10",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_requests_per_rpc{le="100",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_requests_per_rpc{le="1000",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_requests_per_rpc{le="10000",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_requests_per_rpc{le="25",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
...
  • rpc_server_responses_per_rpc (18 variants)
View diff sample
+rpc_server_responses_per_rpc{le="+Inf",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_responses_per_rpc{le="0",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_responses_per_rpc{le="10",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_responses_per_rpc{le="100",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_responses_per_rpc{le="1000",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_responses_per_rpc{le="10000",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
+rpc_server_responses_per_rpc{le="25",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc",otel_scope_schema_url="https://opentelemetry.io/schemas/1.37.0",otel_scope_version="0.64.0",rpc_grpc_status_code="2",rpc_method="FindTraces",rpc_service="jaeger.api_v3.QueryService",rpc_system="grpc"}
...

yurishkuro merged commit a6e3492 into jaegertracing:main Dec 30, 2025
82 of 83 checks passed
mahadzaryab1 deleted the attribute-metadata branch December 30, 2025 21:12
ThatDeparted2061 pushed a commit to ThatDeparted2061/jaeger that referenced this pull request Dec 31, 2025
…egertracing#7798)

Signed-off-by: ThatDeparted2061 <harshraocodesup@gmail.com>
ThatDeparted2061 pushed a commit to ThatDeparted2061/jaeger that referenced this pull request Jan 5, 2026
…egertracing#7798)

Signed-off-by: ThatDeparted2061 <harshraocodesup@gmail.com>
Labels

area/storage, changelog:experimental, enhancement
