Making tracing SDK metrics aware #381

@sfriberg

Making Tracing API metrics aware

One interesting aspect of OpenTelemetry is the goal of providing tracing, metrics, and later a logging API. Right now these APIs are fairly separate, so as an end user or library owner I would need to write code to add a span and some more code to add a metric. This is no different from what I would need to do today with OpenCensus, or with OpenTracing and a metrics library (Prometheus, DropWizard, etc.), but I think OpenTelemetry would allow this to be simplified thanks to its all-in-one nature.

With the metrics API taking a Measurement, it feels like integration with Spans would come fairly naturally.

Why

Single code point extension, get metrics for "free".

As an end user I want to write as little code as possible. Instead of a couple of lines for metrics, a few more for tracing, and one for logging, it would be helpful if I could get it all done in a single place.

Increased uptake of metrics library

Easily getting metrics from spans would be one more path by which we could increase uptake of the metrics library. Switching from OpenCensus and OpenTracing to OpenTelemetry is pretty much required for tracing, but metrics doesn't have the same forced migration and will face a much larger and more mature ecosystem.

Span exploration

Currently, Span names are used much like metric names with labels baked into them. The result is an explosion of names, each with a different path, instead of a single HTTP request span that you can easily filter by the properties you are interested in.

Filtering and sampling of Traces and Spans using metric data

One of the more interesting longer-term aspects of this closer relationship is that metrics could be used by spans/traces to decide whether a span should be kept after the trace is complete (tail-based sampling). Since the metric name is known (or is the same as the span name), as are the labels used for the metric, you would be able to easily query a metrics db to get, for example, the 99th percentile for the last 24 hours, and explicitly store all traces above that threshold as well as any randomly sampled traces.
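To make the sampling idea concrete, here is a minimal sketch. It assumes the metrics db query is replaced by an in-memory list of durations, and the names `p99` and `keep_trace` are hypothetical; nothing here is an existing OpenTelemetry API.

```python
import random
from statistics import quantiles


def p99(durations_ms):
    """Return the 99th percentile of the given durations (needs >= 2 samples)."""
    # quantiles(n=100) returns 99 cut points; the last one is the 99th percentile.
    return quantiles(durations_ms, n=100)[-1]


def keep_trace(duration_ms, threshold_ms, sample_rate=0.01, rng=random.random):
    """Tail-based decision: keep all slow traces plus a random sample of the rest."""
    if duration_ms >= threshold_ms:
        return True  # at or above the p99 threshold: always keep
    return rng() < sample_rate  # otherwise keep roughly sample_rate of traces


# Stand-in for "query the metrics db for the last 24 hours" (assumption).
history = [float(ms) for ms in range(1, 1001)]
threshold = p99(history)
decision = keep_trace(duration_ms=1000.0, threshold_ms=threshold)
```

The `rng` parameter exists only so the random branch can be made deterministic in tests.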

How

Others will probably have better suggestions, but an initial idea would be to have the constructor/builder pattern simply allow generateMetric(boolean) as part of building the Span.
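A rough sketch of what such a builder flag could look like follows. Everything in it is hypothetical: `SpanBuilder`, `Span`, `generate_metric`, and the in-memory `RECORDED` store are illustration only and do not correspond to an existing OpenTelemetry API.

```python
import time
from collections import defaultdict

# Hypothetical in-memory metric store: span name -> list of durations in seconds.
RECORDED = defaultdict(list)


class Span:
    def __init__(self, name, generate_metric):
        self.name = name
        self._generate_metric = generate_metric
        self._start = time.monotonic()

    def end(self):
        """Finish the span; record its duration as a metric if requested."""
        duration = time.monotonic() - self._start
        if self._generate_metric:
            RECORDED[self.name].append(duration)
        return duration


class SpanBuilder:
    def __init__(self, name):
        self._name = name
        self._generate_metric = False  # opt-in, so existing spans are unaffected

    def generate_metric(self, enabled):
        """Mirror of the proposed generateMetric(boolean) builder option."""
        self._generate_metric = enabled
        return self

    def start(self):
        return Span(self._name, self._generate_metric)


span = SpanBuilder("http.request").generate_metric(True).start()
span.end()  # one call ends the span and records the duration metric
```

The point of the sketch is the single call site: the caller writes span code only, and the metric falls out of `end()`.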

As for labels, I think it would be interesting to discuss whether labels should be part of the span. One could think of it as different layers of contextual data for a span: Resources (process-level labels) at the top, then Labels (general request labels), and finally Attributes (request-unique or high-cardinality labels). Metrics and spans would then be exported with these as appropriate.
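One possible reading of these layers, as a sketch: it assumes metrics would be exported with Resources and Labels only (to keep cardinality low), while spans carry all three layers. That split is an assumption, as are the names `SpanContextData`, `metric_labels`, and `span_fields`.

```python
from dataclasses import dataclass, field


@dataclass
class SpanContextData:
    resources: dict = field(default_factory=dict)   # process-level labels
    labels: dict = field(default_factory=dict)      # general request labels
    attributes: dict = field(default_factory=dict)  # request-unique / high-cardinality

    def metric_labels(self):
        # Assumption: metrics skip the high-cardinality Attributes layer.
        return {**self.resources, **self.labels}

    def span_fields(self):
        # Spans are exported with every layer.
        return {**self.resources, **self.labels, **self.attributes}


ctx = SpanContextData(
    resources={"service": "checkout"},
    labels={"method": "GET"},
    attributes={"request_id": "abc-123"},
)
```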

Spans could be named the same as the metric for easy association and lookup, but for backwards compatibility we could allow an Operation Name to be constructed through a templated string that reads labels, such as {{label.method}}:{{label:parameterized_url}} for an HTTP request span.
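The templating could be sketched roughly as below. The function name `operation_name` is hypothetical, and since the example template above uses both `.` and `:` after `label`, this sketch accepts either separator rather than guessing which one is intended.

```python
import re

# Matches {{label.key}} or {{label:key}}, capturing the label key.
_PLACEHOLDER = re.compile(r"\{\{\s*label[.:](\w+)\s*\}\}")


def operation_name(template, labels):
    """Fill each placeholder from the labels dict; raises KeyError if a label is missing."""
    return _PLACEHOLDER.sub(lambda m: str(labels[m.group(1)]), template)


name = operation_name(
    "{{label.method}}:{{label:parameterized_url}}",
    {"method": "GET", "parameterized_url": "/users/{id}"},
)
```

With those labels, `name` comes out as `GET:/users/{id}`, so the span keeps a familiar operation name while the underlying labels stay available for filtering.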

Labels: area:sdk (Related to the SDK), spec:metrics (Related to the specification/metrics directory), triage:deciding:community-feedback (Open to community discussion. If the community can provide sufficient reasoning, it may be accepted)