feat(sdk-trace): implement span start/end metrics#6213
feat(sdk-trace): implement span start/end metrics#6213trentm merged 18 commits intoopen-telemetry:mainfrom
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #6213 +/- ##
==========================================
+ Coverage 95.48% 95.52% +0.03%
==========================================
Files 363 367 +4
Lines 11563 11661 +98
Branches 2669 2690 +21
==========================================
+ Hits 11041 11139 +98
Misses 522 522
🚀 New features to boost your workflow:
|
api/CHANGELOG.md
Outdated
|
|
||
| * feat(api): improve isValidSpanId, isValidTraceId performance [#5714](https://github.com/open-telemetry/opentelemetry-js/pull/5714) @seemk | ||
| * feat(diag): change types in `DiagComponentLogger` from `any` to `unknown`[#5478](https://github.com/open-telemetry/opentelemetry-js/pull/5478) @loganrosen | ||
| * feat(sdk-trace): implement span start/end metrics [#1851](https://github.com/open-telemetry/opentelemetry-js/pull/6213) @anuraaga |
There was a problem hiding this comment.
This should be added to the top-dir CHANGELOG.md. This file is for changes to the api/... dir.
| private readonly liveSpans: UpDownCounter; | ||
|
|
||
| constructor(meterProvider: MeterProvider) { | ||
| const meter = meterProvider.getMeter('@opentelemetry/sdk-trace'); |
There was a problem hiding this comment.
This string becomes the InstrumentationScope.name. Typically our instrumentation uses the JS package name for this, which in this case has the "-base" suffix. I can understand wanting to use "sdk-trace" without the "-base":
- the equivalents for metrics and logs are in the
.../sdk-metricsand.../sdk-logspackages (no "-base" on them)
I'm fine with using this name.
@pichlermarc Do you have a particular opinion on this?
| case SamplingDecision.NOT_RECORD: | ||
| return 'DROP'; | ||
| default: | ||
| return 'unknown'; |
There was a problem hiding this comment.
Can a SamplingDecision be any other value? I.e. is the "unknown" fallback necessary? /shrug
There was a problem hiding this comment.
You're right, somehow I thought SamplingDecision was in a different package. Removed it
| */ | ||
|
|
||
| import { ContextManager, TextMapPropagator } from '@opentelemetry/api'; | ||
| import { |
There was a problem hiding this comment.
nit: I think this could be import type, though this isn't something you changed.
| import { | |
| import type { |
| this._spanProcessor = spanProcessor; | ||
| this._tracerMetrics = new TracerMetrics( | ||
| localConfig.meterProvider || api.metrics.getMeterProvider() | ||
| ); |
There was a problem hiding this comment.
@anuraaga Am I correct in understanding that this default to the global MeterProvider differs from the Java impl? https://github.com/open-telemetry/opentelemetry-java/pull/7895/files#diff-918636d01c04fc2f4763f5564a621b35a919e1d80b5b9be3b4d9daa2649a96c3R42 looks like Java defaults to a Noop MeterProvider.
I was going to start discussing whether these "development"-status metrics should be on by default in this ("stable") package. My understanding is that OTel is still trying to figure things out around what things should be on by default and whether and how to opt-in to things.
@open-telemetry/javascript-maintainers I'd appreciate other opinions here.
FWIW, OTel Java has a otel.instrumentation.http.client.emit-experimental-telemetry boolean config item. I don't see many similar config items, so I gather things are still being felt out.
Once we sort out "on/off by default", presumably sdk-node/src/sdk.ts should be updated to pass in the just-created meterProvider.
This might mean re-ordering creation of providers in NodeSDK.start(), not sure.
There was a problem hiding this comment.
Thanks, as the spec for SDK metrics doesn't mention opt-in at all, I've been going under the assumption that they are emitted by default, and views can be used to delete them. For better or worse, the "legacy" metrics in the Java SDK have behaved that way as far as precedent goes.
But either way, in terms of initializing a tracer, I agree that noop looks better here, and matches what we do in Java, where it's autoconfigure that wires things up. So NodeSDK in opentelemetry-js terms, tweaked to that
|
@anuraaga I had a small commit to add a src/semconv.ts and change TracerMetrics.ts to use some consts from it: import {
ATTR_OTEL_SPAN_PARENT_ORIGIN,
ATTR_OTEL_SPAN_SAMPLING_RESULT,
METRIC_OTEL_SDK_SPAN_LIVE,
METRIC_OTEL_SDK_SPAN_STARTED,
} from './semconv';
...but I think your fork settings don't allow pushes from others. The diff is here if you'd like: https://gist.github.com/trentm/fd0f0cdf90222165d8335bc8888d3578 |
There was a problem hiding this comment.
but I think your fork settings don't allow pushes from others.
Thanks, not sure what happened there since I think it should have been allowed. Maybe it doesn't work when there's a merge conflict on the PR with that mysterious error message
Casual hint, I always use gh CLI like gh pr checkout 6213 which automatically sets the tracking branch to the fork without an additional remote, and git push pushes to it. Not sure if it would configure the tracking branch in a better way than manually but either way, I appreciate the convenience
| this._spanProcessor = spanProcessor; | ||
| this._tracerMetrics = new TracerMetrics( | ||
| localConfig.meterProvider || api.metrics.getMeterProvider() | ||
| ); |
There was a problem hiding this comment.
Thanks, as the spec for SDK metrics doesn't mention opt-in at all, I've been going under the assumption that they are emitted by default, and views can be used to delete them. For better or worse, the "legacy" metrics in the Java SDK have behaved that way as far as precedent goes.
But either way, in terms of initializing a tracer, I agree that noop looks better here, and matches what we do in Java, where it's autoconfigure that wires things up. So NodeSDK in opentelemetry-js terms, tweaked to that
| case SamplingDecision.NOT_RECORD: | ||
| return 'DROP'; | ||
| default: | ||
| return 'unknown'; |
There was a problem hiding this comment.
You're right, somehow I thought SamplingDecision was in a different package. Removed it
| const tracerProvider = setGlobalTracerProviderSpy.lastCall.args[0]; | ||
| assert.ok(tracerProvider instanceof NodeTracerProvider); | ||
| assert.ok( | ||
| (tracerProvider as any)._config.meterProvider instanceof MeterProvider |
There was a problem hiding this comment.
I know this isn't great but couldn't think of anything better. Let me know if you have any ideas
| setGlobalTracerProviderSpy = Sinon.spy(trace, 'setGlobalTracerProvider'); | ||
| setGlobalLoggerProviderSpy = Sinon.spy(logs, 'setGlobalLoggerProvider'); | ||
|
|
||
| // need to set these to none, since the default value is 'otlp'. Tests either |
There was a problem hiding this comment.
Wiring up metrics to tracer caused some impact on unrelated tests. This seemed like a reasonable default, but let me know how it looks
There was a problem hiding this comment.
Yup, I think this is reasonable. I think these process.env settings can bleed out to other running .test.ts files in the same package, because mocha runs all the tests in the same process. However, that was already the case here.
trentm
left a comment
There was a problem hiding this comment.
This LGTM. One nit below.
@pichlermarc Curious for your opinion. This PR adds a starter set of two OTel SDK metrics (per https://opentelemetry.io/docs/specs/semconv/otel/sdk-metrics/). This spec is in "Development". If a BasicTracerProvider is given a meterProvider -- a new property on TracerConfig -- then it will generate the metrics. It defaults to a NoopMeter. NodeSDK has been updated to pass in its meterProvider, if any, so by default with the NodeSDK we'll get these two new metrics.
Did you have advice on:
- if/whether things should be merged
@experimentalwith jsdoc tags - if NodeSDK should require some opt-in for development things. (My naive understanding is that "stable-only by default" is still up in the air, and a long way off.)
| setGlobalTracerProviderSpy = Sinon.spy(trace, 'setGlobalTracerProvider'); | ||
| setGlobalLoggerProviderSpy = Sinon.spy(logs, 'setGlobalLoggerProvider'); | ||
|
|
||
| // need to set these to none, since the default value is 'otlp'. Tests either |
There was a problem hiding this comment.
Yup, I think this is reasonable. I think these process.env settings can bleed out to other running .test.ts files in the same package, because mocha runs all the tests in the same process. However, that was already the case here.
| this.instrumentationScope = instrumentationScope; | ||
|
|
||
| const meter = localConfig.meterProvider | ||
| ? localConfig.meterProvider.getMeter('@opentelemetry/sdk-trace') |
There was a problem hiding this comment.
Perhaps this change to add a version to that Meter's instrumentationScope:
diff --git a/packages/opentelemetry-sdk-trace-base/src/Tracer.ts b/packages/opentelemetry-sdk-trace-base/src/Tracer.ts
index 2694f872f..eb84fd205 100644
--- a/packages/opentelemetry-sdk-trace-base/src/Tracer.ts
+++ b/packages/opentelemetry-sdk-trace-base/src/Tracer.ts
@@ -29,6 +29,7 @@ import { IdGenerator } from './IdGenerator';
import { RandomIdGenerator } from './platform';
import { Resource } from '@opentelemetry/resources';
import { TracerMetrics } from './TracerMetrics';
+import { VERSION } from './version';
/**
* This class represents a basic tracer.
@@ -63,7 +64,7 @@ export class Tracer implements api.Tracer {
this.instrumentationScope = instrumentationScope;
const meter = localConfig.meterProvider
- ? localConfig.meterProvider.getMeter('@opentelemetry/sdk-trace')
+ ? localConfig.meterProvider.getMeter('@opentelemetry/sdk-trace', VERSION)
: api.createNoopMeter();
this._tracerMetrics = new TracerMetrics(meter);
}
Instrumentations typically pass in their package version to be used for instrumentationScope.version for their tracer/meter/logger.
I think we should always annotate everything that is experimental in the spec with For I just quickly looked over the changes, so I'll not explicitly approve, but the overall approach LGTM. I'll leave the approving/merging to you once you're ready, @trentm. 🙂 Edit: |
Sounds good. I used @anuraaga Here is a proposed patch for this: https://gist.github.com/trentm/aa05491ff20a6e6178863dbc09e02cc1 |
|
Thanks, added the knob |
Which problem is this PR solving?
I am helping to implement SDK internal metrics
https://opentelemetry.io/docs/specs/semconv/otel/sdk-metrics/
I have started with the most basic of them, started and ended spans. This sets up some test infrastructure as well and will be good to confirm before implementing more.
For reference, similar change in Java: open-telemetry/opentelemetry-java#7895
Short description of the changes
Implements metrics for started and live spans per the semconv
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
Checklist:
/cc @trentm