feat(sdk-trace): implement span start/end metrics#6213

Merged
trentm merged 18 commits into open-telemetry:main from anuraaga:tracer-metrics
Feb 20, 2026
Conversation

@anuraaga
Contributor

@anuraaga anuraaga commented Dec 12, 2025

Which problem is this PR solving?

I am helping to implement SDK internal metrics

https://opentelemetry.io/docs/specs/semconv/otel/sdk-metrics/

I have started with the most basic of them, started and ended spans. This sets up some test infrastructure as well and will be good to confirm before implementing more.

For reference, similar change in Java: open-telemetry/opentelemetry-java#7895

Short description of the changes

Implements metrics for started and live spans per the semconv
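The two metrics can be sketched as follows. This is a self-contained illustration, not the PR's actual code: the instrument classes here are stand-ins for the OTel metrics API's `Counter` and `UpDownCounter`, and the metric/attribute names follow the SDK self-metrics semconv linked above (`otel.sdk.span.started`, `otel.sdk.span.live`, `otel.span.sampling_result`).

```typescript
type Attributes = Record<string, string>;

// Stand-in for the metrics API's Counter: monotonically increasing.
class StubCounter {
  value = 0;
  add(n: number, _attrs?: Attributes): void {
    this.value += n;
  }
}

// Stand-in for UpDownCounter: may go up and down (used for "live" spans).
class StubUpDownCounter {
  value = 0;
  add(n: number, _attrs?: Attributes): void {
    this.value += n;
  }
}

class TracerMetricsSketch {
  readonly startedSpans = new StubCounter(); // otel.sdk.span.started
  readonly liveSpans = new StubUpDownCounter(); // otel.sdk.span.live

  // Called when a span starts: both instruments go up.
  onSpanStart(attrs: Attributes): void {
    this.startedSpans.add(1, attrs);
    this.liveSpans.add(1, attrs);
  }

  // Called when a span ends: only the live count goes back down.
  onSpanEnd(attrs: Attributes): void {
    this.liveSpans.add(-1, attrs);
  }
}

// Example: two spans started, one ended → started=2, live=1.
const metrics = new TracerMetricsSketch();
const attrs = { 'otel.span.sampling_result': 'RECORD_AND_SAMPLE' };
metrics.onSpanStart(attrs);
metrics.onSpanStart(attrs);
metrics.onSpanEnd(attrs);
console.log(metrics.startedSpans.value, metrics.liveSpans.value); // 2 1
```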

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.

  • Unit tests

Checklist:

  • Followed the style guidelines of this project
  • Unit tests have been added
  • Documentation has been updated

/cc @trentm

@anuraaga anuraaga requested a review from a team as a code owner December 12, 2025 04:50
@codecov

codecov bot commented Dec 12, 2025

Codecov Report

❌ Patch coverage is 98.11321% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 95.52%. Comparing base (9a44ea7) to head (feb1621).
⚠️ Report is 31 commits behind head on main.

Files with missing lines Patch % Lines
...imental/packages/opentelemetry-sdk-node/src/sdk.ts 90.90% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6213      +/-   ##
==========================================
+ Coverage   95.48%   95.52%   +0.03%     
==========================================
  Files         363      367       +4     
  Lines       11563    11661      +98     
  Branches     2669     2690      +21     
==========================================
+ Hits        11041    11139      +98     
  Misses        522      522              
Files with missing lines Coverage Δ
packages/opentelemetry-sdk-trace-base/src/Span.ts 97.77% <100.00%> (+0.02%) ⬆️
...ackages/opentelemetry-sdk-trace-base/src/Tracer.ts 98.71% <100.00%> (+0.12%) ⬆️
.../opentelemetry-sdk-trace-base/src/TracerMetrics.ts 100.00% <100.00%> (ø)
...ckages/opentelemetry-sdk-trace-base/src/semconv.ts 100.00% <100.00%> (ø)
...imental/packages/opentelemetry-sdk-node/src/sdk.ts 95.90% <90.90%> (+0.04%) ⬆️

... and 7 files with indirect coverage changes


Contributor

@trentm trentm left a comment

Thanks! First pass.

api/CHANGELOG.md Outdated

* feat(api): improve isValidSpanId, isValidTraceId performance [#5714](https://github.com/open-telemetry/opentelemetry-js/pull/5714) @seemk
* feat(diag): change types in `DiagComponentLogger` from `any` to `unknown`[#5478](https://github.com/open-telemetry/opentelemetry-js/pull/5478) @loganrosen
* feat(sdk-trace): implement span start/end metrics [#1851](https://github.com/open-telemetry/opentelemetry-js/pull/6213) @anuraaga
Contributor

This should be added to the top-dir CHANGELOG.md. This file is for changes to the api/... dir.

private readonly liveSpans: UpDownCounter;

constructor(meterProvider: MeterProvider) {
const meter = meterProvider.getMeter('@opentelemetry/sdk-trace');
Contributor

This string becomes the InstrumentationScope.name. Typically our instrumentation uses the JS package name for this, which in this case has the "-base" suffix. I can understand wanting to use "sdk-trace" without the "-base":

  • the equivalents for metrics and logs are in the .../sdk-metrics and .../sdk-logs packages (no "-base" on them)

I'm fine with using this name.
@pichlermarc Do you have a particular opinion on this?

case SamplingDecision.NOT_RECORD:
return 'DROP';
default:
return 'unknown';
Contributor

Can a SamplingDecision be any other value? I.e. is the "unknown" fallback necessary? /shrug

Contributor Author

You're right, somehow I thought SamplingDecision was in a different package. Removed it

*/

import { ContextManager, TextMapPropagator } from '@opentelemetry/api';
import {
Contributor

nit: I think this could be import type, though this isn't something you changed.

Suggested change:
- import {
+ import type {

this._spanProcessor = spanProcessor;
this._tracerMetrics = new TracerMetrics(
localConfig.meterProvider || api.metrics.getMeterProvider()
);
Contributor

@anuraaga Am I correct in understanding that this default to the global MeterProvider differs from the Java impl? https://github.com/open-telemetry/opentelemetry-java/pull/7895/files#diff-918636d01c04fc2f4763f5564a621b35a919e1d80b5b9be3b4d9daa2649a96c3R42 looks like Java defaults to a Noop MeterProvider.

I was going to start discussing whether these "development"-status metrics should be on by default in this ("stable") package. My understanding is that OTel is still trying to figure things out around what things should be on by default and whether and how to opt-in to things.

@open-telemetry/javascript-maintainers I'd appreciate other opinions here.
FWIW, OTel Java has a otel.instrumentation.http.client.emit-experimental-telemetry boolean config item. I don't see many similar config items, so I gather things are still being felt out.

Once we sort out "on/off by default", presumably sdk-node/src/sdk.ts should be updated to pass in the just-created meterProvider.
This might mean re-ordering creation of providers in NodeSDK.start(), not sure.

Contributor Author

Thanks, as the spec for SDK metrics doesn't mention opt-in at all, I've been going under the assumption that they are emitted by default, and views can be used to delete them. For better or worse, the "legacy" metrics in the Java SDK have behaved that way as far as precedent goes.

But either way, in terms of initializing a tracer, I agree that noop looks better here, and it matches what we do in Java, where it's autoconfigure that wires things up (NodeSDK in opentelemetry-js terms). Tweaked to that.

@trentm
Contributor

trentm commented Jan 30, 2026

@anuraaga I had a small commit to add a src/semconv.ts and change TracerMetrics.ts to use some consts from it:

import {
  ATTR_OTEL_SPAN_PARENT_ORIGIN,
  ATTR_OTEL_SPAN_SAMPLING_RESULT,
  METRIC_OTEL_SDK_SPAN_LIVE,
  METRIC_OTEL_SDK_SPAN_STARTED,
} from './semconv';

...

but I think your fork settings don't allow pushes from others.

% git push anuraaga tracer-metrics
ERROR: Permission to anuraaga/opentelemetry-js.git denied to trentm.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

The diff is here if you'd like: https://gist.github.com/trentm/fd0f0cdf90222165d8335bc8888d3578

Contributor Author

@anuraaga anuraaga left a comment

but I think your fork settings don't allow pushes from others.

Thanks, not sure what happened there since I think it should have been allowed. Maybe it doesn't work when there's a merge conflict on the PR with that mysterious error message

[Image: screenshot attached in the original comment]

Casual hint: I always use the gh CLI, e.g. gh pr checkout 6213, which automatically sets the tracking branch to the fork without an additional remote, so git push pushes to it. Not sure if it configures the tracking branch any better than doing it manually, but either way I appreciate the convenience.


const tracerProvider = setGlobalTracerProviderSpy.lastCall.args[0];
assert.ok(tracerProvider instanceof NodeTracerProvider);
assert.ok(
(tracerProvider as any)._config.meterProvider instanceof MeterProvider
Contributor Author

I know this isn't great but couldn't think of anything better. Let me know if you have any ideas

setGlobalTracerProviderSpy = Sinon.spy(trace, 'setGlobalTracerProvider');
setGlobalLoggerProviderSpy = Sinon.spy(logs, 'setGlobalLoggerProvider');

// need to set these to none, since the default value is 'otlp'. Tests either
Contributor Author

Wiring up metrics to tracer caused some impact on unrelated tests. This seemed like a reasonable default, but let me know how it looks

Contributor

Yup, I think this is reasonable. I think these process.env settings can bleed out to other running .test.ts files in the same package, because mocha runs all the tests in the same process. However, that was already the case here.
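One common way to keep process.env settings from bleeding across .test.ts files that mocha runs in the same process is to snapshot the variables a test touches and restore them afterward. This is a generic sketch of that pattern, not the PR's test code; in real tests `env` would be `process.env` and the snapshot/restore calls would live in `beforeEach`/`afterEach` hooks.

```typescript
// Snapshot the named env vars and return a restore function.
function snapshotEnv(
  env: Record<string, string | undefined>,
  keys: string[]
): () => void {
  const saved = new Map<string, string | undefined>();
  for (const k of keys) saved.set(k, env[k]);
  return () => {
    saved.forEach((v, k) => {
      if (v === undefined) delete env[k]; // was unset before the test
      else env[k] = v;
    });
  };
}

// Simulated test run: mutate two exporter settings, then restore.
const env: Record<string, string | undefined> = {
  OTEL_TRACES_EXPORTER: 'otlp',
};
const restore = snapshotEnv(env, [
  'OTEL_TRACES_EXPORTER',
  'OTEL_METRICS_EXPORTER',
]);
env['OTEL_TRACES_EXPORTER'] = 'none'; // test overrides the default
env['OTEL_METRICS_EXPORTER'] = 'none';
restore();
console.log(env['OTEL_TRACES_EXPORTER'], env['OTEL_METRICS_EXPORTER']); // otlp undefined
```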

Contributor

@trentm trentm left a comment

This LGTM. One nit below.

@pichlermarc Curious for your opinion. This PR adds a starter set of two OTel SDK metrics (per https://opentelemetry.io/docs/specs/semconv/otel/sdk-metrics/). This spec is in "Development". If a BasicTracerProvider is given a meterProvider -- a new property on TracerConfig -- then it will generate the metrics. It defaults to a NoopMeter. NodeSDK has been updated to pass in its meterProvider, if any, so by default with the NodeSDK we'll get these two new metrics.

Did you have advice on:

  • if/whether things should be merged @experimental with jsdoc tags
  • if NodeSDK should require some opt-in for development things. (My naive understanding is that "stable-only by default" is still up in the air, and a long way off.)


this.instrumentationScope = instrumentationScope;

const meter = localConfig.meterProvider
? localConfig.meterProvider.getMeter('@opentelemetry/sdk-trace')
Contributor

Perhaps this change to add a version to that Meter's instrumentationScope:

diff --git a/packages/opentelemetry-sdk-trace-base/src/Tracer.ts b/packages/opentelemetry-sdk-trace-base/src/Tracer.ts
index 2694f872f..eb84fd205 100644
--- a/packages/opentelemetry-sdk-trace-base/src/Tracer.ts
+++ b/packages/opentelemetry-sdk-trace-base/src/Tracer.ts
@@ -29,6 +29,7 @@ import { IdGenerator } from './IdGenerator';
 import { RandomIdGenerator } from './platform';
 import { Resource } from '@opentelemetry/resources';
 import { TracerMetrics } from './TracerMetrics';
+import { VERSION } from './version';

 /**
  * This class represents a basic tracer.
@@ -63,7 +64,7 @@ export class Tracer implements api.Tracer {
     this.instrumentationScope = instrumentationScope;

     const meter = localConfig.meterProvider
-      ? localConfig.meterProvider.getMeter('@opentelemetry/sdk-trace')
+      ? localConfig.meterProvider.getMeter('@opentelemetry/sdk-trace', VERSION)
       : api.createNoopMeter();
     this._tracerMetrics = new TracerMetrics(meter);
   }

Instrumentations typically pass in their package version to be used for instrumentationScope.version for their tracer/meter/logger.

@pichlermarc
Member

pichlermarc commented Feb 12, 2026

This LGTM. One nit below.

@pichlermarc Curious for your opinion. This PR adds a starter set of two OTel SDK metrics (per https://opentelemetry.io/docs/specs/semconv/otel/sdk-metrics/). This spec is in "Development". If a BasicTracerProvider is given a meterProvider -- a new property on TracerConfig -- then it will generate the metrics. It defaults to a NoopMeter. NodeSDK has been updated to pass in its meterProvider, if any, so by default with the NodeSDK we'll get these two new metrics.

Did you have advice on:

  • if/whether things should be merged @experimental with jsdoc tags
  • if NodeSDK should require some opt-in for development things. (My naive understanding is that "stable-only by default" is still up in the air, and a long way off.)

I think we should always annotate everything that is experimental in the spec with @experimental and add a note that it may break in minor versions. Ideally, we should also make sure that a spec issue for stabilization exists so that there's a clear path towards actually getting the feature to stable.

For NodeSDK - I'm not sure. I think the best way to do it would be to disable it by default, with an option to enable it (OTEL_NODE_EXPERIMENTAL_SPAN_METRICS), then push on the spec side for more prototypes/stabilization so that we can get rid of the opt-in. It seems that there are three prototypes (Java, Python, and this PR) already, so I think stabilization is not too far off. Personally, I think having it as an opt-in feature provides some incentive to drive stabilization, which benefits the project as a whole.

I just quickly looked over the changes, so I'll not explicitly approve, but the overall approach LGTM. I'll leave the approving/merging to you once you're ready, @trentm. 🙂

Edit:
To summarize, I think @experimental is a must-have; opt-in for NodeSDK would be nice-to-have, but if it's not there, it's also not blocking for me.
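For illustration, an @experimental JSDoc annotation on the new TracerConfig option might look like the sketch below. The interface and wording here are assumptions, not the PR's actual code; the tag is a documentation convention (also recognized by tools such as api-extractor) and has no runtime effect.

```typescript
interface MeterProviderLike {
  getMeter(name: string, version?: string): unknown;
}

interface TracerConfigSketch {
  /**
   * MeterProvider used to emit SDK self-metrics such as
   * `otel.sdk.span.started` and `otel.sdk.span.live`.
   *
   * @experimental The underlying semantic conventions are in Development
   * status; this option and the emitted metrics may change in a minor version.
   */
  meterProvider?: MeterProviderLike;
}

// Omitting the option means no self-metrics are emitted (the SDK would fall
// back to a noop meter).
const config: TracerConfigSketch = {};
console.log(config.meterProvider === undefined); // true
```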

@trentm
Contributor

trentm commented Feb 12, 2026

I think the best way to do it would be to disable it by default, with an option to enable it. (OTEL_NODE_EXPERIMENTAL_SPAN_METRICS).

Sounds good. I used OTEL_NODE_EXPERIMENTAL_SDK_METRICS because I think it is "SDK metrics" that is the important differentiator, but /shrug. We could have separate ones for SDK span/meter/logs metrics, but that seems like overkill.
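A minimal sketch of such an opt-in gate, using the OTEL_NODE_EXPERIMENTAL_SDK_METRICS name from this thread. The helper and its exact truthiness rules ('true'/'1' enable it) are assumptions for illustration, not the PR's actual code.

```typescript
// Returns true only when the experimental SDK-metrics env var is set to a
// truthy value; anything else (unset, empty, 'false', ...) leaves it off.
function sdkMetricsEnabled(
  env: Record<string, string | undefined>
): boolean {
  const raw = (env['OTEL_NODE_EXPERIMENTAL_SDK_METRICS'] ?? '')
    .trim()
    .toLowerCase();
  return raw === 'true' || raw === '1';
}

// Usage: NodeSDK would only pass its MeterProvider through to the tracer
// config when the gate is open (otherwise the tracer keeps its noop meter).
const enabled = sdkMetricsEnabled({
  OTEL_NODE_EXPERIMENTAL_SDK_METRICS: 'true',
});
const disabled = sdkMetricsEnabled({});
console.log(enabled, disabled); // true false
```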

@anuraaga Here is a proposed patch for this: https://gist.github.com/trentm/aa05491ff20a6e6178863dbc09e02cc1

@anuraaga
Contributor Author

Thanks, added the knob

Contributor

@trentm trentm left a comment

Thanks!

@trentm trentm added this pull request to the merge queue Feb 20, 2026
Merged via the queue into open-telemetry:main with commit 0c1c8f6 Feb 20, 2026
27 checks passed
@otelbot
Contributor

otelbot bot commented Feb 20, 2026

Thank you for your contribution @anuraaga! 🎉 We would like to hear from you about your experience contributing to OpenTelemetry by taking a few minutes to fill out this survey.

