Conversation
Codecov Report
@@            Coverage Diff             @@
##           master     #301      +/-   ##
==========================================
+ Coverage    84.6%   84.73%   +0.12%
==========================================
  Files          33       35       +2
  Lines        1676     1716      +40
  Branches      199      200       +1
==========================================
+ Hits         1418     1454      +36
- Misses        201      204       +3
- Partials       57       58       +1
Continue to review full report at Codecov.
examples/opentelemetry-example-app/src/opentelemetry_example_app/flask_example.py
ext/opentelemetry-ext-wsgi/src/opentelemetry/ext/wsgi/__init__.py
ext/opentelemetry-ext-wsgi/tests/test_wsgi_middleware.py

Conflicts:
  ext/opentelemetry-ext-opentracing-shim/src/opentelemetry/ext/opentracing_shim/__init__.py
  ext/opentelemetry-ext-testutil/src/opentelemetry/ext/testutil/wsgitestutil.py
  ext/opentelemetry-ext-wsgi/src/opentelemetry/ext/wsgi/__init__.py
  opentelemetry-sdk/tests/trace/test_trace.py
    links: Sequence[trace_api.Link] = (),
    kind: trace_api.SpanKind = trace_api.SpanKind.INTERNAL,
    span_processor: SpanProcessor = SpanProcessor(),
    creator_info: "InstrumentationInfo" = None,
Any reason not to call this argument instrumentation_info?
Because it is the instrumentation_info that identifies the creator of this span, I thought creator_info was a more specific name.
I agree that instrumentation_info is the more obvious name here since it's called that on the tracer.
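For illustration, a minimal sketch of what the renamed parameter might look like; the Span signature is heavily reduced here, and this InstrumentationInfo is a stand-in for the SDK class of the same name:

```python
from typing import Optional


class InstrumentationInfo:
    """Stand-in for the SDK class: name + version of the instrumenting library."""

    def __init__(self, name: str, version: str) -> None:
        self.name = name
        self.version = version


class Span:
    # Reduced signature; the real constructor takes many more arguments.
    def __init__(
        self,
        name: str,
        instrumentation_info: Optional[InstrumentationInfo] = None,
    ) -> None:
        self.name = name
        # Identifies the creator of this span; None when the span was
        # created outside any instrumented library.
        self.instrumentation_info = instrumentation_info
```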
    ) -> "trace_api.Tracer":
        if not instrumenting_library_name:  # Reject empty strings too.
            instrumenting_library_name = "ERROR:MISSING LIB NAME"
            logger.error("get_tracer called with missing library name.")
Looking at the spec, it seems the name should be optional? https://github.com/open-telemetry/opentelemetry-specification/pull/354/files#diff-ea4f4a46fe8755cf03f9fb3a6434cb0cR98
I think we ought to push back on this; the most common case is likely to be the default tracer, which doesn't need a name.
Why do you think that? I think the most common case will be tracers created by instrumentations, which should definitely tell their name to OpenTelemetry. Even when you have no dedicated instrumentation library, you can use the name and version of your application here.
(sharing an extra opinion) I don't think it's a big deal. Two extra arguments of boilerplate isn't ideal, but it's not a usability blocker.
I personally imagine tracing will be transparent to most people as the integrations will take care of injecting traces into the right locations. If there are enough extensions, the need for custom traces will probably be fairly minimal.
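For concreteness, a sketch of what an optional name could look like; the stub classes and signatures below are illustrative only, not the PR's actual API:

```python
from typing import Optional


class Tracer:
    """Stub tracer carrying only its identifying name/version."""

    def __init__(self, name: Optional[str], version: Optional[str]) -> None:
        self.name = name
        self.version = version


class TracerSource:
    def get_tracer(
        self,
        instrumenting_library_name: Optional[str] = None,
        instrumenting_library_version: Optional[str] = None,
    ) -> Tracer:
        # With the name optional, both call styles below work.
        return Tracer(instrumenting_library_name, instrumenting_library_version)


source = TracerSource()
default_tracer = source.get_tracer()  # plain application code, no name needed
flask_tracer = source.get_tracer("opentelemetry-ext-flask", "0.3")  # instrumentation
```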
Co-Authored-By: alrex <alrex.boten@gmail.com>
Conflicts:
  opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py
c24t left a comment:
This looks like a faithful implementation of the spec, and follows open-telemetry/opentelemetry-java#584. In that sense it LGTM.
But these changes also make the API significantly more complicated, and make our already user-unfriendly boilerplate even more unfriendly.
Compare our API:

    trace.set_preferred_tracer_source_implementation(lambda T: TracerSource())
    tracer = trace.tracer_source().get_tracer(name, version)

to logging:

    logger = logging.getLogger(name)

There are also some significant unanswered questions about the relationship between tracers and context that need to be addressed in the spec.
    tracer = trace.tracer()
    tracer = trace.tracer_source().get_tracer(
        "opentelemetry-ext-flask", __version__
Out of curiosity: why use the package name instead of the dotted namespace? E.g. opentelemetry.ext.flask? The latter is closer to the __name__ convention for logger names, and you can have multiple unique names per package.
That would be another possibility. There should probably be a unique mapping from full module name -> package anyway.
+1 on the suggestion to use the name. Most loggers are configured that way as well, and having parity of the logging namespace and tracer namespace could be very useful.
I changed it to the module name in e4db217. But note that once we have a registry pattern, we will be creating a tracer per module instead of per library, at least with the most naive implementation.
Yeah, one risk of copying logging here is that users might reasonably assume these names are meant to be hierarchical too.
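To make the logging parallel concrete: with the module-name convention, the same dotted name keys both namespaces. The tracer_source() call below follows this PR's diff; the __version__ assignment is a stand-in:

```python
import logging

from opentelemetry import trace

__version__ = "0.3.dev0"  # stand-in for the package's real version

# Both the logger and the tracer are keyed by the same dotted module name,
# e.g. "opentelemetry.ext.flask", so configuration can target either
# namespace consistently.
logger = logging.getLogger(__name__)
tracer = trace.tracer_source().get_tracer(__name__, __version__)
```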
    # TODO: How should multiple TracerSources behave? Should they get their own contexts?
    # This could be done by adding `str(id(self))` to the slot name.
Making TracerSource the shared container for the tracer config (sampler, processor, etc.) and the context seems like it would create some problems.
What would you do if you want a particular tracer to share context with the default tracer, but configure it to use a different span processor? Silencing noisy tracers from instrumented libraries is one of the main motivations for https://github.com/open-telemetry/oteps/blob/master/text/0016-named-tracers.md, and the span processor is the only mechanism the API provides to suppress span creation.
FWIW, it looks like the java client doesn't allow users to configure the context at all -- all TracerSdks use the global context. The only way to use a different context is to bring your own tracer impl.
I think there's a good argument for making the context configurable on a per-tracer or -factory basis, and the spec doesn't adequately explain the relationship between tracers and the context now.
> Silencing noisy tracers from instrumented libraries is one of the main motivations for https://github.com/open-telemetry/oteps/blob/master/text/0016-named-tracers.md, and the span processor is the only mechanism the API provides to suppress span creation.
I think I should actually add the instrumentation info to the arguments for the sampler. What do you think?
EDIT: On second thought, this is a breaking change for the sampler API, and we might want to gather all the sampler arguments into some SpanArguments object/namedtuple so that such a change doesn't break the API again in the future. I think that could be done in a separate PR to not drag this one out longer.
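A rough sketch of that SpanArguments idea, assuming a NamedTuple whose fields (including a future instrumentation_info) can grow with defaults without breaking existing sampler signatures; all names here are hypothetical:

```python
from typing import Mapping, NamedTuple, Optional, Sequence


class InstrumentationInfo(NamedTuple):
    name: str
    version: str


class SpanArguments(NamedTuple):
    """Everything a sampler might inspect about a prospective span."""

    name: str
    kind: str = "INTERNAL"
    attributes: Optional[Mapping] = None
    links: Sequence = ()
    # New fields get appended with defaults, so existing samplers keep working:
    instrumentation_info: Optional[InstrumentationInfo] = None


def should_sample(args: SpanArguments) -> bool:
    # Example policy: silence a known-noisy instrumentation entirely.
    info = args.instrumentation_info
    return not (info and info.name == "noisy-lib")


print(should_sample(SpanArguments(name="request")))  # True
```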
> Making TracerSource the shared container for the tracer config (sampler, processor, etc.) and the context seems like it would create some problems.
But that's what we decided on in the spec, after some discussion: open-telemetry/opentelemetry-specification#304 (and open-telemetry/opentelemetry-specification#339). tl;dr: While this may be a problem worth solving, "named tracers" was never meant as a solution for that problem.
I would also argue that sharing the same trace configuration is generally the preferred thing to do.
Potentially there's value in adding additional processors per tracer, thus chaining the processing. But the vendor tracing integrations I've seen sample and process globally anyway.
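A sketch of what that per-tracer chaining could look like: a composite processor that fans lifecycle callbacks out to several processors. The interface is reduced to the on_start/on_end/shutdown trio, and the composite itself is hypothetical here:

```python
class MultiSpanProcessor:
    """Fans span lifecycle callbacks out to a list of processors."""

    def __init__(self, *processors) -> None:
        self._processors = list(processors)

    def add_span_processor(self, processor) -> None:
        self._processors.append(processor)

    def on_start(self, span) -> None:
        for processor in self._processors:
            processor.on_start(span)

    def on_end(self, span) -> None:
        for processor in self._processors:
            processor.on_end(span)

    def shutdown(self) -> None:
        for processor in self._processors:
            processor.shutdown()
```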
    if not instrumenting_library_name:  # Reject empty strings too.
        instrumenting_library_name = "ERROR:MISSING LIB NAME"
        logger.error("get_tracer called with missing library name.")
    return Tracer(
Is there a reason not to cache the tracers? This class seems like a natural registry.
What would be the benefit of caching them? Creating a named tracer instance should be very cheap.
I could imagine it means more objects for the GC to collect. I think we should drive these types of requirements through benchmark tests in the repo. Ultimately the concern is performance, so a way to quantify the ramifications of design choices would be great.
Tracking ticket for that suggestion: #335
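For reference, a naive version of the registry/caching idea, using hypothetical stubs; whether the bookkeeping pays off is exactly what the benchmarks tracked in #335 would tell us:

```python
from typing import Dict, Tuple


class Tracer:
    def __init__(self, name: str, version: str) -> None:
        self.name = name
        self.version = version


class TracerSource:
    def __init__(self) -> None:
        self._tracers: Dict[Tuple[str, str], Tracer] = {}

    def get_tracer(self, name: str, version: str = "") -> Tracer:
        # The spec allows getTracer to return the same instance for the
        # same arguments, so caching on (name, version) is legal.
        key = (name, version)
        if key not in self._tracers:
            self._tracers[key] = Tracer(name, version)
        return self._tracers[key]


source = TracerSource()
assert source.get_tracer("opentelemetry.ext.flask") is source.get_tracer(
    "opentelemetry.ext.flask"
)
```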
I also see some stray
Thank you, I merged it. Maybe the permissions on the repository I'm using (it's in the dynatrace org) are overriding the "allow edits from maintainers" setting.
c24t left a comment:
Thanks for addressing my comments @Oberon00. I think there are still some issues to resolve in the spec, but I don't want to hold up this PR.
This should be ready to merge except for the merge conflict (which I can't resolve on this branch).
toumorokoshi left a comment:
Functionally everything looks great!
My one concern is the interface: "TracerSource" seems less intuitive to me than "TracerFactory". Although I understand that there is a distinction there, "Source" is not a design pattern I'm aware of, while "Factory" gives me a more immediate understanding of the purpose of the class.
    from opentelemetry import trace
    from opentelemetry.context import Context
    from opentelemetry.sdk.trace import Tracer
    from opentelemetry.sdk.trace import TracerSource
another idea on naming: "tracing". Since it's conceptually similar to the logging module, we could use that interface convention:

    tracing.get_tracer()

Although I'm also a fan of adding these into the top-level trace interface, as that's simpler:

    trace.get_tracer(__name__)
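A sketch of that simpler entry point, assuming a module-level function in trace that just delegates to the globally configured source, much like logging.getLogger; everything here is hypothetical, not this PR's API:

```python
# Hypothetical additions to the `trace` module:

_TRACER_SOURCE = None  # normally set via set_preferred_tracer_source_implementation


class TracerSource:  # stub standing in for the real implementation
    def get_tracer(self, name: str, version: str = ""):
        return (name, version)


def tracer_source() -> TracerSource:
    global _TRACER_SOURCE
    if _TRACER_SOURCE is None:
        _TRACER_SOURCE = TracerSource()
    return _TRACER_SOURCE


def get_tracer(instrumenting_library_name: str, version: str = ""):
    """Shorthand mirroring logging.getLogger(__name__)."""
    return tracer_source().get_tracer(instrumenting_library_name, version)
```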
    trace.set_preferred_tracer_implementation(lambda T: Tracer())
    tracer = trace.tracer()
    trace.set_preferred_tracer_source_implementation(lambda T: TracerSource())
    tracer = trace.tracer_source().get_tracer("myapp")
I worry that if we put all the examples as get_tracer("myapp"), a lot of people might not understand the convention of using the package/module name for the tracer, which would make integrations that are not part of this repo harder to turn on/off.
How about following Flask's example of passing in __name__? Another argument to just use module names, in my opinion.
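The Flask precedent mentioned here, for comparison; the tracer line follows this PR's diff and is commented out since it depends on the SDK being configured:

```python
from flask import Flask

# Flask established the same habit: the app is named after the module that
# creates it, which keeps extension and debugging behavior predictable.
app = Flask(__name__)

# By analogy, docs examples could consistently read:
# tracer = trace.tracer_source().get_tracer(__name__, __version__)
```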
    def test_shutdown(self):
        tracer = trace.Tracer()

    memory_exporter = InMemorySpanExporter()
nice! good catch on some DRY-able code :)
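Presumably the DRY-ed setup looks roughly like a shared test base that wires an InMemorySpanExporter into one TracerSource. A sketch only: the import paths and the SimpleExportSpanProcessor name follow this era's SDK layout and may differ, and the real helper lives in opentelemetry-ext-testutil:

```python
import unittest

from opentelemetry import trace as trace_api
from opentelemetry.sdk.trace import TracerSource
from opentelemetry.sdk.trace.export import SimpleExportSpanProcessor
from opentelemetry.sdk.trace.export.in_memory_span_exporter import (
    InMemorySpanExporter,
)


class WsgiTestBase(unittest.TestCase):
    """Shared fixture: one tracer source exporting into memory."""

    @classmethod
    def setUpClass(cls):
        cls.tracer_source = TracerSource()
        trace_api.set_preferred_tracer_source_implementation(
            lambda _: cls.tracer_source
        )
        cls.memory_exporter = InMemorySpanExporter()
        cls.tracer_source.add_span_processor(
            SimpleExportSpanProcessor(cls.memory_exporter)
        )

    def setUp(self):
        self.memory_exporter.clear()  # start each test with no finished spans
```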
@Oberon00 it looks like the build is failing on the tracecontext step due to a segfault. IIRC there was a fix with pinning the multidict version, which I'm having trouble finding. Is that the case here?

The build fix is on master (#330); we just need to pick up master here.

Just calling out that I changed this PR to use the module name instead of the library name, see #301 (comment).
Implements the "Tracers" part of #203.
EDIT: I chose the name TracerSource instead of TracerFactory because the consensus seems to be that TracerFactory is not an ideal name: the spec allows the same instance to be returned by different calls to getTracer, which means a TracerFactory is not really a factory. See also open-telemetry/opentelemetry-specification#354 (comment)