Skip to content

Add SDK span processors spec#205

Merged
yurishkuro merged 12 commits intoopen-telemetry:masterfrom
pavolloffay:sdk-span-processors
Sep 20, 2019
Merged

Add SDK span processors spec#205
yurishkuro merged 12 commits intoopen-telemetry:masterfrom
pavolloffay:sdk-span-processors

Conversation

@pavolloffay
Copy link
Copy Markdown
Member

@pavolloffay pavolloffay commented Aug 7, 2019

Resolves #155
Resolves #54

Block when the queue is full processor will be added in a separate PR.

Signed-off-by: Pavol Loffay ploffay@redhat.com

@@ -0,0 +1,61 @@
# Span processor
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How to register a Span processor?
Do we allow multiple processors? (is there any guarantee on the ordering).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could guarantee the order. I have added this:

Span processors can be registered via a method on SDK Tracer. The registered processors are invoked in the same order as they were registered.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pavolloffay for answering the 2nd question.
Regarding the 1st question "How to register a Span processor", do you intend to cover it in a separate PR for the Tracer API change? Do we expect both register and unregister (or only allow to register during the Tracer initialization/ctor)?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently I implemented this with only Register and can be done at any moment.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have already documented the registration step. Here is the copy but please comment on the code if something should be fixed.

Span processors can be registered directly on SDK Tracer. The processors are invoked in the same order as they were registered.


* `exporter` - the exporter where the spans are pushed.
* `maxQueueSize` - the maximum size of the queue. After the size is reached spans are dropped.
* `scheduledDelayMilllis` - the delay interval between two consecutive exports.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is exportInterval, actualDelayMilllis = max(0, exportInterval - timeTakenToExport).
Wish to discuss here and see which one is preferred.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One scenario is "I want to export traces/logs/metrics every 5 seconds" and "I don't expect to export data every 7 seconds due to the fact that export operation itself is using 2 seconds in environment XYZ".

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One scenario is "I want to export traces/logs/metrics every 5 seconds" and "I don't expect to export data every 7 seconds due to the fact that export operation itself is using 2 seconds in environment XYZ".

The export should not be called concurrently, this is from exporter spec:

Export() will never be called concurrently for the same exporter instance. Export() can be called again only after the current call returns.


**Configurable parameters:**

* `exporter` - the exporter where the spans are pushed.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How to register multiple exporters?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least in Java there is MultiSpanExporter exporter implementation which exports data to multiple exporters.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bogdandrutu shall we add MultiSpanExporter to the spec? I think it's an implementation detail.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, let's not be too restrictive about the implementation.


## Interface definition

### OnStart(SpanData)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why SpanData?

Copy link
Copy Markdown
Member Author

@pavolloffay pavolloffay Aug 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main point it to make clear that passed object exposes data of the Span.

tracing-api doc references it https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/api-tracing.md#spandata. It's a readable span object.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The span processors are invoked only on sampled spans.

Although this is what the exporters defined by the SDK will do, I don't feel is really a requirement - after all, sampling exists as a hint (as of now).

**Configurable parameters:**

* `exporter` - the exporter where the spans are pushed.
* `maxQueueSize` - the maximum size of the queue. After the size is reached spans are dropped.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any default values for these params?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to add those here?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like the wrong level of abstraction for the spec. Most other defaults and constants aren't documented here unless they're defined in other places, e.g. the W3C header length limits.


### Simple processor

The implementation of `SpanProcessor` that passes ended span directly to the configured `SpanExporter`.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the relationship between processors and exporters? It seems like we're using processors to (1) expose span start and end hooks to vendors and (2) batch spans to send to the exporter. Why use the same component for both?

If processors can't modify the spans that they pass to the exporters, why not call the processors and exporters in parallel?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If processors can't modify the spans that they pass to the exporters, why not call the processors and exporters in parallel?

From exporter spec:

Allow implementing helpers as composable components that use the same chainable Exporter interface. SDK authors are encouraged to implement common functionality such as queuing, batching, tagging, etc. as helpers. This functionality will be applicable regardless of what protocol exporter is used.

The processor can be considered as a helper for the exporter. Perhaps we should document this in the first section. The exporter should only export and do not batch.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my mind the model was this:

SDK -> SpanProcessor (an implementation that batches all Spans) -> SpanExporter(List<SpanData/SpanProto>).

SpanExporter is one of the functionality that we can implement with the SpanProcessor, but for example some implementation may implement a SpanProcessor that adds extra attributes onStart.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bogdandrutu Maybe it would be worth adding a note here, to mention that (custom) SpanProcessors can actually modify the received Span?

The implementation of the `SpanProcessor` that batches ended spans and pushes them to the configured `SpanExporter`.

First the spans are added to a synchronized queue, then exported to the exporter pipeline in batches.
When the queue gets half full a preemptive notification is sent to the worker thread to wake up and start a new export cycle.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean we export faster than the scheduled interval when the queue is half full?

In any case this sounds to me like documentation for the java implementation. What about something like "The implementation is responsible for managing the span queue and sending batches of spans to the exporters."?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this text is originally from javadoc. We can make it more generic and give the implementors more freedom how the queue is managed and when it is flushed.


**Parameters:**

* `SpanData` - a readable span object.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if it's kosher to change the spec to suit pending RFCs, but open-telemetry/oteps#8 suggests this might have to be changed to "span" soon.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

open-telemetry/oteps#8 It talks about removing it from the API, it might happen that it will be moved to SDK spec no?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in that case it sounds like we shouldn't reference it from the spec.

Copy link
Copy Markdown
Member

@bg451 bg451 Aug 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something like SpanData can exists at the SDK level, there needs to be some sort of common concrete type that exporters can use. Having it at the API though added a more surface area and complexity though.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this moment (in Java) we actually get a low-level interface that returns the (read-only) protobuf representation of a Span. Wondering if that is, well, low-level ;)

@bg451
Copy link
Copy Markdown
Member

bg451 commented Aug 8, 2019

We recently discussed batched span processors in the javascript sig and some concerns came up. Batching can have a lot of parameters (backoff, retry, dropping, etc.), so our worry is that trying to provide an out of the box batching solution won't work for some vendors, requiring the user to do more upfront, vendor specific configuration. We decided in our sig meeting to send spans to all exporters on end, and span processors are a place where users/vendors can hook into for extra sauce. I personally don't mind making vendors write more code if it means users will be able to write less.

I'd really like to learn more about the thought process and previous discussion since #155 doesn't have any information.

@pavolloffay
Copy link
Copy Markdown
Member Author

The (backoff, retry, dropping, etc.) seems more related to the retry mechanism than the batching. I think the point here is to document something generic enough which will work across most languages.

I'd really like to learn more about the thought process and previous discussion since #155 doesn't have any information.

+1 perhaps @bogdandrutu or @SergeyKanzhelev could provide some details of what was discussed before.

@carlosalberto
Copy link
Copy Markdown
Contributor

We decided in our sig meeting to send spans to all exporters on end, and span processors are a place where users/vendors can hook into for extra sauce.

That makes me wonder whether different languages might actually need to provide different strategies here, depending on the thread/execution model...

@tedsuo
Copy link
Copy Markdown
Contributor

tedsuo commented Aug 13, 2019

I know that we are trying to document the current SDK behavior. But I really think we need to do more work to define the requirements for the SDK. It's hard to judge a design proposal such as this, without understanding the intended use cases.

Copy link
Copy Markdown
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it looks good, just some more thoughts that I have about this.


## Interface definition

### OnStart(SpanData)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to allow performance improvements and lazy initialized the SpanData I would suggest we go with a ReadableSpan which may expose some properties directly and a toSpanData or toSpanProto. The reason is that you don't want to construct that object all the time unless someone really needs it. Maybe a LazySpanData or something similar achieves the same thing.

Other fields that I have in mind may be information like isOutOfBand added in open-telemetry/oteps#8. So I think we need to expose an interface here that allows us to also expose other data.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I heard that some implementation of the SpanProcessor may want to add attributes to the Span itself or events. So it may be interesting to here if the Span itself should be exposed (maybe the SdkSpan that can have extra capabilities).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't use ReadableSpan because it is not defined anywhere, whereas SpanData is defined in tracing spec.

I will rename SpanData to just Span. The comment in the parameters will says:

  • Span - a readable span object.


### Simple processor

The implementation of `SpanProcessor` that passes ended span directly to the configured `SpanExporter`.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my mind the model was this:

SDK -> SpanProcessor (an implementation that batches all Spans) -> SpanExporter(List<SpanData/SpanProto>).

SpanExporter is one of the functionality that we can implement with the SpanProcessor, but for example some implementation may implement a SpanProcessor that adds extra attributes onStart.


First the spans are added to a synchronized queue, then exported to the exporter pipeline in batches.
The implementation is responsible for managing the span queue and sending batches of spans to the exporters.
This processor can cause high contention in a very high traffic service.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would like to understand more about high contention. Is this because of the queue insertion operation? There are lock-free queue implementations which doesn't have high contention even under heavy traffic.

Another question - do we expect the queue to guarantee FIFO?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even a lock-free queue can cause false sharing which is also very bad and hard to quantify.

batching does not require to guarantee fifo, and also can drop spans when high traffic (based on a config). For an example of this see https://github.com/open-telemetry/opentelemetry-java/blob/master/sdk/src/main/java/io/opentelemetry/sdk/trace/export/BatchSampledSpansProcessor.java

We also execute the export on a different thread.

Span processor is an interface which allows hooks for span start and end method invocations.
The span processors are invoked only on sampled spans. This interface can be used as a helper for span exporter to batch and convert spans see [sdk-exporter-spec](sdk-exporter.md).

Span processors can be registered directly on SDK Tracer. The processors are invoked in the same order as they were registered.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this interface defined?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would want to see something like Tracer.RegisterSpanProcessor covered in this PR.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that should belong to the tracer SDK spec, which we don't have at the moment.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have created #217

@pavolloffay
Copy link
Copy Markdown
Member Author

@bogdandrutu @reyang @c24t I have updated the PR. The only change was to remove SpanData. I think we should also agree whether to define default values for batching span processor. I think it makes sense to have the same values across languages.


**Configurable parameters:**

* `exporter` - the exporter where the spans are pushed.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, let's not be too restrictive about the implementation.

@@ -0,0 +1,80 @@
# Span processor

Span processor is an interface which allows hooks for span start and end method invocations.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe an answer:

(a) if I must make a copy of the data, it's a Processor
(b) if I can take a reference to the data, it's an Exporter

In my understanding, it's rather the other way round: A Span-processor gets a mutable reference to a single Span synchronously. A exporter on the other hand may get one or more read-only Spans (or some data transfer object to which the Span was converted) delayed, asynchronously or never (queue full).


**Parameters:**

* `Span` - a readable span object.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather like the Span processor to get a mutable reference to the same span object that is started/ended. I imagine a common use-case for span processors would be to attach context information like thread/coroutine ID, etc.
Although it is true that the Processor (at least in OnEnd) should also be able to read the data, which may or may not be possible with a sdk.Span object. Maybe we must specify an additional Span interface with getters?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we must specify an additional Span interface with getters?

The SpanData could be considered as a readable interface, however it is going to be removed #215

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we eliminated SpanData from the user-facing API. I could believe it's still meant as a part of the SDK processor API.

Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
@pavolloffay
Copy link
Copy Markdown
Member Author

Can anybody merge this API or pinpoint what should be improved or what is missing.

cc) @open-telemetry/specs-approvers @bogdandrutu @yurishkuro @SergeyKanzhelev

@yurishkuro yurishkuro merged commit 7d7c3aa into open-telemetry:master Sep 20, 2019
SergeyKanzhelev pushed a commit to SergeyKanzhelev/opentelemetry-specification that referenced this pull request Feb 18, 2020
* Add SDK span processors spec

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Fixes

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Update review comments

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Small fixes

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Remove span data

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Remove sampled

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Add is recording events

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* change style of the links

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Add ascii diagram

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Smaller improvements based on review comments

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Fix typo

Signed-off-by: Pavol Loffay <ploffay@redhat.com>
TuckTuckFloof pushed a commit to TuckTuckFloof/opentelemetry-specification that referenced this pull request Oct 15, 2020
carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this pull request Oct 21, 2024
carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this pull request Oct 23, 2024
carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this pull request Oct 31, 2024
* Add SDK span processors spec

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Fixes

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Update review comments

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Small fixes

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Remove span data

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Remove sampled

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Add is recording events

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* change style of the links

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Add ascii diagram

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Smaller improvements based on review comments

Signed-off-by: Pavol Loffay <ploffay@redhat.com>

* Fix typo

Signed-off-by: Pavol Loffay <ploffay@redhat.com>
schmikei pushed a commit to schmikei/opentelemetry-specification that referenced this pull request Apr 17, 2025
Co-authored-by: Joao Grassi <joao@joaograssi.com>
Co-authored-by: Trask Stalnaker <trask.stalnaker@gmail.com>
Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Describe built-in SpanProcessors SDK: document SpanProcessor API