Propose new StructuredBody field for logs #3014
djaglowski wants to merge 1 commit into open-telemetry:main
  | resource | MonitoredResource | The monitored resource that produced this log entry. | Resource |
  | log_name | string | The URL-encoded LOG_ID suffix of the log_name field identifies which log stream this entry belongs to. | Attributes["gcp.log_name"] |
- | json_payload | google.protobuf.Struct | The log entry payload, represented as a structure that is expressed as a JSON object. | Body |
+ | json_payload | google.protobuf.Struct | The log entry payload, represented as a structure that is expressed as a JSON object. | StructuredBody |
This does roughly align but may need some more thought from the Google side to line up properly; if there is a Body with the message, that message should likely go into jsonPayload.message. Not sure what the best way to codify that here would be.
Also, how does this reconcile with current users who may be parsing structured data into body? If body is structured, will it still go to json_payload too, or do we start flattening anything that's in body? This could be an implementation detail.
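To make the two questions above concrete, here is a minimal sketch of one possible exporter rule. It is not decided behavior: the function name `to_gcp_payload` is hypothetical, and both the choice to copy a string Body into the "message" key and the choice to keep routing a structured Body to json_payload are assumptions made only for illustration.

```python
def to_gcp_payload(body, structured_body=None):
    """Pick a LogEntry payload field for a log record (illustrative only).

    Returns a (field_name, value) pair. The rules are assumptions:
    - StructuredBody, when present, becomes json_payload, and a string Body
      is copied into its "message" key unless that key already exists.
    - A structured Body with no StructuredBody still goes to json_payload.
    - Anything else is written as text_payload.
    """
    if structured_body is not None:
        payload = dict(structured_body)  # copy so the caller's data is untouched
        if isinstance(body, str) and body:
            payload.setdefault("message", body)
        return "json_payload", payload
    if isinstance(body, dict):
        return "json_payload", dict(body)
    return "text_payload", "" if body is None else str(body)
```

Under these assumptions, existing users who parse structured data into the Body would see no change, since a structured Body is passed through rather than flattened.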
I am not sure I understand how is |
We've codified in the spec that On the other hand,
If we think it is necessary to capture structured data, I think it is preferable to remove the limitation that the Body should be a string. Adding another field complicates the data model and is confusing.
I mostly agree, but this would be a breaking change, right? I think what I've proposed avoids that, at least in the data model. One additional benefit is that having both
We use a SHOULD clause in the data model. I see no problem with adding a list of exceptions to this SHOULD clause and saying that "in these cases you are allowed to use structured data in the Body". We already say "However, a structured body may be necessary to preserve the semantics of some existing log formats". Any number of similar exceptions can be added, and it won't be a breaking change.
This was discussed in the Log SIG today. It was decided that we should add clarification to the log data model's description of the Body, to the effect that structured logs emitted by third-party applications SHOULD use the Body for the structured data.
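As a rough illustration of that decision, the sketch below contrasts the two cases. The `LogRecord` class and both helper functions are hypothetical stand-ins for this comment, not SDK types; only the Body and Attributes fields of the data model are represented.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class LogRecord:
    """Hypothetical, trimmed-down stand-in for a log record."""
    body: Any = None  # a string or structured data
    attributes: Dict[str, Any] = field(default_factory=dict)


def first_party_record(message: str, **attributes: Any) -> LogRecord:
    # First-party applications: string message in Body, details in Attributes.
    return LogRecord(body=message, attributes=dict(attributes))


def third_party_record(parsed: Dict[str, Any]) -> LogRecord:
    # Structured logs from third-party applications: the structure itself
    # is the Body, and Attributes stays available for data about the log.
    return LogRecord(body=dict(parsed))
```

For example, `first_party_record("user logged in", user_id=42)` keeps a string Body, while `third_party_record({"level": "warn", "msg": "slow query"})` carries the whole structure in the Body.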
We're very early adopters of OTel Logs, and we've been capturing the JSON logs of our applications and parsing the strings into a KeyValueList in the AnyValue (it turns out you can decode JSON very easily into KVLs). Granted, we have our own custom exporters for Google and DataDog logs that turn them into the appropriate format. With StructuredBody, it becomes quite confusing. What do I do when we have KVLs, and what if I decide to structure my logs as Protobuf messages with the current body? It's simple: I add it as a Proto AnyType in the bytes field. I would argue that the Protobuf message is more structured than the StructuredBody.
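For readers who have not tried it, the JSON-to-KVL decoding described above can be sketched roughly as follows. This is not the commenter's actual code: plain dicts keyed by the AnyValue proto field names (string_value, kvlist_value, and so on) stand in for the generated protobuf classes, and the function names are hypothetical.

```python
import json


def to_any_value(value):
    """Convert a decoded JSON value into an AnyValue-shaped dict."""
    if value is None:
        return {}  # models an AnyValue with no field set
    if isinstance(value, bool):  # bool first: bool is a subclass of int
        return {"bool_value": value}
    if isinstance(value, int):
        return {"int_value": value}
    if isinstance(value, float):
        return {"double_value": value}
    if isinstance(value, str):
        return {"string_value": value}
    if isinstance(value, list):
        return {"array_value": {"values": [to_any_value(v) for v in value]}}
    if isinstance(value, dict):
        return {
            "kvlist_value": {
                "values": [
                    {"key": k, "value": to_any_value(v)} for k, v in value.items()
                ]
            }
        }
    raise TypeError(f"unsupported JSON value: {type(value).__name__}")


def json_line_to_body(line):
    """Parse one JSON log line into a Body-shaped AnyValue dict."""
    return to_any_value(json.loads(line))
```

Every JSON type has a direct AnyValue counterpart, which is why the mapping is a short recursive function.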
See decision by Log SIG noted in [open-telemetry#3014](open-telemetry/opentelemetry-specification#3014 (comment))
Motivation
The log data model has already been declared stable, yet an important decision made by the Logs SIG has not been codified in the specification. Specifically, it was the intention of the Logs SIG that Attributes should be the appropriate field for representing structured log data. Several proposals to codify this notion in the spec have stalled out.
This proposal attempts to identify an alternative that would provide an explicit home for structured data, without breaking the data model.
Obviously this change may have implications for the SDK, Collector, etc., but I am suggesting that we fully explore this route in case it leads to a broadly acceptable solution.
Changes
The proposal is to add an optional new field, tentatively called StructuredBody. This field would be dedicated to structured log data.
Attributes would still be intended for information about the log (e.g. user-specified values or semantic conventions).
Body would remain unchanged as well. Notably, the indication that "First-party Applications SHOULD use a string message." would still be valid.
StructuredBody would also be an alternative to the "data" field proposed in #2926.
Related issues
Body and Attributes #1613
Related OTEP(s)