
S3 Event Decoding Consistency #1687

@dlvenable

Description


Is your feature request related to a problem? Please describe.

The s3 source includes two codecs in 1.5, and a new codec for CSV processing is coming in 2.0. These populate Events somewhat differently.

  • newline-delimited -> Each line is saved to the message key of the Event as a single string.
  • json -> The JSON is expanded into message. So, if the JSON has a key named sourceIp, it is populated in /message/sourceIp.
  • csv -> Each key is expanded directly into the root of the Event (/). Thus, if the CSV has a key named sourceIp, it is populated in /sourceIp.
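The three resulting Event shapes can be sketched as follows. This is a minimal illustration only; the decode functions are hypothetical and are not Data Prepper APIs.

```python
import csv
import io
import json

def decode_newline(line: str) -> dict:
    # newline-delimited: the whole line lands under "message" as one string
    return {"message": line}

def decode_json(raw: str) -> dict:
    # json: the parsed object is expanded under "message",
    # so a sourceIp field ends up at /message/sourceIp
    return {"message": json.loads(raw)}

def decode_csv(raw: str) -> dict:
    # csv: each column is expanded directly into the Event root,
    # so a sourceIp column ends up at /sourceIp
    reader = csv.DictReader(io.StringIO(raw))
    return dict(next(reader))
```

With the same logical record, `decode_json` yields a nested `message` object while `decode_csv` writes columns to the root, which is the inconsistency this issue describes.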

Also, the s3 source adds two special keys to all Events: bucket and key. These indicate the S3 bucket and key, respectively, for the object. The s3 source populates these, not the codecs.
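The interaction with root-level codec output can be sketched as below (illustrative helper, not a Data Prepper API). Because the csv codec writes columns to the Event root, a column literally named bucket or key would collide with these special keys.

```python
def add_s3_fields(event: dict, bucket: str, key: str) -> dict:
    # Sketch: the source adds "bucket" and "key" to every Event after
    # the codec runs. A root-level codec field with the same name
    # would be overwritten here.
    event["bucket"] = bucket
    event["key"] = key
    return event

event = add_s3_fields({"sourceIp": "10.0.0.1"}, "my-bucket", "logs/app.log")
```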

Describe the solution you'd like

First, all codecs should put the data in the same place consistently. Second, we should decide where we want this data to reside (/message or /). Third, it should avoid conflicting with the bucket and key.

One possible solution is to change the s3 source to save the bucket and key to a top-level object named s3. Then the codecs save to the root (/). This could lead to conflicts if the actual data has a column or field named s3. But, if we make this key configurable, then pipeline authors could potentially avoid this.
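A sketch of this proposed layout, assuming a configurable top-level key that defaults to s3 (function and parameter names are hypothetical):

```python
def wrap_with_s3_info(data: dict, bucket: str, key: str, s3_key: str = "s3") -> dict:
    # Codec output stays at the Event root; the S3 coordinates are
    # grouped under a single, configurable top-level key.
    event = dict(data)
    event[s3_key] = {"bucket": bucket, "key": key}
    return event

event = wrap_with_s3_info({"sourceIp": "10.0.0.1"}, "my-bucket", "logs/app.log")
```

A pipeline whose data already contains an s3 column could set the key to something else (say, aws_s3) to sidestep the conflict.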

Describe alternatives you've considered (Optional)

An alternative would be more robust support for Event metadata. The bucket and key could be saved as metadata. However, Data Prepper's conditional routing and processors don't support Event metadata presently.
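The metadata alternative can be sketched as a structural separation (an illustrative shape only, not Data Prepper's actual Event model):

```python
class Event:
    """Sketch: Event body and metadata kept in separate spaces."""
    def __init__(self, data: dict, metadata: dict):
        self.data = data          # populated by codecs; visible to routing
        self.metadata = metadata  # bucket/key would live here, out of the body

event = Event(
    data={"sourceIp": "10.0.0.1"},
    metadata={"bucket": "my-bucket", "key": "logs/app.log"},
)
```

This removes any chance of key collisions in the body, but as noted above it only helps once conditional routing and processors can read metadata.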

Additional context

Metadata

Labels: enhancement (New feature or request)

Status: Done