perf(aws): Reduce Aggregated Kinesis Record Size #147
Merged
internal/aws/kinesis/kinesis.go
- kplMaxBytes = 1024 * 1024
  kplMagicLen = 4 // Length of magic header for KPL Aggregate Record checking.
  kplDigestSize = 16 // MD5 Message size for protobuf.
+ kplMaxBytes = 1000 * 25 // 25KB is the minimum size of a PUT Payload unit.
Contributor
docs: the variable name says "maximum" while the comment says "minimum", which is a bit confusing. Maybe a comment could clear up how this is used, or we could change the name?
Note: this is non-blocking to me, I just spent some time looking at how the variable is used and at the AWS restrictions behind it.
Contributor
Author
I think the comment can be improved: the value is the maximum number of bytes we want to allow in the aggregated (KPL) record, because it is the minimum size that the Kinesis service charges for.
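The billing reasoning in this reply can be sketched with a little arithmetic. The snippet below is only an illustration and not code from this PR: `putPayloadUnitBytes` and `putPayloadUnits` are hypothetical names, and the unit size of 25,000 bytes is an assumption taken from the `1000 * 25` value in the diff. It rounds a record size up to whole PUT payload units, which shows that any record of up to 25 KB is charged as one unit.

```go
package main

import "fmt"

// putPayloadUnitBytes is the assumed size of one PUT payload unit.
// It mirrors the 1000 * 25 value used for kplMaxBytes in the diff above.
const putPayloadUnitBytes = 1000 * 25

// putPayloadUnits (hypothetical helper) returns how many PUT payload units
// a record of the given size would be counted as, rounding up to whole units.
func putPayloadUnits(recordBytes int) int {
	if recordBytes <= 0 {
		return 0
	}
	return (recordBytes + putPayloadUnitBytes - 1) / putPayloadUnitBytes
}

func main() {
	for _, size := range []int{5_000, 25_000, 25_001, 1_000_000} {
		fmt.Printf("%d bytes -> %d unit(s)\n", size, putPayloadUnits(size))
	}
}
```

Under these assumptions a 5,000-byte record and a 25,000-byte record both count as one unit, so filling an aggregated record up to the unit size costs nothing extra, while a single additional byte starts a second unit.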
shellcromancer approved these changes Mar 19, 2024
Description
Replaces the PutRecord call with PutRecords
Motivation and Context
Kinesis Data Streams charges based on PUT payload units, which have a minimum size of 25 KB. Historically the project aggregated up to the Kinesis limit of 1 MB because there was no support for the PutRecords API, but as of v1.0 there is. There's little (or no) value in sending aggregated records larger than 25 KB, and sending them via PutRecords increases throughput by roughly 33% to 50% (see screenshot).
How Has This Been Tested?
This was tested on a high-volume, production data pipeline.
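For readers who want a picture of the approach described under Motivation and Context, here is a minimal sketch of capping aggregates at 25 KB and then grouping them into PutRecords-sized requests. It is not the project's implementation: `aggregate`, `batch`, and the constants are hypothetical names, the aggregate size is approximated as the sum of raw record lengths (KPL protobuf framing, the magic header, and the MD5 digest are ignored), and partition key bytes are left out of the request size. The limits of 500 records and 5 MiB per PutRecords request are the documented service limits.

```go
package main

import "fmt"

const (
	maxAggregateBytes = 1000 * 25       // assumed cap per aggregate, mirrors kplMaxBytes
	maxBatchEntries   = 500             // PutRecords accepts at most 500 records per request
	maxBatchBytes     = 5 * 1024 * 1024 // PutRecords accepts at most 5 MiB per request
)

// aggregate (hypothetical) groups records so that each group's total payload
// stays within maxAggregateBytes. An oversized record gets its own group.
func aggregate(records [][]byte) [][][]byte {
	var groups [][][]byte
	var cur [][]byte
	size := 0
	for _, r := range records {
		if len(cur) > 0 && size+len(r) > maxAggregateBytes {
			groups = append(groups, cur)
			cur, size = nil, 0
		}
		cur = append(cur, r)
		size += len(r)
	}
	if len(cur) > 0 {
		groups = append(groups, cur)
	}
	return groups
}

// batch (hypothetical) splits aggregates, given by their sizes in bytes, into
// [start, end) index ranges that respect both PutRecords request limits.
func batch(sizes []int) [][2]int {
	var out [][2]int
	start, total := 0, 0
	for i, s := range sizes {
		if i > start && (i-start == maxBatchEntries || total+s > maxBatchBytes) {
			out = append(out, [2]int{start, i})
			start, total = i, 0
		}
		total += s
	}
	if start < len(sizes) {
		out = append(out, [2]int{start, len(sizes)})
	}
	return out
}

func main() {
	// 100 records of 10,000 bytes each: two fit in one 25,000-byte aggregate.
	records := make([][]byte, 0, 100)
	for i := 0; i < 100; i++ {
		records = append(records, make([]byte, 10_000))
	}
	groups := aggregate(records)
	sizes := make([]int, len(groups))
	for i, g := range groups {
		for _, r := range g {
			sizes[i] += len(r)
		}
	}
	fmt.Printf("%d aggregates in %d request(s)\n", len(groups), len(batch(sizes)))
}
```

Note that the byte limit, not the 500-record limit, is what bounds a request when every aggregate is full: 500 aggregates of 25,000 bytes would be about 12.5 MB, so a real implementation has to check both.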