feat: Customizable Kinesis Data Stream Autoscaling #27

Merged
jshlbrd merged 8 commits into main from jshlbrd/autoscaling-envvar
Sep 28, 2022
Conversation

@jshlbrd
Contributor

@jshlbrd jshlbrd commented Sep 19, 2022

Description

  • Adds environment variables for customizable Kinesis Data Stream (KDS) autoscaling behavior
  • Replaces AppConfig minimum and maximum shard settings with Kinesis tags
  • Fixes default Terraform settings for KDS datapoint alarms

Motivation and Context

We manage some KDS that require more aggressive autoscaling settings than others. The two stream behaviors we've observed are:

  • steady stream utilization, where streams increase and decrease at a steady rate hour to hour or day to day
  • bursty stream utilization, where streams increase and decrease suddenly minute to minute

The latter behavior requires customizable CloudWatch alarms, so this PR lets users override the number of datapoints that trigger a scaling event.

There are several parameters that could be tuned, but focusing on datapoints keeps the configuration simple and still addresses the majority of scaling use cases (i.e., independently scale up or down quickly or slowly).
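
To make the effect of the datapoints setting concrete, here is a minimal sketch of an "M out of N" alarm check, the evaluation model CloudWatch alarms use. The function name, the threshold, the `>=` comparison, and the sample values are all assumptions for illustration; this is not the code in this PR.

```python
from typing import Sequence


def alarm_fires(
    utilization: Sequence[float],
    threshold: float,
    datapoints_to_alarm: int,
    evaluation_periods: int,
) -> bool:
    """Return True if at least `datapoints_to_alarm` of the most recent
    `evaluation_periods` datapoints are at or above `threshold`."""
    if not 0 < datapoints_to_alarm <= evaluation_periods:
        raise ValueError("need 0 < datapoints_to_alarm <= evaluation_periods")
    window = utilization[-evaluation_periods:]
    breaches = sum(1 for value in window if value >= threshold)
    return breaches >= datapoints_to_alarm


# Hypothetical one-minute utilization samples (fraction of stream capacity)
# for a bursty stream: two of the five datapoints exceed 75%.
burst = [0.30, 0.35, 0.90, 0.95, 0.40]

# Aggressive setting: a single breaching datapoint triggers scaling.
print(alarm_fires(burst, threshold=0.75, datapoints_to_alarm=1, evaluation_periods=5))  # True

# Conservative setting: four breaching datapoints are required.
print(alarm_fires(burst, threshold=0.75, datapoints_to_alarm=4, evaluation_periods=5))  # False
```

With the same traffic, lowering the required datapoints makes the alarm react to a short burst, while raising it ignores the burst and only reacts to sustained load.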

This also introduces the ability for users to deploy multiple autoscaling Lambdas that address different scaling patterns. For example, 80% of data pipelines may use a "steady rate" autoscaling pattern and 20% of data pipelines may use a "burst rate" autoscaling pattern. This can all be managed via Terraform.
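
As a rough sketch of how one deployment could differ from another only by its environment, the snippet below reads datapoint overrides with fallbacks. The variable names (`UPSCALE_DATAPOINTS`, `DOWNSCALE_DATAPOINTS`) and the default values are hypothetical; the names actually used by this PR are not listed in this description.

```python
import os
from typing import Dict, Mapping

# Hypothetical variable names and defaults, for illustration only.
DEFAULTS: Dict[str, int] = {
    "UPSCALE_DATAPOINTS": 5,
    "DOWNSCALE_DATAPOINTS": 60,
}


def load_datapoints(env: Mapping[str, str] = os.environ) -> Dict[str, int]:
    """Read datapoint overrides from `env`, falling back to DEFAULTS."""
    settings: Dict[str, int] = {}
    for name, default in DEFAULTS.items():
        raw = env.get(name)
        value = default if raw is None or raw == "" else int(raw)
        if value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value}")
        settings[name] = value
    return settings


# A "steady rate" deployment sets nothing and gets the defaults.
steady = load_datapoints({})

# A "burst rate" deployment overrides both values to react faster.
bursty = load_datapoints({"UPSCALE_DATAPOINTS": "1", "DOWNSCALE_DATAPOINTS": "30"})

print(steady)
print(bursty)
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.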

Included is a breaking change that removes support for the autoscaling configuration file and replaces it with Kinesis tags. Using tags reduces the number of configurations that teams need to manage, and since the config directly affected Kinesis resources, I think it makes more sense to pair it with the resource in Terraform.
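
The tag-based approach can be sketched as follows: shard limits are parsed from a stream's tags and any proposed shard count is clamped to them. The tag keys (`MinimumShards`, `MaximumShards`), the fallback values, and both function names are assumptions for illustration, not confirmed by this description.

```python
from typing import Mapping, Tuple

# Hypothetical tag keys, for illustration only.
MIN_TAG = "MinimumShards"
MAX_TAG = "MaximumShards"


def shard_limits(
    tags: Mapping[str, str],
    default_min: int = 1,
    default_max: int = 64,  # assumed fallback, not taken from the PR
) -> Tuple[int, int]:
    """Parse (minimum, maximum) shard counts from a stream's tags."""
    lo = int(tags.get(MIN_TAG, default_min))
    hi = int(tags.get(MAX_TAG, default_max))
    if lo < 1 or hi < lo:
        raise ValueError(f"invalid shard limits: min={lo}, max={hi}")
    return lo, hi


def clamp_shards(target: int, limits: Tuple[int, int]) -> int:
    """Keep a proposed shard count within the tagged limits."""
    lo, hi = limits
    return max(lo, min(target, hi))


# Tags as they might appear on a stream defined in Terraform.
limits = shard_limits({"MinimumShards": "2", "MaximumShards": "8"})

print(clamp_shards(16, limits))  # 8: scale-up is capped at the maximum
print(clamp_shards(1, limits))   # 2: scale-down stops at the minimum
```

Because the limits travel with the stream's own tags, a separate configuration file is not needed to bound scaling for each stream.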

How Has This Been Tested?

Changes were tested in production with dozens of data pipelines that have variable stream utilization. Some have very high-volume, unpredictable traffic, and we haven't observed any issues.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

@jshlbrd jshlbrd marked this pull request as ready for review September 27, 2022 19:12
@jshlbrd jshlbrd requested a review from a team as a code owner September 27, 2022 19:12
@jshlbrd jshlbrd merged commit 2dd7ea7 into main Sep 28, 2022
@jshlbrd jshlbrd deleted the jshlbrd/autoscaling-envvar branch September 28, 2022 16:26
3 participants