
refactor!: Pre-release Refactor#5

Merged
jshlbrd merged 62 commits into main from jshlbrd/refactor
Jun 17, 2022


@jshlbrd jshlbrd commented Jun 17, 2022

Description

This PR is a total refactor of the core application and is planned to be our only breaking refactor before moving to version 1.0. Along with the app changes comes better documentation, clearer Terraform and Jsonnet configurations, and improvements in error chaining.

Major changes include:

  • Replacing the Channeler interface with a Slicer interface
    • Instead of passing channels of data between processors, the app now passes slices of data
    • In testing, we saw up to a 20% improvement in CPU performance from this change, with little impact on memory performance
    • Users who want to mimic the Channeler functionality can recreate it by reading the output of a Slicer into a channel
  • Supporting raw data processing
    • Instead of treating all data as JSON, users can now send raw data through the system (see inspector and processor documentation for more detail)
  • Standardized processor inputs
    • All processors now accept a standard input (InputKey) instead of supporting per-processor inputs
    • For processors that used to accept multiple inputs, those inputs should now be copied into a single array before being processed
    • For the DynamoDB and Lambda processors, those inputs should now be copied into a single map (e.g., JSON object)
  • Zip processor is now named Group
  • Kinesis shard redistribution
    • The Kinesis sink now supports user-defined data redistribution across shards (see documentation for more detail)
  • Error reporting
    • Errors now use error chaining, which makes it easier to debug data processing failures
  • Benchmarks
    • All inspectors and processors have benchmark coverage for all tests

Motivation and Context

We anticipated doing a total refactor at some point before the first major release; this is that PR.

How Has This Been Tested?

  • All unit tests pass
  • We've been running these changes in production on some of our busiest data pipelines for months

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

@jshlbrd jshlbrd marked this pull request as ready for review June 17, 2022 15:41
@jshlbrd jshlbrd requested a review from a team as a code owner June 17, 2022 15:41
@jshlbrd jshlbrd merged commit c89ced4 into main Jun 17, 2022
@jshlbrd jshlbrd deleted the jshlbrd/refactor branch June 17, 2022 15:43
@jshlbrd jshlbrd removed the request for review from a team June 17, 2022 15:56