Skip to content

Implement Kahan Summation algorithm#71339

Closed
zkscpqm wants to merge 2 commits intoelastic:masterfrom
zkscpqm:master
Closed

Implement Kahan Summation algorithm#71339
zkscpqm wants to merge 2 commits intoelastic:masterfrom
zkscpqm:master

Conversation

@zkscpqm
Copy link
Copy Markdown

@zkscpqm zkscpqm commented Apr 6, 2021

Currently the Sum aggregation was simply done with += which can lead to inaccuracies when working with high-precision values.
The Kahan Summation algorithm checks for such inaccuracies by performing the sum and keeping track of lost precision points.

Edit 1: Updated to account for new values being larger than running sum

@elasticsearchmachine elasticsearchmachine added the external-contributor Pull request authored by a developer outside the Elasticsearch team label Apr 6, 2021
This takes into account the added value being higher than the running sum
@cbuescher cbuescher added the :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data label Apr 22, 2021
@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Apr 22, 2021
@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@nik9000
Copy link
Copy Markdown
Member

nik9000 commented Apr 22, 2021

For what it's worth the sum does use Kahan on the data nodes. This is for rollup. @csoulios might be the best to have a look.

@csoulios
Copy link
Copy Markdown
Contributor

@zkscpqm thank you for contributing this improvement.

There is already an implementation of KahanSummation implemented in CompensatedSum

Please have a look

final CompensatedSum kahanSummation = new CompensatedSum(0, 0);

Do you think you could modify your code so that it uses CompensatedSum?

@pugnascotia
Copy link
Copy Markdown
Contributor

@zkscpqm are you still interested in working on this PR?

@csoulios csoulios self-assigned this Jun 1, 2022
@csoulios
Copy link
Copy Markdown
Contributor

csoulios commented Jun 9, 2022

This PR has been idle for long.

Also, this feature has been implemented for the TSDB downsampling project (#87554)

Closing it.

@csoulios csoulios closed this Jun 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

external-contributor Pull request authored by a developer outside the Elasticsearch team :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants