Skip to content

Fixes DocStats to properly deal with shards that report -1 index size#27863

Merged
colings86 merged 1 commit intoelastic:6.xfrom
colings86:fix/docstatsIndexSize
Dec 20, 2017
Merged

Fixes DocStats to properly deal with shards that report -1 index size#27863
colings86 merged 1 commit intoelastic:6.xfrom
colings86:fix/docstatsIndexSize

Conversation

@colings86
Copy link
Copy Markdown
Contributor

Previously to this change when DocStats are added together (for example when adding the index size of all primary shards for an index) we naively added the totalSizeInBytes together. This worked most of the time but not when the index size on one or multiple shards was reported to be -1 (no value).

This change improves the logic by considering if the current value or the value to be added is -1:

  • If the current and new value are both -1 the value remains at -1
  • If the current value is -1 and the new value is not -1, current value is changed to be equal to the new value
  • If the current value is not -1 and the new value is -1 the new value is ignored and the current value is not changed
  • If both the current and new values are not -1 the current value is changed to be equal to the sum of the current and new values.

The change also re-enables the failing rollover YAML test that was failing due to this bug.

@colings86 colings86 added :Core/Infra/Stats Statistics tracking and retrieval APIs >bug review v6.2.0 v7.0.0 labels Dec 18, 2017
@colings86 colings86 self-assigned this Dec 18, 2017
@colings86 colings86 changed the title Fixes DocStats to not report index size < -1 Fixes DocStats to properly deal with shards that report -1 index size Dec 18, 2017
@colings86
Copy link
Copy Markdown
Contributor Author

Note that this PR is against the 6.x branch as that is where the YAML test failed and therefore where I found the bug. I can either forward port this to master when merged or I can rebase against master as suggested by the reviewer(s).

Also note that I'm not 100% sure if this is the right fix for this I am open to better suggestions for how we should deal with this case (maybe we should never report the index size as -1 in the stats?) but this seemed like the least breaking change.

@colings86
Copy link
Copy Markdown
Contributor Author

Also note that the reason the stats report -1 currently is that DocStats are created in IndexShard.docStats() which gets its value from the Lucene class SegmentCommitInfo which initialises sizeInBytes to -1.

Copy link
Copy Markdown
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left two comments.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover import?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I'll remove

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these the right versions?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think so, these version limits were on the test prior to me muting it, see 1e9e73a

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but what about mixed cluster tests where older nodes will be running the buggy code?

Copy link
Copy Markdown
Contributor Author

@colings86 colings86 Dec 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be ok (though I'm willing to be proven wrong 😄 ) since the stats are combined on a single node (they are not incrementally added together on different nodes as far as I can see), so if this is running on a 5.x node it will run as it did in 5.x (possibly with negative BytesSizeValues but that is allowed on 5.x) and pass (as it does on the pure 5.x tests). If the test runs against a 6.x node it will have the DocsStats fix to ensure the sizes are added together correctly and the least value passed to the ByteSizeValue will be -1 which is allowed in 6.x.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasontedor do you still have concerns here or are you happy for me to merge this PR?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying.

Copy link
Copy Markdown
Contributor

@bleskes bleskes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (this.totalSizeInBytes == -1) {
this.totalSizeInBytes = other.totalSizeInBytes;
} else if (other.totalSizeInBytes != -1) {
this.totalSizeInBytes += other.totalSizeInBytes;
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ changed in 6832292

DaveCTurner added a commit that referenced this pull request Dec 19, 2017
Follows on from 1e9e73a, awaiting the merge of
#27863
@DaveCTurner
Copy link
Copy Markdown
Member

DaveCTurner commented Dec 19, 2017

In c20d1dc I muted:

"Rollover index via API":
- skip:
version: "all"
reason: "AwaitsFix https://github.com/elastic/elasticsearch/pull/27863"

"Max docs rollover conditions matches only primary shards":
- skip:
version: "all"
reason: "AwaitsFix https://github.com/elastic/elasticsearch/pull/27863"

Also I added the URL of this PR to the previously-muted:

"Rollover no condition matched":
- skip:
version: "all"
reason: "AwaitsFix https://github.com/elastic/elasticsearch/pull/27863"

Please could you restore these?

@colings86
Copy link
Copy Markdown
Contributor Author

@DaveCTurner thanks, I've unmuted those tests in this PR

Previously to this change when DocStats are added together (for example when adding the index size of all primary shards for an index)  we naively added the `totalSizeInBytes` together. This worked most of the time but not when the index size on one or multiple shards was reported to be `-1` (no value).

This change improves the logic by considering if the current value or the value to be added is `-1`:
* If the current and new value are both `-1` the value remains at `-1`
* If the current value is `-1` and the new value is not `-1`, current value is changed to be equal to the new value
* If the current value is not `-1` and the new value is `-1` the new value is ignored and the current value is not changed
* If both the current and new values are not `-1` the current value is changed to be equal to the sum of the current and new values.

The change also re-enables the failing rollover YAML test that was failing due to this bug.
Copy link
Copy Markdown
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@colings86 colings86 merged commit 64ab5f6 into elastic:6.x Dec 20, 2017
@colings86
Copy link
Copy Markdown
Contributor Author

I have marked this PR as backport pending but in fact in this case a forward port to master is required. I will do this as soon as I have ensured the 6.x intake build passes.

colings86 added a commit that referenced this pull request Dec 20, 2017
Previously to this change when DocStats are added together (for example when adding the index size of all primary shards for an index)  we naively added the `totalSizeInBytes` together. This worked most of the time but not when the index size on one or multiple shards was reported to be `-1` (no value).

This change improves the logic by considering if the current value or the value to be added is `-1`:
* If the current and new value are both `-1` the value remains at `-1`
* If the current value is `-1` and the new value is not `-1`, current value is changed to be equal to the new value
* If the current value is not `-1` and the new value is `-1` the new value is ignored and the current value is not changed
* If both the current and new values are not `-1` the current value is changed to be equal to the sum of the current and new values.

The change also re-enables the failing rollover YAML test that was failing due to this bug.
@colings86
Copy link
Copy Markdown
Contributor Author

Forward port is now done

@colings86 colings86 deleted the fix/docstatsIndexSize branch December 20, 2017 14:46
martijnvg added a commit that referenced this pull request Dec 21, 2017
* es/master: (45 commits)
  Adapt scroll rest test after backport. relates #27842
  Move early termination based on index sort to TopDocs collector (#27666)
  Upgrade beats templates that we use for bwc testing. (#27929)
  ingest: upgraded ingest geoip's geoip2's dependencies.
  [TEST] logging for update by query test #27820
  Add elasticsearch-nio jar for base nio classes (#27801)
  Use full profile on JDK 10 builds
  Require Gradle 4.3
  Enable grok processor to support long, double and boolean (#27896)
  Add unreleased v6.1.2 version
  TEST: reduce blob size #testExecuteMultipartUpload
  Check index under the store metadata lock (#27768)
  Fixes DocStats to not report index size < -1 (#27863)
  Fixed test to be up to date with the new database files.
  Upgrade to Lucene 7.2.0. (#27910)
  Disable TestZenDiscovery in cloud providers integrations test
  Use `_refresh` to shrink the version map on inactivity (#27918)
  Make KeyedLock reentrant (#27920)
  ingest: Upgraded the geolite2 databases.
  [Test] Fix IndicesClientDocumentationIT (#27899)
  ...
martijnvg added a commit that referenced this pull request Dec 21, 2017
* es/6.x: (43 commits)
  ingest: upgraded ingest geoip's geoip2's dependencies.
  [TEST] logging for update by query test #27820
  Use full profile on JDK 10 builds
  Require Gradle 4.3
  Add unreleased v6.1.2 version
  TEST: reduce blob size #testExecuteMultipartUpload
  Check index under the store metadata lock (#27768)
  Upgrade to Lucene 7.2.0. (#27910)
  Fixed test to be up to date with the new database files.
  Use `_refresh` to shrink the version map on inactivity (#27918)
  Make KeyedLock reentrant (#27920)
  Fixes DocStats to not report index size < -1 (#27863)
  Disable TestZenDiscovery in cloud providers integrations test
  ingest: Upgraded the geolite2 databases.
  [Issue-27716]: CONTRIBUTING.md IntelliJ configurations settings are confusing. (#27717)
  [Test] Fix IndicesClientDocumentationIT (#27899)
  Move uid lock into LiveVersionMap (#27905)
  Mute testRetentionPolicyChangeDuringRecovery
  Increase Gradle heap space to 1536m
  Move GlobalCheckpointTracker and remove SequenceNumbersService (#27837)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>bug :Core/Infra/Stats Statistics tracking and retrieval APIs v6.2.0 v7.0.0-beta1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants