Create cache files with CREATE_NEW & SPARSE options#79371
Merged
elasticsearchmachine merged 4 commits intoelastic:masterfrom Oct 19, 2021
Conversation
Collaborator
Pinging @elastic/es-distributed (Team:Distributed)
original-brownbear
approved these changes
Oct 19, 2021
Contributor
original-brownbear
left a comment
LGTM, though maybe adjust the naming a little :)
 * Indicates if the file should be created when it is open for the first time.
 * This is required to pass the right options for sparse file support.
 */
private volatile boolean created;
Contributor
Maybe name this fileExists and just describe it as "true if the physical cache file exists on disk" or so? I find the description and name very confusing tbh.
ywelsch
approved these changes
Oct 19, 2021
Member
Author
Thanks Yannick and Armin!
tlrx
added a commit
to tlrx/elasticsearch
that referenced
this pull request
Oct 19, 2021
* Create cache files with CREATE_NEW & SPARSE options * assert + doc * rename
Collaborator
💔 Backport failed
You can use sqren/backport to manually backport by running |
tlrx
added a commit
to tlrx/elasticsearch
that referenced
this pull request
Oct 19, 2021
This commit adds the bug elastic#79371 as a known issue in documentation from 7.12.0 to 7.15.1.
weizijun
added a commit
to weizijun/elasticsearch
that referenced
this pull request
Oct 19, 2021
* upstream/master: Validate tsdb's routing_path (elastic#79384) Adjust the BWC version for the return200ForClusterHealthTimeout field (elastic#79436) API for adding and removing indices from a data stream (elastic#79279) Exposing the ability to log deprecated settings at non-critical level (elastic#79107) Convert operator privilege license object to LicensedFeature (elastic#79407) Mute SnapshotBasedIndexRecoveryIT testSeqNoBasedRecoveryIsUsedAfterPrimaryFailOver (elastic#79456) Create cache files with CREATE_NEW & SPARSE options (elastic#79371) Revert "[ML] Use a new annotations index for future annotations (elastic#79151)" [ML] Use a new annotations index for future annotations (elastic#79151) [ML] Removing legacy code from ML/transform auditor (elastic#79434) Fix rate agg with custom `_doc_count` (elastic#79346) Optimize SLM Policy Queries (elastic#79341) Fix execution of exists query within nested queries on field with doc_values disabled (elastic#78841) Stricter UpdateSettingsRequest parsing on the REST layer (elastic#79227) Do not release snapshot file download permit during recovery retries (elastic#79409) Preserve request headers in a mixed version cluster (elastic#79412) Adjust versions after elastic#79044 backport to 7.x (elastic#79424) Mute BulkByScrollUsesAllScrollDocumentsAfterConflictsIntegTests (elastic#79429) Fail on SSPL licensed x-pack sources (elastic#79348) # Conflicts: # server/src/test/java/org/elasticsearch/index/TimeSeriesModeTests.java
tlrx
added a commit
to tlrx/elasticsearch
that referenced
this pull request
Oct 19, 2021
This commit adds the bug elastic#79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of elastic#79473
tlrx
added a commit
that referenced
this pull request
Nov 5, 2021
…n disk for file (#79698) In #79371 we fixed a bug where cache files were not created as sparse files on Windows platforms because the wrong options were used when creating the files for the first time. This bug got unnoticed as we were lacking a way to retrieve the exact number of bytes allocated for a given file on disk. This commit adds a FileSystemNatives.allocatedSizeInBytes(Path) method for that exact purpose (only implemented for Windows for now) and a test in CacheFileTests that would fail on Windows if the cache file is not sparse. Relates #79371
elasticsearchmachine
pushed a commit
that referenced
this pull request
Nov 5, 2021
…n disk for file (#79698) (#80426) In #79371 we fixed a bug where cache files were not created as sparse files on Windows platforms because the wrong options were used when creating the files for the first time. This bug got unnoticed as we were lacking a way to retrieve the exact number of bytes allocated for a given file on disk. This commit adds a FileSystemNatives.allocatedSizeInBytes(Path) method for that exact purpose (only implemented for Windows for now) and a test in CacheFileTests that would fail on Windows if the cache file is not sparse. Relates #79371 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
elasticsearchmachine
pushed a commit
that referenced
this pull request
Nov 5, 2021
…n disk for file (#79698) (#80427) In #79371 we fixed a bug where cache files were not created as sparse files on Windows platforms because the wrong options were used when creating the files for the first time. This bug got unnoticed as we were lacking a way to retrieve the exact number of bytes allocated for a given file on disk. This commit adds a FileSystemNatives.allocatedSizeInBytes(Path) method for that exact purpose (only implemented for Windows for now) and a test in CacheFileTests that would fail on Windows if the cache file is not sparse. Relates #79371 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Cold and Frozen tiers use a local cache for some Lucene files. This cache is based on files stored on disk, and to limit the required disk space it relies on the operating system's support for sparse files.
When Elasticsearch creates the cache files it uses the following standard options:
Sadly, those options are not enough to enable sparse files for the cache files. Instead, it should use the CREATE_NEW option, as documented in the Java NIO API: the SPARSE option is only a hint that takes effect when combined with CREATE_NEW.
The consequence today is that sparse file support is not enabled on file systems that do not enable it by default, like NTFS.
Since partially mounted indices ignore disk watermarks and report a size on disk of 0 bytes, the cache files can grow and use much more disk space than needed, potentially filling up the disk.
Operating systems that enable sparse files by default, like most Linux distributions, should not be affected.
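To make the option combination concrete, here is a minimal sketch. It is not the Elasticsearch code: the class name `SparseCreateSketch` and the helper `createSparse` are hypothetical. It creates a file with `CREATE_NEW` and `SPARSE`, then writes a single byte at a far offset. Whether the file really ends up sparse depends on the file system, since the option is only a hint.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SparseCreateSketch {

    /**
     * Hypothetical helper: creates a new file with the SPARSE hint and writes
     * one byte at the given offset. Returns the logical file size.
     * Throws FileAlreadyExistsException if the file already exists (CREATE_NEW).
     */
    static long createSparse(Path file, long offset) throws IOException {
        try (FileChannel channel = FileChannel.open(
                file,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE,
                StandardOpenOption.CREATE_NEW, // SPARSE is only honored together with CREATE_NEW
                StandardOpenOption.SPARSE)) {
            // Writing past the end of the file extends it; on a file system with
            // sparse file support the skipped range need not be allocated on disk.
            channel.write(ByteBuffer.wrap(new byte[] { 1 }), offset);
            return channel.size();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sparse-sketch");
        Path file = dir.resolve("cache-file");
        long size = createSparse(file, 1024L * 1024L);
        // The logical size is offset + 1; the allocated size on disk may be much smaller.
        System.out.println("logical size: " + size);
        Files.delete(file);
        Files.delete(dir);
    }
}
```

Note that the logical size reported by `FileChannel.size()` says nothing about how many bytes are actually allocated on disk, so this sketch alone cannot prove that a file is sparse.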
This pull request adds a created flag to the CacheFile class that indicates whether the file is expected to already exist on disk. This flag is used to create or open the file with the correct options.
This bug will be documented in a follow-up. I also plan to spend some time looking at whether we can use JNA to add assertions that the shared cache file is effectively sparse on Windows, Linux, etc.
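The flag-based approach described above could look roughly like the following sketch. This is an illustration under assumptions, not the actual CacheFile implementation: the class `CacheFileOptionsSketch`, its constructor, and the two option arrays are hypothetical, and the field is named `fileExists` only to echo the naming suggested in the review.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.OpenOption;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class CacheFileOptionsSketch {

    // Options for the first open: the file must not exist yet, and SPARSE is requested.
    private static final OpenOption[] CREATE_OPTIONS = {
        StandardOpenOption.READ,
        StandardOpenOption.WRITE,
        StandardOpenOption.CREATE_NEW,
        StandardOpenOption.SPARSE };

    // Options for later opens: the file is expected to be on disk already.
    private static final OpenOption[] OPEN_OPTIONS = {
        StandardOpenOption.READ,
        StandardOpenOption.WRITE };

    private final Path file;

    /** True if the physical file is expected to exist on disk (hypothetical field). */
    private volatile boolean fileExists;

    CacheFileOptionsSketch(Path file, boolean fileExists) {
        this.file = file;
        this.fileExists = fileExists;
    }

    /** Opens the file, creating it as a (hinted) sparse file on first use. */
    FileChannel open() throws IOException {
        FileChannel channel = FileChannel.open(file, fileExists ? OPEN_OPTIONS : CREATE_OPTIONS);
        fileExists = true; // only reached if the open or creation succeeded
        return channel;
    }
}
```

The sketch does not coordinate concurrent callers of `open()`; a real implementation would need to decide how creation and reopening are synchronized.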