shouldRollGeneration should execute under read lock#41696
shouldRollGeneration should execute under read lock#41696dnhatn merged 4 commits intoelastic:masterfrom
Conversation
|
Pinging @elastic/es-distributed |
henningandersen
left a comment
There was a problem hiding this comment.
LGTM.
I you know ways in which this could manifest itself, it would be nice to include into the commit message. I think this should in reality never show, but is easier to reason about with the lock check included.
|
I am not sure, is this actually a problem? There are issues here which this doesn't protect against. Namely the check and roll are not atomic, so it could be that a thread checks if a roll is needed, and then a roll occurs, and then this thread returns back and performs a roll itself. This change doesn't protect against that. |
|
@henningandersen @jasontedor Thanks for looking. I spotted this while I was investigating a case where more than 140 empty translog files in a single shard. For sure this is not the reason for those empty translog files. That shard has more than 280 primary failovers, and we roll a new translog on each primary promotion. Although this should never happen in practice, I think this method should be called with lock. I marked this PR as a non-issue. |
|
I agree it's more correct and should be integrated, my point though is that I don't think it matters in practice yet it still leaves behind real issues. 🤷♀ |
|
@henningandersen @jasontedor Thanks for reviewing! |
run elasticsearch-ci/1 |
Translog#shouldRollGeneration should execute under the read lock since it accesses the current writer.
* elastic/master: (84 commits) [ML] adds geo_centroid aggregation support to data frames (elastic#42088) Add documentation for calendar/fixed intervals (elastic#41919) Remove global checkpoint assertion in peer recovery (elastic#41987) Don't create tempdir for cli scripts (elastic#41913) Fix debian-8 update (elastic#42056) Cleanup plugin bin directories (elastic#41907) Prevent order being lost for _nodes API filters (elastic#42045) Change IndexAnalyzers default analyzer access (elastic#42011) Remove reference to fs.data.spins in docs Mute failing AsyncTwoPhaseIndexerTests Remove close method in PageCacheRecycler/Recycler (elastic#41917) [ML] adding pivot.max_search_page_size option for setting paging size (elastic#41920) Docs: Tweak list formatting Simplify handling of keyword field normalizers (elastic#42002) [ML] properly nesting objects in document source (elastic#41901) Remove extra `ms` from log message (elastic#42068) Increase the sample space for random inner hits name generator (elastic#42057) Recognise direct buffers in heap size docs (elastic#42070) shouldRollGeneration should execute under read lock (elastic#41696) Wait for active shard after close in mixed cluster (elastic#42029) ...
Translog#shouldRollGeneration should execute under the read lock since it accesses the current writer.
Translog#shouldRollGeneration should execute under the read lock for it accesses the current writer.