Add Restore Operation to SnapshotResiliencyTests#40634
Add Restore Operation to SnapshotResiliencyTests#40634original-brownbear merged 7 commits intoelastic:masterfrom original-brownbear:more-snapshot-resiliency-testing-2
Conversation
* Expand the successful snapshot test case to also include restoring the snapshop * Add indexing of documents as well to be able to meaningfully verify the restore * This is part of the larger effort to test eventually consistent blob stores in #39504
|
Pinging @elastic/es-distributed |
|
Jenkins run elasticsearch-ci/2 |
andrershov
left a comment
There was a problem hiding this comment.
I've left two questions.
And I wonder if we want to add "restore" operation to other tests in this class as well? For example, when emulating cluster disruption?
I also want to better understand the motivation for this PR.
Is it primarily here to be able more different kind of actions in future tests?
| } | ||
| )); | ||
| })))); | ||
| runUntil( |
There was a problem hiding this comment.
Why do we need 3 different atomic variables? Is not it enough just to have documentCountVerified?
There was a problem hiding this comment.
Right ... not sure what I was thinking :) my bad fixed in c064db3
server/src/test/java/org/elasticsearch/snapshots/SnapshotResiliencyTests.java
Outdated
Show resolved
Hide resolved
|
@andrershov The motivation for this is to basically get tests that verify that we're functional with an eventually consistent blob store (the code for that is already written in #39504) and for that we really want/need to verify that the snapshots can actually be restored. Especially in light of the fact that currently we suspect failures from eventually consistent list operations that silently corrupt the repository (#38941).
Well the other ones kinda reproduce broken snapshots and we mainly cared about the fact that these snapshots finished back when we added them since the bugs resolved at the time led to various scenarios where snapshots just got stuck forever. |
|
@original-brownbear so the idea is to extend testSuccessfulSnapshot with restore and call it testSuccessfulSnapshotAndRestore. And the next PR will add eventual consistent BlobStore emulation. So we can just run testSuccessfulSnapshotAndRestore for long enough time to make sure it does not fail with the eventual consistent store. Do I understand correctly, that currently, we expect it to fail or do we already have logic to work around eventual consistency? |
Yea, exactly.
Correct as well.
Correct as well here too. Currently the mock eventually consistent store fails yes, but I have a fix for all cases the test reproduces (tests are done too in the other PR I linked) in https://github.com/original-brownbear/elasticsearch/tree/delete-lock-via-cs (95% done, would open a PR for that once we have the tests in place). |
|
thanks @andrershov ! |
…leniency * elastic/master: SQL: Fix deserialisation issue of TimeProcessor (elastic#40776) Improve GCS docs for using keystore (elastic#40605) Add Restore Operation to SnapshotResiliencyTests (elastic#40634) Small refactorings to analysis components (elastic#40745) SQL: Fix display size for DATE/DATETIME (elastic#40669) add HLRC protocol tests for transform state and stats (elastic#40766) Inline TransportReplAction#registerRequestHandlers (elastic#40762) remove experimental label from search_as_you_type documentation (elastic#40744) Remove some abstractions from `TransportReplicationAction` (elastic#40706) Upgrade to latest build scan plugin (elastic#40702) Use default memory lock setting in testing (elastic#40730) Add Bulk Delete Api to BlobStore (elastic#40322) Remove yaml skips older than 7.0 (elastic#40183) Docs: Move id in the java-api (elastic#40748)
* Add Restore Operation to SnapshotResiliencyTests * Expand the successful snapshot test case to also include restoring the snapshop * Add indexing of documents as well to be able to meaningfully verify the restore * This is part of the larger effort to test eventually consistent blob stores in #39504
* Add Restore Operation to SnapshotResiliencyTests * Expand the successful snapshot test case to also include restoring the snapshop * Add indexing of documents as well to be able to meaningfully verify the restore * This is part of the larger effort to test eventually consistent blob stores in elastic#39504
Note: I had to add putting a mapping before indexing documents here since dynamic mapping updates trigger a blocking operation that blocks the tests. A fix for that is incoming in #39793