Add Bulk Delete Api to BlobStore #40322
original-brownbear merged 14 commits into elastic:master from original-brownbear:40250
Conversation
Pinging @elastic/es-distributed
Jenkins run elasticsearch-ci/bwc
andrershov
left a comment
Thanks for the PR!
I've added a few comments. Also, I wonder if we need to add some kind of tests for the deleteBlobs method?
plugins/repository-s3/src/main/java/org/elasticsearch/repositories/s3/S3BlobContainer.java
Will this implementation work for both cases? It seems that {bucket} is not even used in the implementation.
Yea, the unfortunate reality here is that the current fixtures logic for bulk delete doesn't even care about the bucket and simply tries to find the given blobs in any bucket.
I wouldn't really put any effort into this tbh. I think we should probably rather look into just removing the fixture now that we have the Docker-based Minio tests and third-party tests.
The fixture seems completely redundant now ...
server/src/main/java/org/elasticsearch/common/blobstore/BlobContainer.java
server/src/main/java/org/elasticsearch/repositories/blobstore/BlobStoreRepository.java
If there is an IOException we do not proceed, even if we have more DeleteRequests to be sent.
Previously, when performing deletes, if one delete failed we would still proceed with the next delete requests.
Right, fixed this for S3 as well as the generic case now by catching and aggregating exceptions in the loop :)
We no longer know which particular indices were not removed. We just log all indices, including those that were removed successfully.
The same thing applies to the deleteBlobs usage below.
Perhaps we can add this kind of information to the IOException thrown by deleteBlobs?
Done in 9a13dadd0c1 :) We now aggregate all the exceptions. The IndexIds that are printed here contain the name and the index snapshot id, so with the information in the aggregate exception we can work out which index failed to clean up.
Do you know how AWS S3 behaves when you send a bulk request of 1000 and entry number 500 fails? Will it stop at entry 500 or will it try to delete all entries from the bulk?
Will S3 include information about each failed entry in the exception (or just the first one)?
@andrershov it will try to delete all of the entries and give errors for all the ones that failed :)
@andrershov thanks for taking a look! All points addressed and tests added in
andrershov
left a comment
@original-brownbear mostly looks good. But before approving I want to understand how S3 deals with bulk requests in terms of exceptions when several deletions have failed.
I want to be sure that S3 will try to delete all the elements from the bulk even if some of them have problems, and that information about all failed elements will be included in the exception.
* Adds Bulk delete API to blob container
* Implement bulk delete API for S3
* Adjust S3Fixture to accept both path styles for bulk deletes since the S3 SDK uses both during our ITs
* Closes #40250
@andrershov fixed the docs on
@andrershov thanks
ywelsch
left a comment
Thanks @original-brownbear. Looking very good already. I've left some minor points
class S3BlobContainer extends AbstractBlobContainer {

    /**
     * Maximum number of deletes in a {@link DeleteObjectsRequest}.
perhaps link to AWS docs here
    }

    @Override
    public void deleteBlobs(List<String> blobNames) throws IOException {
I think we should call this deleteBlobsIgnoringIfNotExists?
            "but failed to clean up its index folder.", metadata.name(), indexId), ioe);
    logger.warn(() ->
        new ParameterizedMessage(
            "[{}] indices [{}] are no longer part of any snapshots in the repository, " +
The second placeholder can be just {} instead of [{}], as it's a list that will render with brackets anyway. Same issue in other places in this PR.
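The reason `[{}]` doubles up the brackets: log4j's `ParameterizedMessage` substitutes each `{}` with the argument's `toString()`, and a Java `List` already renders itself with square brackets. A tiny stand-in formatter (hypothetical, not log4j itself) shows the effect:

```java
import java.util.Arrays;
import java.util.List;

public class PlaceholderDemo {

    // Minimal stand-in for a "{}"-style message formatter: each "{}" in the
    // pattern is replaced with the argument's toString().
    static String format(String pattern, Object arg) {
        return pattern.replace("{}", String.valueOf(arg));
    }

    public static void main(String[] args) {
        List<String> indices = Arrays.asList("idx-1", "idx-2");
        // A List renders with brackets on its own, so "[{}]" doubles them up.
        System.out.println(format("indices [{}]", indices)); // indices [[idx-1, idx-2]]
        System.out.println(format("indices {}", indices));   // indices [idx-1, idx-2]
    }
}
```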
            snapshotId, shardId, blobName), e);
    }
}
final List<String> staleBlobs = blobs.keySet().stream()
perhaps call this orphanedBlobs?
thanks @ywelsch
…leniency

* elastic/master:
  SQL: Fix deserialisation issue of TimeProcessor (elastic#40776)
  Improve GCS docs for using keystore (elastic#40605)
  Add Restore Operation to SnapshotResiliencyTests (elastic#40634)
  Small refactorings to analysis components (elastic#40745)
  SQL: Fix display size for DATE/DATETIME (elastic#40669)
  add HLRC protocol tests for transform state and stats (elastic#40766)
  Inline TransportReplAction#registerRequestHandlers (elastic#40762)
  remove experimental label from search_as_you_type documentation (elastic#40744)
  Remove some abstractions from `TransportReplicationAction` (elastic#40706)
  Upgrade to latest build scan plugin (elastic#40702)
  Use default memory lock setting in testing (elastic#40730)
  Add Bulk Delete Api to BlobStore (elastic#40322)
  Remove yaml skips older than 7.0 (elastic#40183)
  Docs: Move id in the java-api (elastic#40748)
* Implement Bulk Deletes for GCS Repository
* Just like #40322 for AWS
* We already had a bulk delete API but weren't using it from the blob container implementation, now we are using it
* Made the bulk delete API also compliant with our interface that only suppresses errors about non existent blobs by stating failed deletes (I didn't use any bulk stat action here since having to stat here should be the exception anyway and it would make error handling a lot more complex)
* Fixed bulk delete API to limit its batch size to 100 in line with GCS recommendations
Backport of #41368 (Implement Bulk Deletes for GCS Repository).
For S3 this should give us almost a 1000x speedup for deletes of huge indices/shards.
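The speedup comes from replacing one request per blob with one request per batch of up to 1000 keys, the documented S3 DeleteObjects limit mentioned in the review above. A minimal sketch of the batching arithmetic (the `partition` helper is illustrative, not the actual implementation):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedDeletes {

    // S3's DeleteObjects API accepts at most 1000 keys per request,
    // so a bulk delete is issued in batches of that size.
    static final int MAX_BULK_DELETES = 1000;

    // Split the blob names into sublists of at most MAX_BULK_DELETES each.
    static List<List<String>> partition(List<String> blobNames) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < blobNames.size(); i += MAX_BULK_DELETES) {
            batches.add(blobNames.subList(i, Math.min(i + MAX_BULK_DELETES, blobNames.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            names.add("blob-" + i);
        }
        // 2500 blobs -> 3 bulk requests instead of 2500 single deletes.
        System.out.println(partition(names).size());
    }
}
```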