Skip to content

[BUG] S3 Source stops on S3 error #1544

@dlvenable

Description

@dlvenable

Describe the bug

When a failure occurred reading from an S3 object in the s3 source, Data Prepper stopped processing the SQS queue and stopped reading S3 objects.

To Reproduce
Steps to reproduce the behavior:

  1. Create an IAM role that gives permission to an SQS queue, but not the S3 bucket
  2. Configure Data Prepper to use that role and read from the SQS queue
  3. Run Data Prepper
  4. You see an exception and then Data Prepper stops

Expected behavior

Data Prepper should log the exception as it can be useful to the operator. However, it should continue to process from the SQS queue. This will be important when transient issues occur.

Additionally, Data Prepper probably should pause shortly after any error rather than immediately retry.

Additional context

Below is the sample stack trace.

2022-06-27T14:52:31,322 [Thread-1] ERROR com.amazon.dataprepper.plugins.source.S3ObjectWorker - Error reading from S3 object: s3ObjectReference=[bucketName=***, key=***].
software.amazon.awssdk.services.s3.model.S3Exception: Access Denied (Service: S3, Status Code: 403, Request ID: FURZ9T6YXPTW27BV, Extended Request ID: YGQ9eZoetO8IXMtzIJC69PtycUwwvfGe9/fRXKNDP3W4MVF53qiit1V8+LmYoQnjdtMV8ToXba4=)
	at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handleErrorResponse(CombinedResponseHandler.java:125) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handleResponse(CombinedResponseHandler.java:82) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handle(CombinedResponseHandler.java:60) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handle(CombinedResponseHandler.java:41) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:40) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:30) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:73) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:78) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:40) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:50) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:36) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:81) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:48) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:31) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:193) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:167) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$0(BaseSyncClientHandler.java:68) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:175) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:62) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:52) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:63) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.services.s3.DefaultS3Client.getObject(DefaultS3Client.java:4483) ~[data-prepper.jar:1.5.0]
	at software.amazon.awssdk.services.s3.S3Client.getObject(S3Client.java:7889) ~[data-prepper.jar:1.5.0]
	at com.amazon.dataprepper.plugins.source.S3ObjectWorker.doParseObject(S3ObjectWorker.java:100) ~[data-prepper.jar:1.5.0]
	at com.amazon.dataprepper.plugins.source.S3ObjectWorker.lambda$parseS3Object$0(S3ObjectWorker.java:84) ~[data-prepper.jar:1.5.0]
	at io.micrometer.core.instrument.composite.CompositeTimer.recordCallable(CompositeTimer.java:77) ~[data-prepper.jar:1.5.0]
	at com.amazon.dataprepper.plugins.source.S3ObjectWorker.parseS3Object(S3ObjectWorker.java:83) ~[data-prepper.jar:1.5.0]
	at com.amazon.dataprepper.plugins.source.S3Service.addS3Object(S3Service.java:53) ~[data-prepper.jar:1.5.0]
	at com.amazon.dataprepper.plugins.source.SqsWorker.processS3Objects(SqsWorker.java:155) ~[data-prepper.jar:1.5.0]
	at com.amazon.dataprepper.plugins.source.SqsWorker.processSqsMessages(SqsWorker.java:95) ~[data-prepper.jar:1.5.0]
	at com.amazon.dataprepper.plugins.source.SqsWorker.run(SqsWorker.java:72) ~[data-prepper.jar:1.5.0]
	at java.lang.Thread.run(Thread.java:832) ~[?:?]
Exception in thread "Thread-1" software.amazon.awssdk.services.s3.model.S3Exception: Access Denied (Service: S3, Status Code: 403, Request ID: YGR59TQYXBQW2W1H, Extended Request ID: 4kQ5eeC6tO8IXMtzIJC6X9RIcRw7ZgGe6/fZW8NDI3W0NVN53qiit1V8+QmZ4QPjdtnV9ToxB14=)
	at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handleErrorResponse(CombinedResponseHandler.java:125)
	at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handleResponse(CombinedResponseHandler.java:82)
	at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handle(CombinedResponseHandler.java:60)
	at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handle(CombinedResponseHandler.java:41)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:40)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:30)
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:73)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:78)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:40)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:50)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:36)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:81)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36)
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
	at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56)
	at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:48)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:31)
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
	at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:193)
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103)
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:167)
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$0(BaseSyncClientHandler.java:68)
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:175)
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:62)
	at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:52)
	at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:63)
	at software.amazon.awssdk.services.s3.DefaultS3Client.getObject(DefaultS3Client.java:4483)
	at software.amazon.awssdk.services.s3.S3Client.getObject(S3Client.java:7889)
	at com.amazon.dataprepper.plugins.source.S3ObjectWorker.doParseObject(S3ObjectWorker.java:100)
	at com.amazon.dataprepper.plugins.source.S3ObjectWorker.lambda$parseS3Object$0(S3ObjectWorker.java:84)
	at io.micrometer.core.instrument.composite.CompositeTimer.recordCallable(CompositeTimer.java:77)
	at com.amazon.dataprepper.plugins.source.S3ObjectWorker.parseS3Object(S3ObjectWorker.java:83)
	at com.amazon.dataprepper.plugins.source.S3Service.addS3Object(S3Service.java:53)
	at com.amazon.dataprepper.plugins.source.SqsWorker.processS3Objects(SqsWorker.java:155)
	at com.amazon.dataprepper.plugins.source.SqsWorker.processSqsMessages(SqsWorker.java:95)
	at com.amazon.dataprepper.plugins.source.SqsWorker.run(SqsWorker.java:72)
	at java.base/java.lang.Thread.run(Thread.java:832)

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    Status

    Done

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions