Skip to content

Conversation

@abmo-x
Copy link
Contributor

@abmo-x abmo-x commented Mar 19, 2024

This is fix for #9997

Root cause

s3a putObject was interrupted due to flink pipeline failure while the object was being uploaded. As this interrupt is not handled and thrown as an exception, the metadata writer assumes write was successful which results in table pointing to a metadata json file that doesn't exist.

Hadoop S3A code

This is the s3a code block which is called from stream close to putObject which is the root cause.

https://github.com/apache/hadoop/blob/0f51d2a4ec17bad754beb17048409811a151be53/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ABlockOutputStream.java#L636

try {
      putObjectResult.get();
      return size;
    } catch (InterruptedException ie) {
      LOG.warn("Interrupted object upload", ie);
      Thread.currentThread().interrupt();
      return 0;
    } catch (ExecutionException ee) {
      throw extractException("regular upload", key, ee);
    }

Iceberg commit successful on upload interrupt

2024-03-06T23:06:36.160+00:00 INFO org.apache.iceberg.hive.HiveTableOperations Committed to table iceberg.default.some_table with the new metadata location s3a://bucket/prefix/default.db/some_table/metadata/52949-1e28478a-bf5e-4ec0-8d2c-......metadata.json

But request interrupted
2024-03-06T23:06:36.024+00:00 WARN org.apache.hadoop.fs.s3a.S3ABlockOutputStream Interrupted object upload

@github-actions github-actions bot added the core label Mar 19, 2024
@abmo-x
Copy link
Contributor Author

abmo-x commented Mar 19, 2024

@rdblue
I reproduced and verified this locally with an Integration test.

I am wondering whats the best way to include a test for this. I didn't see any tests using hadoop-aws module or s3a.

cc @nastra @ajantha-bhat appreciate any inputs to test this with hadoop fs with s3a

@abmo-x abmo-x changed the title [core] fix #9997 - throw io exception on close if S3a put object interrupted [core] fix #9997 - Handle s3a file upload interrupt which results in table metadata pointing to files that don't exists Mar 19, 2024
@abmo-x abmo-x changed the title [core] fix #9997 - Handle s3a file upload interrupt which results in table metadata pointing to files that don't exists [core] fix #9997 - Handle s3a file upload interrupt which results in table metadata pointing to files that doesn't exist Mar 19, 2024
Copy link
Contributor

@stevenzwu stevenzwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM although it looks a bit hacky. let's wait for feedbacks from others @rdblue @RussellSpitzer etc.

@abmo-x
Copy link
Contributor Author

abmo-x commented Mar 20, 2024

Added a unit test

@stevenzwu if you can take a look again. Thanks

Copy link
Member

@RussellSpitzer RussellSpitzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know enough about this but it feels like a bug in the Hadoop utility methods here. Do we have a Hadoop issue tracking this as well? I don't have a problem with us patching Iceberg to work around the issue if it is this kind of otherwise silent failure.

@abmo-x
Copy link
Contributor Author

abmo-x commented Mar 20, 2024

@RussellSpitzer
Thanks for the review.

waiting for access to create a issue in the hadoop project.

Looks like the hadoop projects uses this pattern widely and they check for interrupts everywhere https://github.com/search?q=repo%3Aapache%2Fhadoop%20Thread.interrupted&type=code. Not sure why interrupt is preferred vs throwing an exception.

PositionOutputStream wrap = HadoopStreams.wrap(fsDataOutputStream);

// mock interrupt in S3ABlockOutputStream#putObject
Thread.currentThread().interrupt();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this test the same scenario. ideally the mock object's close method can bloc/sleep. then the interruption should come from another thread.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does check the scenario, if I remove the fix this test fails. This works as the interrupt is set in the same thread in https://github.com/apache/hadoop/blob/0f51d2a4ec17bad754beb17048409811a151be53/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ABlockOutputStream.java#L635 as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understood that test fails if fix is removed. but the scenario is not exactly the same, interrupt typically wasn't interrupted from the same thread. usually another thread is trying to interrupt a thread.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes but do we have to mock another thread interrupting current thread here if we are able to set the interrupt and mimic the behavior.

I can start another thread in the test [S3ABlockOutputStream.java]'s close method and then interrupt it to set the interrupted flag but I feel that is not needed but nice to have as eventually all this will do is do what line #38 here is doing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer that we change the test to interrupt the close from another thread. the close can sleep/block for 10 mins. interrupt should cancel the the close action and the assertion is that close takes less than 10 mins

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

@nastra nastra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall LGTM, but I think the error msg could be improved

@abmo-x
Copy link
Contributor Author

abmo-x commented Mar 22, 2024

Thanks @nastra @stevenzwu @RussellSpitzer for the review. Addressed all comments, can we merge this?


long endTime = System.currentTimeMillis();
long closeDuration = endTime - startTime;
Assert.assertTrue(closeDuration < 30 * 1000 && closeDuration > 1000);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I suggested this earlier. seems like the error msg check might be enough to verify the interrupt behavior. I am ok to remove the time check, which might cause flakiness.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, removed

@stevenzwu stevenzwu merged commit 6d6fd0b into apache:main Mar 29, 2024
nk1506 pushed a commit to nk1506/iceberg that referenced this pull request Apr 2, 2024
…ts in table metadata pointing to files that doesn't exist (apache#9998)

Co-authored-by: Abid Mohammed <abid_mohammed@apple.com>
sasankpagolu pushed a commit to sasankpagolu/iceberg that referenced this pull request Oct 27, 2024
…ts in table metadata pointing to files that doesn't exist (apache#9998)

Co-authored-by: Abid Mohammed <abid_mohammed@apple.com>
zachdisc pushed a commit to zachdisc/iceberg that referenced this pull request Dec 23, 2024
…ts in table metadata pointing to files that doesn't exist (apache#9998)

Co-authored-by: Abid Mohammed <abid_mohammed@apple.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants