Skip to content

Conversation

@manuzhang
Copy link
Member

@manuzhang manuzhang commented Feb 1, 2024

When partial progress is enabled, rewrite_data_files Spark app can succeed without any "progress" (commits). This PR adds an option partial-progress.max-failed-commits and a RuntimeException will be thrown if the number of failed commits exceed it. The default value is partital-progress.max-commits to be consistent with current behavior.

@manuzhang
Copy link
Member Author

@amogh-jahagirdar This is based on our discussion in #9400, but I'd like to go one step further. Throwing exception and fail can work with alert system easier than logging an error. WDYT?

} else if (failedCommits > maxFailedCommits) {
String errorMessage =
String.format(
"%s is true but %d rewrite commits failed. This is more than the maximum allowed failures of %d. "
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should mention this message is generated by GitHub copilot.

@manuzhang manuzhang force-pushed the partial-progress-max-failed-commits branch 2 times, most recently from 4827464 to 8faaa67 Compare February 23, 2024 15:28
@github-actions github-actions bot added the INFRA label Feb 23, 2024
@manuzhang manuzhang force-pushed the partial-progress-max-failed-commits branch from 8faaa67 to cbd5198 Compare February 24, 2024 15:16
@github-actions github-actions bot removed the INFRA label Feb 24, 2024
@manuzhang manuzhang force-pushed the partial-progress-max-failed-commits branch from cbd5198 to 0e09e09 Compare April 9, 2024 10:24
@aokolnychyi
Copy link
Contributor

Sorry for the delay. I didn't forget.

Copy link
Contributor

@aokolnychyi aokolnychyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me.

@manuzhang manuzhang force-pushed the partial-progress-max-failed-commits branch from 0e09e09 to 8069af1 Compare April 12, 2024 03:29
String.format(
"%s is true but %d rewrite commits failed. This is more than the maximum allowed failures of %d. "
+ "Check the logs to determine why the individual commits failed. If this is persistent it may help to "
+ "increase %s which will break the rewrite operation into smaller commits.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: instead of break maybe use split?

table.refresh();

List<Object[]> postRewriteData = currentData();
assertEquals("We shouldn't have changed the data", originalData, postRewriteData);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assertEquals("We shouldn't have changed the data", originalData, postRewriteData);
assertThat(postRewriteData).isEqualTo(originalData);

Copy link
Contributor

@nastra nastra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some nits that would be good to address

@manuzhang manuzhang force-pushed the partial-progress-max-failed-commits branch from 8069af1 to d13f167 Compare April 12, 2024 15:06
@aokolnychyi
Copy link
Contributor

Looks like there are some related test failures:

TestRewriteDataFilesAction > testParallelPartialProgressWithMaxFailedCommits() FAILED
    org.opentest4j.AssertionFailedError: [We shouldn't have changed the data] 

@aokolnychyi
Copy link
Contributor

I think we should continue to use assertEquals method (which is our custom assert method). It has proper value equality for Spark.

@manuzhang manuzhang force-pushed the partial-progress-max-failed-commits branch from d13f167 to 0c42809 Compare April 13, 2024 14:48
@aokolnychyi aokolnychyi merged commit 78e8204 into apache:main Apr 15, 2024
@aokolnychyi
Copy link
Contributor

Thanks, @manuzhang! Thanks for reviewing, @nastra @RussellSpitzer!

@manuzhang manuzhang deleted the partial-progress-max-failed-commits branch June 4, 2024 11:07
sasankpagolu pushed a commit to sasankpagolu/iceberg that referenced this pull request Oct 27, 2024
zachdisc pushed a commit to zachdisc/iceberg that referenced this pull request Dec 23, 2024
nastra added a commit to nastra/iceberg that referenced this pull request Mar 24, 2025
nastra added a commit to nastra/iceberg that referenced this pull request Mar 24, 2025
nastra added a commit to nastra/iceberg that referenced this pull request Mar 24, 2025
amogh-jahagirdar pushed a commit that referenced this pull request Mar 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants