Skip to content

Conversation

@zuston
Copy link
Member

@zuston zuston commented Nov 18, 2022

What changes were proposed in this pull request?

  1. Introduce the data length check in LocalOrder strategy
  2. Add tests about keeping consistent with FixedSize Strategy when no skew optimization

Why are the changes needed?

More stability

Does this PR introduce any user-facing change?

No

How was this patch tested?

UTs

@zuston
Copy link
Member Author

zuston commented Nov 18, 2022

PTAL @jerqi

// If ShuffleServer is flushing the file at this time, the length in the index file record may be greater
// than the length in the actual data file, and it needs to be returned at this time to avoid EOFException
if (dataFileLen != -1 && totalLen >= dataFileLen) {
if (dataFileLen != -1 && totalLen > dataFileLen) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we remove the =?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. we need. This has been discussed in #320

@codecov-commenter
Copy link

codecov-commenter commented Nov 18, 2022

Codecov Report

Merging #336 (14c50c6) into master (2df7594) will increase coverage by 0.06%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##             master     #336      +/-   ##
============================================
+ Coverage     61.31%   61.37%   +0.06%     
- Complexity     1526     1529       +3     
============================================
  Files           186      186              
  Lines          9438     9445       +7     
  Branches        924      924              
============================================
+ Hits           5787     5797      +10     
+ Misses         3341     3340       -1     
+ Partials        310      308       -2     
Impacted Files Coverage Δ
...iffle/common/segment/FixedSizeSegmentSplitter.java 97.77% <100.00%> (+2.22%) ⬆️
...ffle/common/segment/LocalOrderSegmentSplitter.java 94.11% <100.00%> (+3.49%) ⬆️
...e/spark/shuffle/reader/RssShuffleDataIterator.java 90.90% <0.00%> (+0.36%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@zuston zuston requested a review from jerqi November 18, 2022 05:43
Copy link
Contributor

@jerqi jerqi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @zuston

@jerqi jerqi merged commit 1aa3661 into apache:master Nov 18, 2022
@jerqi
Copy link
Contributor

jerqi commented Nov 21, 2022

There are conflicts between LOCAL_ORDER and getInMemoryShuffleData. getInMemoryShuffleData need that we don't change the order. Because we use lastBlockId to read data.

@zuston
Copy link
Member Author

zuston commented Nov 22, 2022

There are conflicts between LOCAL_ORDER and getInMemoryShuffleData. getInMemoryShuffleData need that we don't change the order. Because we use lastBlockId to read data.

You are right. I will fix it by creating a reference copy to avoid breaking down original sequence.

@zuston
Copy link
Member Author

zuston commented Nov 22, 2022

There are conflicts between LOCAL_ORDER and getInMemoryShuffleData. getInMemoryShuffleData need that we don't change the order. Because we use lastBlockId to read data.

You are right. I will fix it by creating a reference copy to avoid breaking down original sequence.

Fixed #336

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants