[HUDI-1982] Remove unnecessary synchronization #3041
chaplinthink wants to merge 1 commit into apache:master
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## master #3041 +/- ##
============================================
+ Coverage 49.86% 55.14% +5.28%
- Complexity 3527 3865 +338
============================================
Files 488 488
Lines 23618 23616 -2
Branches 2528 2528
============================================
+ Hits 11777 13023 +1246
+ Misses 10802 9434 -1368
- Partials 1039 1159 +120
String lastKnownInstantFromClient =
    ctx.queryParam(RemoteHoodieTableFileSystemView.LAST_INSTANT_TS, HoodieTimeline.INVALID_INSTANT_TS);
SyncableFileSystemView view = viewManager.getFileSystemView(basePath);
synchronized (view) {
Hi @chaplinthink, I think the synchronization is necessary to sync the view locally, since the handler serves requests from different clients concurrently. cc @bvaradar
@leesf I mean that the implementation of view.sync() already takes a WriteLock, which handles concurrent requests from multiple clients:
try {
  writeLock.lock();
  runSync(oldTimeline, newTimeline);
} finally {
  writeLock.unlock();
}
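To make the claim about the write lock concrete, here is a minimal standalone sketch. It is not Hudi code: `WriteLockDemo`, its counters, and `run` are hypothetical names invented for illustration. It only shows that a `ReentrantReadWriteLock` write lock admits one caller at a time, so a plain `int` updated under it loses no increments.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a view guarded by a write lock; not Hudi's class.
class WriteLockDemo {
  private final ReentrantReadWriteLock.WriteLock writeLock =
      new ReentrantReadWriteLock().writeLock();
  private final AtomicInteger inside = new AtomicInteger();    // threads currently in the critical section
  private final AtomicInteger maxInside = new AtomicInteger(); // highest value of `inside` ever observed
  private int syncCount = 0;                                   // guarded by writeLock

  void sync() {
    writeLock.lock();
    try {
      maxInside.accumulateAndGet(inside.incrementAndGet(), Math::max);
      syncCount++; // plain int: only safe because the lock is held
      inside.decrementAndGet();
    } finally {
      writeLock.unlock();
    }
  }

  int syncCount() {
    writeLock.lock();
    try {
      return syncCount;
    } finally {
      writeLock.unlock();
    }
  }

  // Starts `threads` workers together; each calls sync() `callsPerThread` times.
  // Returns {total completed syncs, max threads seen inside the lock at once}.
  static int[] run(int threads, int callsPerThread) {
    WriteLockDemo view = new WriteLockDemo();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    CountDownLatch start = new CountDownLatch(1);
    for (int t = 0; t < threads; t++) {
      pool.submit(() -> {
        try {
          start.await();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        for (int i = 0; i < callsPerThread; i++) {
          view.sync();
        }
      });
    }
    start.countDown();
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return new int[] {view.syncCount(), view.maxInside.get()};
  }
}
```

Calling `WriteLockDemo.run(8, 1000)` should report 8000 completed syncs and never more than one thread inside the critical section. This covers only the body executed under the lock; anything read before `lock()` is outside that guarantee.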
@chaplinthink Thanks for the explanation, makes sense to me. @vinothchandar @bvaradar do you have any other concerns?
I was mulling over the reloading of the timeline that happens before the write lock is taken.
@Override
public void sync() {
  HoodieTimeline oldTimeline = getTimeline();
  HoodieTimeline newTimeline = metaClient.reloadActiveTimeline().filterCompletedAndCompactionInstants();
  try {
    writeLock.lock();
    runSync(oldTimeline, newTimeline);
  } finally {
    writeLock.unlock();
  }
}
runSync() can actually init/reassign metaClient, so in theory removing synchronized could make the sync non-serializable.
I would suggest that we either move the timeline reload inside the write lock or leave this as-is. Whatever we change, we need to validate it with more concurrent testing, so I am not sure this is worth the trouble.
Are you hitting real concurrency bottlenecks around this?
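The hazard described above can be sketched in isolation. The following is a generic illustration under stated assumptions, not a reproduction of Hudi's behavior: `StaleReloadDemo`, `source`, `applied`, and the replay methods are hypothetical, and a single `long` version stands in for the timeline. It replays one interleaving on a single thread to show that a snapshot read before the lock can be applied after a newer one.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model: `source` plays the timeline on storage, `applied` the view's state.
class StaleReloadDemo {
  private final AtomicLong source = new AtomicLong();
  private final ReentrantReadWriteLock.WriteLock writeLock =
      new ReentrantReadWriteLock().writeLock();
  private long applied = 0; // guarded by writeLock

  void commit() { source.incrementAndGet(); } // a writer completes a new instant
  long reload() { return source.get(); }      // stands in for reloading the timeline

  // Variant 1: the snapshot was read by the caller BEFORE taking the lock.
  void applySnapshot(long snapshot) {
    writeLock.lock();
    try {
      applied = snapshot;
    } finally {
      writeLock.unlock();
    }
  }

  // Variant 2: reload and apply happen together while holding the lock.
  void syncReloadInsideLock() {
    writeLock.lock();
    try {
      applied = reload();
    } finally {
      writeLock.unlock();
    }
  }

  long applied() {
    writeLock.lock();
    try {
      return applied;
    } finally {
      writeLock.unlock();
    }
  }

  // Replays: A reloads, a commit lands, B reloads and applies, then A applies.
  // Returns {view after B, view after A}.
  static long[] replayReloadOutsideLock() {
    StaleReloadDemo v = new StaleReloadDemo();
    v.commit();            // source = 1
    long a = v.reload();   // request A sees 1, then is delayed before the lock
    v.commit();            // source = 2
    long b = v.reload();   // request B sees 2
    v.applySnapshot(b);    // view = 2
    long afterB = v.applied();
    v.applySnapshot(a);    // view goes back to 1
    return new long[] {afterB, v.applied()};
  }

  // Same two requests, but each reloads while holding the lock.
  // Returns {view after A, view after B}.
  static long[] replayReloadInsideLock() {
    StaleReloadDemo v = new StaleReloadDemo();
    v.commit();
    v.syncReloadInsideLock(); // request A: view = 1
    long afterA = v.applied();
    v.commit();
    v.syncReloadInsideLock(); // request B: view = 2
    return new long[] {afterA, v.applied()};
  }
}
```

In this model the first replay ends with the view at version 1 even though version 2 was already applied, while the second replay can only move forward. Whether the real `runSync()` tolerates a stale `newTimeline` depends on its implementation, which this sketch does not model.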
Thanks a lot for the reply. Do you mean that synchronized is there to protect HoodieTimeline newTimeline = metaClient.reloadActiveTimeline().filterCompletedAndCompactionInstants(); against concurrent calls?
In that case, we could also do this, right?
@Override
public void sync() {
  HoodieTimeline oldTimeline = getTimeline();
  try {
    writeLock.lock();
    HoodieTimeline newTimeline = metaClient.reloadActiveTimeline().filterCompletedAndCompactionInstants();
    runSync(oldTimeline, newTimeline);
  } finally {
    writeLock.unlock();
  }
}
I was confused to see that the code uses synchronized and writeLock at the same time.
I agree that this should be validated with more concurrent testing. So far I have not encountered any concurrency bottlenecks.
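As a rough idea of what such concurrent testing could look like, here is a small stress sketch. It assumes the reload-inside-lock shape from the snippet above and uses hypothetical names throughout (`SyncStressSketch`, `source`, `applied`, `countRegressions`); it does not exercise Hudi classes. One thread keeps advancing a version counter while several threads call `sync()`, and the sketch counts how often the applied version moves backwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of a view whose sync() reloads inside the write lock.
class SyncStressSketch {
  private final AtomicLong source = new AtomicLong(); // stands in for the timeline on storage
  private final ReentrantReadWriteLock.WriteLock writeLock =
      new ReentrantReadWriteLock().writeLock();
  private long applied = 0;     // guarded by writeLock
  private long regressions = 0; // guarded by writeLock

  void sync() {
    writeLock.lock();
    try {
      long latest = source.get(); // reload happens while holding the lock
      if (latest < applied) {
        regressions++;            // the view would have moved backwards
      }
      applied = latest;
    } finally {
      writeLock.unlock();
    }
  }

  // One writer advances `source` while `syncThreads` workers call sync().
  // Returns how many times the applied version decreased.
  static long countRegressions(int syncThreads, int callsPerThread) {
    SyncStressSketch view = new SyncStressSketch();
    ExecutorService pool = Executors.newFixedThreadPool(syncThreads + 1);
    List<Future<?>> futures = new ArrayList<>();
    futures.add(pool.submit(() -> {
      for (int i = 0; i < callsPerThread; i++) {
        view.source.incrementAndGet();
      }
    }));
    for (int t = 0; t < syncThreads; t++) {
      futures.add(pool.submit(() -> {
        for (int i = 0; i < callsPerThread; i++) {
          view.sync();
        }
      }));
    }
    try {
      for (Future<?> f : futures) {
        f.get(30, TimeUnit.SECONDS);
      }
    } catch (InterruptedException | ExecutionException | TimeoutException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdownNow();
    }
    view.writeLock.lock();
    try {
      return view.regressions;
    } finally {
      view.writeLock.unlock();
    }
  }
}
```

With the reload inside the lock and a counter that only increases, `SyncStressSketch.countRegressions(4, 2000)` should return 0. A test against the real view would need an invariant of its own, for example that the last instant served to clients never moves backwards.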
@yihua: As you have fixed this issue in master, this PR can be closed, right?
Yes. @chaplinthink #8079 has simplified the synchronization with the fix to
What is the purpose of the pull request
The synchronized block is not necessary, because the sync operation already takes a WriteLock to ensure synchronization.