Add log for ShardLockException in InternalTestCluster#13632
Add log for ShardLockException in InternalTestCluster#13632Hailong-am wants to merge 5 commits intoopensearch-project:mainfrom
Conversation
Signed-off-by: Hailong Cui <ihailong@amazon.com>
|
❌ Gradle check result for c6960c2: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
| try { | ||
| env.shardLock(id, "InternalTestCluster assert after test", TimeUnit.SECONDS.toMillis(5)).close(); | ||
| } catch (ShardLockObtainFailedException ex) { | ||
| logger.error("Obtained shard lock failed", ex); |
There was a problem hiding this comment.
The below fail function should effectively fail the tests with the needed logging already?
There was a problem hiding this comment.
fail function will log something like java.lang.AssertionError: Shard [.plugins-ml-config][0] is still locked after 5 sec waiting, it will have shard information, but lack of why it still been locked and who hold the lock? that's the main purpose of add this log. The exception will have these information.
org.opensearch.env.ShardLockObtainFailedException: [.plugins-ml-config][0]: obtaining shard lock for [InternalTestCluster assert after test] timed out after [5000ms], lock already held for [starting shard] with age [20365ms]
There was a problem hiding this comment.
Maybe improve the fail logging instead?
There was a problem hiding this comment.
sounds good, changed to improve existing fail assertion message.
gaobinlong
left a comment
There was a problem hiding this comment.
LGTM, you may merge the latest main branch into your branch to rerun the gradle checks, make sure all checks can pass.
|
❌ Gradle check result for 300ecc9: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Hailong Cui <ihailong@amazon.com>
|
❌ Gradle check result for 35df6aa: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #13632 +/- ##
============================================
+ Coverage 71.42% 71.56% +0.14%
- Complexity 59978 61158 +1180
============================================
Files 4985 5059 +74
Lines 282275 287522 +5247
Branches 40946 41646 +700
============================================
+ Hits 201603 205773 +4170
- Misses 63999 64792 +793
- Partials 16673 16957 +284 ☔ View full report in Codecov by Sentry. |
|
This PR is stalled because it has been open for 30 days with no activity. |
|
This PR is stalled because it has been open for 30 days with no activity. |
Description
log out shardLockException for InternalTestCluster, it will helps to troubleshot failed IT test cases.
Related Issues
#13628
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.