HotToWarmTieringService changes to tier shards#14891

Closed
neetikasinghal wants to merge 1 commit into opensearch-project:main from neetikasinghal:tiering-service

@neetikasinghal (Contributor) commented Jul 23, 2024

Description

Related Issues

#14545
#13980

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Contributor

❌ Gradle check result for 2e9f80d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Contributor

❌ Gradle check result for 052d551: FAILURE

@github-actions
Contributor

❌ Gradle check result for 0813dac: FAILURE

@neetikasinghal added the backport 2.x (Backport to 2.x branch) and v2.16.0 (Issues and PRs related to version 2.16.0 release) labels Jul 23, 2024
@github-actions
Contributor

✅ Gradle check result for 8975e97: SUCCESS

@neetikasinghal force-pushed the tiering-service branch 2 times, most recently from d7bffe7 to 4774394 on July 26, 2024 23:39
@github-actions
Contributor

✅ Gradle check result for d7bffe7: SUCCESS

@github-actions
Contributor

❌ Gradle check result for 4774394: FAILURE

Contributor

@jed326 left a comment

Needs a changelog entry but otherwise LGTM

@ExperimentalApi
public class TieringRequestContext {
    private final ActionListener<HotToWarmTieringResponse> actionListener;
    private final Map<Index, IndexTieringInfo> indexTieringStatusMap;
Contributor

How about keeping two sets, for accepted and completed indices, and one map for failedIndices? That way each index lives in the data structure for its state, and you don't have to filter every time you need the indices in a specific state.

Contributor Author

@neetikasinghal Aug 7, 2024

I thought about the above as well. However, the current approach is cleaner in that entries need not be moved from the accepted set to the completed/failed indices; each entry is in only one tiering state (keeping a single source of truth).
Also, if we later plan to extend the tiering states, TieringRequestContext is easily extensible to different states of tiering. If another transition state is introduced for another type of tiering, we would need to introduce another set, whereas in the current approach we just need to add a state to IndexTieringState.
I would like to keep it as is unless you have a strong opinion here.

Contributor

> However, the current approach is cleaner in that entries need not be moved from the accepted set to the completed/failed indices; each entry is in only one tiering state (keeping a single source of truth)

I don't see an issue with moving from one set to another. Each data structure provides an easy way to get the indices in that state, which is what most of the calls from H2WTieringService do, hence the suggestion. We still have a single source of truth: the TieringRequestContext object, which encapsulates these different data structures to maintain indices in different states instead of a single map.

> Also, if we later plan to extend the tiering states, TieringRequestContext is easily extensible to different states of tiering. If another transition state is introduced for another type of tiering, we would need to introduce another set, whereas in the current approach we just need to add a state to IndexTieringState

Agree on this, but I don't see any other tiering state at the moment. Also, TieringRequestContext is tied to HotToWarmMigration, so if we have to reuse it or introduce a new state for a different tiering type, refactoring will be needed anyway. We can always think about a better mechanism once the use cases for other tiering types are known.

Contributor Author

@neetikasinghal Aug 7, 2024

> > However, the current approach is cleaner in that entries need not be moved from the accepted set to the completed/failed indices; each entry is in only one tiering state (keeping a single source of truth)
>
> I don't see an issue with moving from one set to another. Each data structure provides an easy way to get the indices in that state, which is what most of the calls from H2WTieringService do, hence the suggestion. We still have a single source of truth: the TieringRequestContext object, which encapsulates these different data structures to maintain indices in different states instead of a single map.

I agree that for a given request we have a single source of truth, which is TieringRequestContext. However, to figure out the state of an index (accepted/completed/failed), we would have different sources of truth.
Given that only a limited number of indices would undergo tiering at a given time, I see the filtering operation as effectively constant time. What other concern do you see with the current implementation?
Also, with the sets approach we would need 3 sets and one map here (accepted, successful, completed, failed), compared to the single map maintained in the current implementation.

> > Also, if we later plan to extend the tiering states, TieringRequestContext is easily extensible to different states of tiering. If another transition state is introduced for another type of tiering, we would need to introduce another set, whereas in the current approach we just need to add a state to IndexTieringState
>
> Agree on this, but I don't see any other tiering state at the moment. Also, TieringRequestContext is tied to HotToWarmMigration, so if we have to reuse it or introduce a new state for a different tiering type, refactoring will be needed anyway. We can always think about a better mechanism once the use cases for other tiering types are known.

Makes sense.
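
For readers following the thread, the two layouts under discussion can be sketched side by side. This is only an illustration, not the PR's code: `SketchState`, `SingleMapContext`, and `PerStateContext` are hypothetical names, indices are plain strings instead of `Index`, and the state names other than the accepted/completed/failed grouping mentioned above are assumptions.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in for the PR's IndexTieringState.
enum SketchState { ACCEPTED, TIERED, FAILED }

// Option A (the shape the PR keeps): one map, each index carries its own state.
class SingleMapContext {
    private final Map<String, SketchState> states = new HashMap<>();

    void accept(String index) { states.put(index, SketchState.ACCEPTED); }

    void mark(String index, SketchState state) { states.put(index, state); }

    // Every lookup by state scans the whole map (linear in the number of indices).
    Set<String> indicesIn(SketchState state) {
        return states.entrySet().stream()
            .filter(e -> e.getValue() == state)
            .map(Map.Entry::getKey)
            .collect(Collectors.toSet());
    }
}

// Option B (the reviewer's suggestion): one collection per state.
class PerStateContext {
    private final Set<String> accepted = new HashSet<>();
    private final Set<String> completed = new HashSet<>();
    private final Map<String, String> failed = new HashMap<>(); // index -> failure reason

    void accept(String index) { accepted.add(index); }

    // A transition moves the entry between collections.
    void complete(String index) { if (accepted.remove(index)) completed.add(index); }

    void fail(String index, String reason) { if (accepted.remove(index)) failed.put(index, reason); }

    Set<String> accepted() { return accepted; }
    Set<String> completed() { return completed; }
    Map<String, String> failed() { return failed; }
}
```

In option A, adding a state only touches the enum, at the cost of a scan per lookup. In option B, lookups are direct, but every transition has to move an entry and every new state needs a new collection.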

@github-actions
Contributor

github-actions bot commented Aug 7, 2024

✅ Gradle check result for 7da1aa6: SUCCESS

@codecov

codecov bot commented Aug 7, 2024

Codecov Report

❌ Patch coverage is 27.80488% with 148 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.86%. Comparing base (97c1bf0) to head (d99f55f).
⚠️ Report is 1888 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...earch/indices/tiering/HotToWarmTieringService.java | 20.58% | 106 Missing and 2 partials |
| ...n/admin/indices/tiering/TieringRequestContext.java | 0.00% | 24 Missing |
| ...dices/tiering/TransportHotToWarmTieringAction.java | 46.15% | 7 Missing |
| ...ices/tiering/TieringUpdateClusterStateRequest.java | 0.00% | 6 Missing |
| ...rch/action/admin/indices/tiering/TieringUtils.java | 83.33% | 1 Missing |
| ...org/opensearch/cluster/metadata/IndexMetadata.java | 66.66% | 1 Missing |
| ...earch/indices/tiering/TieringRequestValidator.java | 0.00% | 1 Missing |

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #14891      +/-   ##
============================================
+ Coverage     71.74%   71.86%   +0.12%     
- Complexity    62904    62963      +59     
============================================
  Files          5178     5182       +4     
  Lines        295167   295359     +192     
  Branches      42679    42701      +22     
============================================
+ Hits         211774   212268     +494     
+ Misses        66011    65663     -348     
- Partials      17382    17428      +46     

Contributor

@sohami left a comment

Can we also look into increasing the code coverage? It seems pretty low right now. Let's aim to keep it above 80%.

final TieringUpdateClusterStateRequest updateClusterStateRequest = new TieringUpdateClusterStateRequest(
    tieringValidationResult.getRejectedIndices(),
    request.waitForCompletion()
).ackTimeout(request.timeout())
Contributor

Where is this ackTimeout used?

Contributor Author

Comment on lines +122 to +126
    }

    public void markTiered() {
        this.state = IndexTieringState.TIERED;
    }
Contributor

This looks like a state machine. Do we need to validate that the transitions happen in the correct sequence?
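
One way to act on this question is to have each state declare which states may follow it and to route every `markX()` call through a single guarded method. The sketch below is an assumption about how that could look, not the PR's implementation: only `TIERED` appears in the snippet under review, so `PENDING`, `IN_PROGRESS`, `FAILED`, and the class name `TieringInfoSketch` are hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical states; only TIERED is taken from the code under review.
enum TieringState {
    PENDING, IN_PROGRESS, TIERED, FAILED;

    // States that may legally follow this one.
    Set<TieringState> next() {
        switch (this) {
            case PENDING:
                return EnumSet.of(IN_PROGRESS, FAILED);
            case IN_PROGRESS:
                return EnumSet.of(TIERED, FAILED);
            default:
                // TIERED and FAILED are treated as terminal in this sketch.
                return EnumSet.noneOf(TieringState.class);
        }
    }
}

class TieringInfoSketch {
    private TieringState state = TieringState.PENDING;

    TieringState state() { return state; }

    void markInProgress() { transitionTo(TieringState.IN_PROGRESS); }

    void markTiered() { transitionTo(TieringState.TIERED); }

    void markFailed() { transitionTo(TieringState.FAILED); }

    // Single choke point: rejects any transition the current state does not allow.
    private void transitionTo(TieringState target) {
        if (!state.next().contains(target)) {
            throw new IllegalStateException("Illegal tiering transition: " + state + " -> " + target);
        }
        state = target;
    }
}
```

With this shape, calling `markTiered()` on an index that never went through `markInProgress()` fails fast instead of silently overwriting the state.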

Signed-off-by: Neetika Singhal <neetiks@amazon.com>
@github-actions
Contributor

github-actions bot commented Aug 8, 2024

✅ Gradle check result for d99f55f: SUCCESS

void processTieringRequestContexts(final ClusterState clusterState) {
    final Map<Index, TieringRequestContext> tieredIndices = new HashMap<>();
    for (TieringRequestContext tieringRequestContext : tieringRequestContexts) {
        if (tieringRequestContext.isRequestProcessingComplete()) {
Contributor Author

This check handles the cases where indices have failed as part of the request.
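
A rough sketch of why such a check covers failures: if "complete" means "no index is still in progress", a request whose indices have all failed is complete even though nothing was tiered. The excerpt does not show what the real method does inside the `if`, so the removal step below, the class names `RequestContextSketch` and `ContextSweeper`, and the `IN_PROGRESS` state are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for TieringRequestContext.
class RequestContextSketch {
    enum State { IN_PROGRESS, TIERED, FAILED }

    private final Map<String, State> states = new LinkedHashMap<>();

    RequestContextSketch(String... indices) {
        for (String index : indices) {
            states.put(index, State.IN_PROGRESS);
        }
    }

    void mark(String index, State state) { states.put(index, state); }

    // Complete once nothing is in progress, which includes the all-failed case.
    boolean isRequestProcessingComplete() {
        return !states.containsValue(State.IN_PROGRESS);
    }
}

class ContextSweeper {
    // Assumed behavior: finished contexts are removed from the pending list and returned.
    static List<RequestContextSketch> sweep(List<RequestContextSketch> pending) {
        List<RequestContextSketch> done = new ArrayList<>();
        for (Iterator<RequestContextSketch> it = pending.iterator(); it.hasNext();) {
            RequestContextSketch context = it.next();
            if (context.isRequestProcessingComplete()) {
                done.add(context);
                it.remove();
            }
        }
        return done;
    }
}
```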

@opensearch-trigger-bot
Contributor

This PR is stalled because it has been open for 30 days with no activity.

@opensearch-trigger-bot bot added and removed the stalled (Issues that have stalled) label Sep 9, 2024
@opensearch-trigger-bot
Contributor

This PR is stalled because it has been open for 30 days with no activity.

@opensearch-trigger-bot bot added the stalled (Issues that have stalled) label Oct 13, 2024
@dbwiddis
Member

@neetikasinghal Are you still working on this? Looks like we need a few merge conflicts addressed. Otherwise is it ready for review?

@opensearch-trigger-bot bot removed the stalled (Issues that have stalled) label Oct 20, 2024
@andrross
Member

andrross commented Mar 5, 2026

Closing due to age. @neetikasinghal please let me know if you intend to continue on this.

@andrross closed this Mar 5, 2026
Labels

backport 2.x Backport to 2.x branch backport 2.16 release v2.16.0 Issues and PRs related to version 2.16.0

7 participants