CCS: don't proxy requests for already connected node #31273
Merged
javanna merged 4 commits into elastic:master on Jun 13, 2018
Cross-cluster search selects a subset of nodes for each remote cluster and sends requests only to them. These nodes act as proxies and redirect each request to the target node that holds the relevant data. Today, every request to a remote cluster is sent to the next node in the proxy list (in round-robin fashion), regardless of whether the target node is already among the nodes we are connected to. For instance, if we need to send a shard search request to a data node that is also one of the selected proxy nodes, we may end up sending the request to it through one of the other proxy nodes. This commit optimizes this case: whenever we are already connected to a remote node, we send the request to it directly rather than through the next proxy node. A side effect is that round-robin becomes slightly unbalanced, as data nodes that are also selected as proxies will receive more requests.
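The routing decision described above can be sketched roughly as follows. This is an illustrative model, not the code from this PR: ProxyRoutingSketch, route, and the use of plain string node ids are hypothetical stand-ins. The sketch also assumes that we are connected only to the proxy nodes and that a direct send does not advance the round-robin position.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the routing decision; none of these names are Elasticsearch classes.
final class ProxyRoutingSketch {
    private final List<String> proxyNodes;    // node ids selected as proxies for the remote cluster
    private final Set<String> connectedNodes; // node ids we hold an open connection to (assumed: the proxies)
    private final AtomicInteger next = new AtomicInteger();

    ProxyRoutingSketch(List<String> proxyNodes) {
        if (proxyNodes.isEmpty()) {
            throw new IllegalArgumentException("need at least one proxy node");
        }
        this.proxyNodes = List.copyOf(proxyNodes);
        this.connectedNodes = Set.copyOf(proxyNodes);
    }

    /** Returns the id of the node that the request is physically sent to. */
    String route(String targetNode) {
        if (connectedNodes.contains(targetNode)) {
            // Already connected: send directly and skip the proxy hop.
            // Assumption: the round-robin counter is left untouched here.
            return targetNode;
        }
        // Not connected: hand the request to the next proxy in round-robin order.
        int index = Math.floorMod(next.getAndIncrement(), proxyNodes.size());
        return proxyNodes.get(index);
    }
}
```

In this model, a data node that is also a proxy receives its own direct requests on top of its round-robin share, which is the imbalance mentioned above.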
Collaborator
Pinging @elastic/es-search-aggs
s1monw (Contributor) approved these changes on Jun 12, 2018
Left some suggestions. LGTM.
 */
Transport.Connection getConnection(DiscoveryNode remoteClusterNode) {
    DiscoveryNode discoveryNode = connectedNodes.get();
    if (connectedNodes.contains(remoteClusterNode)) {
Contributor
Should we use transportService.nodeConnected instead? I mean, if for instance we have the local cluster configured as a remote cluster, we would get this optimization as well.
Contributor
Author
makes sense, great catch!
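For readers following the thread, the difference the suggestion makes can be sketched like this. It is a hypothetical illustration only: ConnectionCheckSketch and route are made-up names, and the Predicate merely stands in for a transport-level "is this node connected?" check; it is not the real TransportService API. The point is that a node we are connected to for any reason (for example a local-cluster node) is reached directly, even if it is not in the proxy list.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch; the Predicate is an assumed stand-in for a transport-level connection check.
final class ConnectionCheckSketch {
    /** Returns the node id the request is sent to; 'turn' is the caller's round-robin position. */
    static String route(String target, Predicate<String> nodeConnected, List<String> proxies, int turn) {
        if (nodeConnected.test(target)) {
            return target; // connected for any reason: no proxy hop needed
        }
        return proxies.get(Math.floorMod(turn, proxies.size()));
    }
}
```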
@Override
public void sendRequest(long requestId, String action, TransportRequest request, TransportRequestOptions options)
static class ProxyConnection implements Transport.Connection {
}

private static class ConnectedNodes implements Supplier<DiscoveryNode> {
private static class ConnectedNodes {
jasontedor added a commit to nik9000/elasticsearch that referenced this pull request on Jun 14, 2018
* elastic/master: (40 commits) [DOC] Extend SQL docs Immediately flush channel after writing to buffer (elastic#31301) [DOCS] Shortens ML API intros Use quotes in the call invocation (elastic#31249) move security ingest processors to a sub ingest directory (elastic#31306) Add 5.6.11 version constant. Fix version detection. SQL: Whitelist SQL utility class for better scripting (elastic#30681) [Docs] All Rollup docs experimental, agg limitations, clarify DeleteJob (elastic#31299) CCS: don't proxy requests for already connected node (elastic#31273) Mute ScriptedMetricAggregatorTests testSelfReferencingAggStateAfterMap [test] opensuse packaging turn up debug logging Add unreleased version 6.3.1 Removes experimental tag from scripted_metric aggregation (elastic#31298) [Rollup] Metric config parser must use builder so validation runs (elastic#31159) [ML] Check licence when datafeeds use cross cluster search (elastic#31247) Add notion of internal index settings (elastic#31286) Test: Remove broken yml test feature (elastic#31255) REST hl client: cluster health to default to cluster level (elastic#31268) [ML] Update test thresholds to account for changes to memory control (elastic#31289) ...
jasontedor added a commit to majormoses/elasticsearch that referenced this pull request on Jun 14, 2018
dnhatn added a commit that referenced this pull request on Jun 14, 2018
javanna added a commit that referenced this pull request on Jun 15, 2018
dnhatn added a commit that referenced this pull request on Jun 15, 2018
* 6.x: Upgrade to Lucene-7.4.0-snapshot-518d303506 (#31360) [ML] Implement new rules design (#31110) (#31294) Remove RestGetAllAliasesAction (#31308) CCS: don't proxy requests for already connected node (#31273) Rankeval: Fold template test project into main module (#31203) [Docs] Remove reference to repository-s3 plugin creating an S3 bucket (#31359) More detailed tracing when writing metadata (#31319) Add details section for dcg ranking metric (#31177)