[BWC] Ensure 2.x compatibility with Legacy 7.10.x #1902

Merged
dblock merged 4 commits into opensearch-project:main from nknize:bwc/fix2xTransportHandshake on Jan 17, 2022

@nknize
Contributor

@nknize nknize commented Jan 14, 2022

This PR fixes TransportHandshaker to send a spoofed Legacy 7.10.2 mincompat
version so that OpenSearch 2.x nodes can join a Legacy 7.10.x cluster for
rolling upgrade support. Without this change, the mixed-cluster BWC tests for
7.10.x and OpenSearch 2.x were failing.
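
For readers unfamiliar with the idea, here is a minimal sketch of what "sending a spoofed mincompat version" could look like. It is not the code in this PR: the class `HandshakeVersionSketch`, the method `handshakeVersionToSend`, the version-id encoding, and the pass-through behavior for 1.x nodes are all assumptions made for illustration.

```java
// Hypothetical sketch only; this is NOT the actual TransportHandshaker code.
final class HandshakeVersionSketch {

    // Assumed id encoding: major * 1_000_000 + minor * 10_000 + revision * 100 + build.
    static int versionId(int major, int minor, int revision, int build) {
        return major * 1_000_000 + minor * 10_000 + revision * 100 + build;
    }

    // Assumed id for Legacy 7.10.2 (build 99 is taken here to mean a release build).
    static final int LEGACY_7_10_2 = versionId(7, 10, 2, 99);

    /**
     * Picks the version id to advertise in the handshake request.
     *
     * @param localMajor    major version of the sending OpenSearch node
     * @param realMinCompat id the node would otherwise advertise (hypothetical input)
     */
    static int handshakeVersionToSend(int localMajor, int realMinCompat) {
        if (localMajor >= 2) {
            // Spoof: advertise Legacy 7.10.2 instead of the node's real
            // minimum-compatibility version, so a Legacy 7.10.x node is
            // offered a version it recognizes.
            return LEGACY_7_10_2;
        }
        // Simplification of this sketch: other nodes send their value unchanged.
        return realMinCompat;
    }

    public static void main(String[] args) {
        int realMinCompat = versionId(1, 0, 0, 99); // hypothetical value
        System.out.println(handshakeVersionToSend(2, realMinCompat));
        System.out.println(handshakeVersionToSend(1, realMinCompat));
    }
}
```

The point of the sketch is only that the substitution happens on the sending side, before the remote node's real version is known.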

@nknize nknize added the :test (Adding or fixing a test), v2.0.0 (Version 2.0.0), and backwards-compatibility labels Jan 14, 2022
@nknize nknize requested a review from a team as a code owner January 14, 2022 04:31
@opensearch-ci-bot
Collaborator

Can one of the admins verify this patch?

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure db4542631a5781ababb07ef25f737cdaf398c4d6
Log 1914

Reports 1914

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure ad1d9029d41a8ffb381281bc59559e977d8843e2
Log 1916

Reports 1916

@nknize nknize force-pushed the bwc/fix2xTransportHandshake branch from ad1d902 to 48b8bf7 January 14, 2022 05:47
@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 48b8bf739a04111d8cead96ddb31aefed079f492
Log 1918

Reports 1918

@nknize
Contributor Author

nknize commented Jan 14, 2022

Last check failed with a non-reproducible test; documenting for posterity:

REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.index.shard.IndexShardIT.testExpectedShardSizeIsPresent" -Dtests.seed=9C6FF4BF9EE79C60 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=ar-LY -Dtests.timezone=Asia/Novokuznetsk -Druntime.java=17
2> java.lang.AssertionError
        at __randomizedtesting.SeedInfo.seed([9C6FF4BF9EE79C60:D0C2765DABB1D7ED]:0)
        at org.junit.Assert.fail(Assert.java:87)
        at org.junit.Assert.assertTrue(Assert.java:42)
        at org.junit.Assert.assertTrue(Assert.java:53)
        at org.opensearch.index.shard.IndexShardIT.testExpectedShardSizeIsPresent(IndexShardIT.java:307)
  1> [2022-01-14T12:58:28,900][INFO ][o.o.i.s.IndexShardIT     ] [testLimitNumberOfRetainedTranslogFiles] before test

Also looks like an unexpected circuit breaker was tripped; likely unrelated to the test failure:

Caused by: org.opensearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [<agg [foo_terms]>] would be [6948/6.7kb], which is larger than the limit of [1024/1kb], usages [request=5120/5kb, fielddata=0/0b, in_flight_requests=0/0b, accounting=1828/1.7kb]
  1> 	at org.opensearch.indices.breaker.HierarchyCircuitBreakerService.checkParentLimit(HierarchyCircuitBreakerService.java:484) ~[main/:?]
  1> 	at org.opensearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMemoryCircuitBreaker.java:133) ~[main/:?]
  1> 	at org.opensearch.search.aggregations.AggregatorBase.addRequestCircuitBreakerBytes(AggregatorBase.java:167) ~[main/:?]
  1> 	at org.opensearch.search.aggregations.AggregatorBase.<init>(AggregatorBase.java:129) ~[main/:?]
  1> 	at org.opensearch.search.aggregations.bucket.BucketsAggregator.<init>(BucketsAggregator.java:78) ~[main/:?]
  1> 	at org.opensearch.search.aggregations.bucket.DeferableBucketAggregator.<init>(DeferableBucketAggregator.java:64) ~[main/:?]
  1> 	at org.opensearch.search.aggregations.bucket.terms.TermsAggregator.<init>(TermsAggregator.java:206) ~[main/:?]
  1> 	at org.opensearch.search.aggregations.bucket.terms.AbstractStringTermsAggregator.<init>(AbstractStringTermsAggregator.java:65) ~[main/:?]
  1> 	at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.<init>(GlobalOrdinalsStringTermsAggregator.java:112) ~[main/:?]
  1> 	at org.opensearch.search.aggregations.bucket.terms.TermsAggregatorFactory$ExecutionMode$2.create(TermsAggregatorFactory.java:487) ~[main/:?]
  1> 	at org.opensearch.search.aggregations.bucket.terms.TermsAggregatorFactory$1.build(TermsAggregatorFactory.java:135) ~[main/:?]
  1> 	at org.opensearch.search.aggregations.bucket.terms.TermsAggregatorFactory.doCreateInternal(TermsAggregatorFactory.java:306) ~[main/:?]
  1> 	at org.opensearch.search.aggregations.support.ValuesSourceAggregatorFactory.createInternal(ValuesSourceAggregatorFactory.java:71) ~[main/:?]
  1> 	at org.opensearch.search.aggregations.AggregatorFactory.create(AggregatorFactory.java:96) ~[main/:?]
  1> 	at org.opensearch.search.aggregations.AggregatorFactories.createTopLevelAggregators(AggregatorFactories.java:276) ~[main/:?]
  1> 	at org.opensearch.search.aggregations.AggregationPhase.preProcess(AggregationPhase.java:63) ~[main/:?]
  1> 	at org.opensearch.search.query.QueryPhase.execute(QueryPhase.java:161) ~[main/:?]

@nknize
Contributor Author

nknize commented Jan 14, 2022

Pushed commits to the development branch do not seem to be updating this PR. :trollface:

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure bc945e4d9cd8648305c374a7ec89c33d9ef67caa
Log 1922

Reports 1922

@nknize
Contributor Author

nknize commented Jan 14, 2022

Another failure that can't be reproduced! (╯°□°)╯︵ ┻━┻

REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.cluster.routing.allocation.decider.MockDiskUsagesIT.testRerouteOccursOnDiskPassingHighWatermark" -Dtests.seed=994DB46D3A71E388 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=pt -Dtests.timezone=Asia/Tel_Aviv -Druntime.java=17

Looks like a node timeout issue at MockDiskUsagesIT.java#L166

1> [2022-01-14T19:03:26,620][WARN ][o.o.c.NodeConnectionsService] [node_t1] failed to connect to {node_t0}{C3fT4Fp9SjmjuepSij5_0Q}{IqnPkDZ7TZu52WOm_HWKOA}{127.0.0.1}{127.0.0.1:43583}{dimr}{shard_indexing_pressure_enabled=true} (tried [1] times)
  1> org.opensearch.transport.ConnectTransportException: [node_t0][127.0.0.1:43583] connect_exception

Gave up after one try... valiant effort (。々°)

@dblock
Member

dblock commented Jan 14, 2022

testRerouteOccursOnDiskPassingHighWatermark

This is a new one. Open an issue and link back to #1715.

@reta
Contributor

reta commented Jan 14, 2022

Uh ... Never seen MockDiskUsagesIT failing ... yet ...

@nknize
Contributor Author

nknize commented Jan 14, 2022

I've never seen a PR not update after pushing to the upstream branch... yet. (⊙_◎)

@nknize
Contributor Author

nknize commented Jan 14, 2022

Opened an issue regarding the node timeout.

I'll give some time for the internet to reboot and re-fire gradle check

This commit fixes TransportHandshaker to send a spoofed Legacy 7.10.2 mincompat
version to ensure OpenSearch 2.x nodes can join a Legacy 7.10.x cluster for
rolling upgrade support. Without this change 7.10.x and OpenSearch 2.x mixed
cluster bwc tests would fail.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
@nknize nknize force-pushed the bwc/fix2xTransportHandshake branch from 78625ee to 2601e64 January 14, 2022 19:03
@nknize
Contributor Author

nknize commented Jan 14, 2022

The internet rebooted successfully and latest commits synced. Welcome back from the coma, github

@opensearch-ci-bot
Collaborator

✅   Gradle Check success 2601e64
Log 1927

Reports 1927

@dblock dblock merged commit 81d998d into opensearch-project:main Jan 17, 2022

Labels

backport 1.x, backwards-compatibility, pending backport (Identifies an issue or PR that still needs to be backported), :test (Adding or fixing a test), v2.0.0 (Version 2.0.0)

4 participants