[Concurrent Segment Search] IT test failures with concurrent search enabled #7357

@sohami

Description

As part of this task, we will run all the search integration tests (ITs) with concurrent segment search enabled and force multiple slices, so that the different test failures surface. Based on the findings, we will make the changes needed to fix them. The failures known so far are:

Fixed:

Cardinality/Nested (Tracked in #8095)

 - org.opensearch.search.aggregations.metrics.CardinalityWithRequestBreakerIT.testRequestBreaker
 - org.opensearch.search.aggregations.metrics.CardinalityWithRequestBreakerIT.classMethod
`
REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.aggregations.metrics.CardinalityWithRequestBreakerIT" -Dtests.seed=5A2A89155E5AADBB -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en-US -Dtests.timezone=UTC -Druntime.java=20
`
- org.opensearch.search.aggregations.metrics.CardinalityIT.testMultiValuedString
`REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.aggregations.metrics.CardinalityIT.testMultiValuedString" -Dtests.seed=5A2A89155E5AADBB -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=es-PE -Dtests.timezone=Asia/Qyzylorda -Druntime.java=20`

- org.opensearch.search.aggregations.bucket.NestedIT.testNestedAsSubAggregation
`REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.aggregations.bucket.NestedIT.testNestedAsSubAggregation" -Dtests.seed=5A2A89155E5AADBB -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=mk-MK -Dtests.timezone=Europe/Minsk -Druntime.java=20`

Cancellation related failures

- org.opensearch.search.SearchCancellationIT.testCancellationOfScrollSearchesOnFollowupRequests
`
REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.SearchCancellationIT.testCancellationOfScrollSearchesOnFollowupRequests" -Dtests.seed=8F971E1485F9561C -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en-CA -Dtests.timezone=America/Kentucky/Louisville -Druntime.java=20
`
 - org.opensearch.search.SearchCancellationIT.testMSearchChildReqCancellationWithHybridTimeout
`
REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.SearchCancellationIT.testMSearchChildReqCancellationWithHybridTimeout" -Dtests.seed=8F971E1485F9561C -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en-CA -Dtests.timezone=America/Kentucky/Louisville -Druntime.java=20
`
 - org.opensearch.search.SearchCancellationIT.testCancellationOfScrollSearches
`
REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.SearchCancellationIT.testCancellationOfScrollSearches" -Dtests.seed=8F971E1485F9561C -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en-CA -Dtests.timezone=America/Kentucky/Louisville -Druntime.java=20
`
- org.opensearch.search.SearchTimeoutIT.testSimpleTimeout
`
REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.SearchTimeoutIT.testSimpleTimeout" -Dtests.seed=7D0629262BAB81B8 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=sr-Latn-RS -Dtests.timezone=America/Kentucky/Louisville -Druntime.java=20
`
 - org.opensearch.search.SearchCancellationIT.testMSearchChildRequestCancellationWithClusterLevelTimeout
`
REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.SearchCancellationIT.testMSearchChildRequestCancellationWithClusterLevelTimeout" -Dtests.seed=8F971E1485F9561C -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en-CA -Dtests.timezone=America/Kentucky/Louisville -Druntime.java=20
`
 - org.opensearch.search.SearchCancellationIT.testCancellationDuringQueryPhase
`REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.SearchCancellationIT.testCancellationDuringQueryPhase" -Dtests.seed=8F971E1485F9561C -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en-CA -Dtests.timezone=America/Kentucky/Louisville -Druntime.java=20`

- org.opensearch.search.SearchCancellationIT.testCancellationDuringQueryPhaseUsingClusterSetting
`
REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.SearchCancellationIT.testCancellationDuringQueryPhaseUsingClusterSetting" -Dtests.seed=8F971E1485F9561C -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en-CA -Dtests.timezone=America/Kentucky/Louisville -Druntime.java=20
`

- org.opensearch.search.SearchCancellationIT.testCancellationDuringQueryPhaseUsingRequestParameter
`
REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.SearchCancellationIT.testCancellationDuringQueryPhaseUsingRequestParameter" -Dtests.seed=8F971E1485F9561C -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en-CA -Dtests.timezone=America/Kentucky/Louisville -Druntime.java=20
`

SignificantTerms (Tracked in #8509)

 - org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testBackgroundVsSeparateSet
 - org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testScoresEqualForPositiveAndNegative
 - org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testXContentResponse
`
REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.aggregations.bucket.TermsShardMinDocCountIT.testShardMinDocCountTermsTest" -Dtests.seed=5A2A89155E5AADBB -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=nb -Dtests.timezone=Africa/Addis_Ababa -Druntime.java=20
`

- org.opensearch.search.ConcurrentSegmentSearchTimeoutIT.testSimpleTimeout

REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.ConcurrentSegmentSearchTimeoutIT.testSimpleTimeout" -Dtests.seed=1D45D9B19138B210 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=ro -Dtests.timezone=Africa/Casablanca -Druntime.java=20

org.opensearch.search.ConcurrentSegmentSearchTimeoutIT > testSimpleTimeout FAILED
    java.lang.AssertionError
        at __randomizedtesting.SeedInfo.seed([1D45D9B19138B210:5E1F30E7A748E513]:0)
        at org.junit.Assert.fail(Assert.java:87)
        at org.junit.Assert.assertTrue(Assert.java:42)
        at org.junit.Assert.assertTrue(Assert.java:53)
        at org.opensearch.search.SearchTimeoutIT.testSimpleTimeout(SearchTimeoutIT.java:82)
        at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
        at java.base/java.lang.reflect.Method.invoke(Method.java:578)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)

- org.opensearch.search.SearchTimeoutIT.testSimpleTimeout

REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.SearchTimeoutIT.testSimpleTimeout" -Dtests.seed=1D45D9B19138B210 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=el-CY -Dtests.timezone=America/Yellowknife -Druntime.java=20
org.opensearch.search.SearchTimeoutIT > testSimpleTimeout FAILED
    java.lang.AssertionError
        at __randomizedtesting.SeedInfo.seed([1D45D9B19138B210:5E1F30E7A748E513]:0)
        at org.junit.Assert.fail(Assert.java:87)
        at org.junit.Assert.assertTrue(Assert.java:42)
        at org.junit.Assert.assertTrue(Assert.java:53)
        at org.opensearch.search.SearchTimeoutIT.testSimpleTimeout(SearchTimeoutIT.java:82)

Profiler IT (#8801)

- org.opensearch.search.aggregations.bucket.GlobalIT.testWithStatsSubAggregatorAndProfileEnabled

`REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.aggregations.bucket.GlobalIT.testWithStatsSubAggregatorAndProfileEnabled" -Dtests.seed=5E3CC4E0C76B87A9 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=fr -Dtests.timezone=America/Guatemala -Druntime.java=20`
        
- org.opensearch.search.profile.aggregation.AggregationProfilerIT
 
` REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.profile.aggregation.AggregationProfilerIT" -Dtests.seed=5E3CC4E0C76B87A9 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en-US -Dtests.timezone=UTC -Druntime.java=20`

Shard Size parameter related failures (Tracked in #8860)

 - org.opensearch.search.aggregations.bucket.TermsShardMinDocCountIT.testShardMinDocCountTermsTest
`
REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.aggregations.bucket.TermsShardMinDocCountIT.testShardMinDocCountTermsTest" -Dtests.seed=5A2A89155E5AADBB -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=nb -Dtests.timezone=Africa/Addis_Ababa -Druntime.java=20
`
 - org.opensearch.search.aggregations.bucket.TermsShardMinDocCountIT.testShardMinDocCountSignificantTermsTest
 - org.opensearch.search.aggregations.bucket.ShardSizeTermsIT.testShardSizeEqualsSizeDouble
 - org.opensearch.search.aggregations.bucket.ShardSizeTermsIT.testShardSizeEqualsSizeString

DocCountError: (#9209)

- org.opensearch.search.aggregations.bucket.TermsDocCountErrorIT.testFixedDocs
`./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.aggregations.bucket.TermsDocCountErrorIT.testFixedDocs" -Dtests.seed=E2AE5B9415455EC3 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=lt -Dtests.timezone=Europe/Brussels`

Circuit breaker (CB) failures (#9124)

- org.opensearch.indices.memory.breaker.CircuitBreakerServiceIT.testMemoryBreaker

2023-07-26T13:52:14.1470413Z   2> REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.indices.memory.breaker.CircuitBreakerServiceIT.testMemoryBreaker" -Dtests.seed=41D55BF4F404C7FC -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=es -Dtests.timezone=Antarctica/Macquarie -Druntime.java=20
2023-07-26T13:52:14.1471308Z   2> java.lang.AssertionError: 
2023-07-26T13:52:14.1471585Z     Expected: <INTERNAL_SERVER_ERROR>
2023-07-26T13:52:14.1471887Z          but: was <TOO_MANY_REQUESTS>
2023-07-26T13:52:14.1472222Z         at __randomizedtesting.SeedInfo.seed([41D55BF4F404C7FC:52BB2E5D942C2472]:0)
2023-07-26T13:52:14.1472664Z         at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
2023-07-26T13:52:14.1473112Z         at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
2023-07-26T13:52:14.1473800Z         at org.opensearch.test.hamcrest.OpenSearchAssertions.assertFailures(OpenSearchAssertions.java:362)
2023-07-26T13:52:14.1474584Z         at org.opensearch.indices.memory.breaker.CircuitBreakerServiceIT.testMemoryBreaker(CircuitBreakerServiceIT.java:175)

- org.opensearch.indices.memory.breaker.CircuitBreakerServiceIT.testRamAccountingTermsEnum

REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.indices.memory.breaker.CircuitBreakerServiceIT.testRamAccountingTermsEnum" -Dtests.seed=41D55BF4F404C7FC -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=es -Dtests.timezone=Antarctica/Macquarie -Druntime.java=20
2023-07-26T13:52:12.8125014Z 
2023-07-26T13:52:12.8125792Z org.opensearch.indices.memory.breaker.CircuitBreakerServiceIT > testRamAccountingTermsEnum FAILED
2023-07-26T13:52:12.8126273Z     java.lang.AssertionError: 
2023-07-26T13:52:12.8126554Z     Expected: <INTERNAL_SERVER_ERROR>
2023-07-26T13:52:12.8126807Z          but: was <TOO_MANY_REQUESTS>
2023-07-26T13:52:12.8127174Z         at __randomizedtesting.SeedInfo.seed([41D55BF4F404C7FC:CB932CD95149F85E]:0)
2023-07-26T13:52:12.8127621Z         at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
2023-07-26T13:52:12.8128072Z         at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
2023-07-26T13:52:12.8128647Z         at org.opensearch.test.hamcrest.OpenSearchAssertions.assertFailures(OpenSearchAssertions.java:362)
2023-07-26T13:52:12.8129495Z         at org.opensearch.indices.memory.breaker.CircuitBreakerServiceIT.testRamAccountingTermsEnum(CircuitBreakerServiceIT.java:232)

Sort (Tracked in #8510)

 - org.opensearch.search.sort.SimpleSortIT.testSimpleSorts
 - org.opensearch.search.sort.FieldSortIT.testScriptFieldSort
`REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.sort.FieldSortIT.testScriptFieldSort" -Dtests.seed=7D0629262BAB81B8 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=sr-RS -Dtests.timezone=Asia/Shanghai -Druntime.java=20`


2023-08-07T15:30:35.1768111Z org.opensearch.search.sort.FieldSortIT > testScriptFieldSort FAILED
2023-08-07T15:30:35.1768733Z     java.lang.AssertionError: 
2023-08-07T15:30:35.1769142Z     Expected: <1.0>
2023-08-07T15:30:35.1769583Z          but: was <2.0>
2023-08-07T15:30:35.1770113Z         at __randomizedtesting.SeedInfo.seed([9B3FCFB2B72BF5A1:B6E047C8A36E5DE9]:0)
2023-08-07T15:30:35.1770659Z         at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
2023-08-07T15:30:35.1771399Z         at org.junit.Assert.assertThat(Assert.java:964)
2023-08-07T15:30:35.1771916Z         at org.junit.Assert.assertThat(Assert.java:930)
2023-08-07T15:30:35.1773585Z REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.sort.FieldSortIT.testScriptFieldSort" -Dtests.seed=9B3FCFB2B72BF5A1 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en-MT -Dtests.timezone=America/North_Dakota/Beulah -Druntime.java=20
2023-08-07T15:30:35.1774769Z         at org.opensearch.search.sort.FieldSortIT.testScriptFieldSort(FieldSortIT.java:1993)

parent-join (#9469)

:modules:parent-join:internalClusterTest

 - org.opensearch.join.aggregations.ParentIT.testSimpleParentAgg

REPRODUCE WITH: ./gradlew ':modules:parent-join:internalClusterTest' --tests "org.opensearch.join.aggregations.ParentIT.testSimpleParentAgg" -Dtests.seed=1D45D9B19138B210 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=da -Dtests.timezone=Etc/GMT-13 -Druntime.java=20

- org.opensearch.action.admin.cluster.node.tasks.ConcurrentSearchTasksIT.testConcurrentSearchTaskTracking

> Task :server:internalClusterTest
Suite: Test class org.opensearch.action.admin.cluster.node.tasks.ConcurrentSearchTasksIT
  2> REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.action.admin.cluster.node.tasks.ConcurrentSearchTasksIT.testConcurrentSearchTaskTracking" -Dtests.seed=1D45D9B19138B210 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=tr-TR -Dtests.timezone=Asia/Hovd -Druntime.java=20
  2> java.lang.AssertionError: expected:<2> but was:<7>
        at __randomizedtesting.SeedInfo.seed([1D45D9B19138B210:516860484A1B76D1]:0)
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:647)
        at org.junit.Assert.assertEquals(Assert.java:633)
        at org.opensearch.action.admin.cluster.node.tasks.ConcurrentSearchTasksIT.testConcurrentSearchTaskTracking(ConcurrentSearchTasksIT.java:112)
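
Triage helper (sketch)

The `REPRODUCE WITH:` lines above all share one shape: a Gradle task in single quotes, a `--tests` filter in double quotes, and a `-Dtests.seed=` value. As a rough aid for triage, the sketch below pulls those three fields out of a CI log and groups the failing tests by seed. It is not part of the OpenSearch build. The names `REPRO_RE`, `parse_repro_lines`, `group_by_seed` and `SAMPLE_LOG` are made up for illustration, and the sample log lines are abridged copies of commands from this issue.

```python
import re
from collections import defaultdict

# Hypothetical pattern for the "REPRODUCE WITH:" lines shown in this issue.
# It captures the Gradle task, the --tests filter and the randomized-testing seed.
REPRO_RE = re.compile(
    r"REPRODUCE WITH: \./gradlew '(?P<task>[^']+)' --tests \"(?P<test>[^\"]+)\""
    r".*?-Dtests\.seed=(?P<seed>[0-9A-F]+)"
)

# Abridged sample lines copied from this issue (trailing flags trimmed).
SAMPLE_LOG = """\
  2> REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.SearchTimeoutIT.testSimpleTimeout" -Dtests.seed=1D45D9B19138B210 -Dtests.security.manager=true
REPRODUCE WITH: ./gradlew ':modules:parent-join:internalClusterTest' --tests "org.opensearch.join.aggregations.ParentIT.testSimpleParentAgg" -Dtests.seed=1D45D9B19138B210 -Dtests.locale=da
2023-07-26T13:52:14.1470413Z   2> REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.indices.memory.breaker.CircuitBreakerServiceIT.testMemoryBreaker" -Dtests.seed=41D55BF4F404C7FC -Dtests.locale=es
"""


def parse_repro_lines(log_text):
    """Return unique (task, test, seed) tuples in the order they appear."""
    entries = []
    for match in REPRO_RE.finditer(log_text):
        entry = (match.group("task"), match.group("test"), match.group("seed"))
        if entry not in entries:  # CI logs often repeat the same line
            entries.append(entry)
    return entries


def group_by_seed(entries):
    """Map each seed to the list of test filters that failed under it."""
    groups = defaultdict(list)
    for _task, test, seed in entries:
        groups[seed].append(test)
    return dict(groups)


if __name__ == "__main__":
    for seed, tests in group_by_seed(parse_repro_lines(SAMPLE_LOG)).items():
        print(seed)
        for test in tests:
            print("  " + test)
```

Grouping by seed is only a convenience for spotting failures that came from the same CI run. Tests that share a seed do not necessarily share a root cause.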
