Description
I've created a script that crawls the OpenSearch Jenkins builds to find test failures, but only for the Gradle checks that run on code after it is pushed to the main branch. This filters out failures that are due to unmerged code in work-in-progress PRs.
I've included below the output from crawling 2000 recent builds (approx. Oct 16 - Nov 14). The data is very hard to follow, but one thing stands out: SearchQueryIT.testCommonTermsQuery fails frequently, but only since build 29184 (Oct 28). There are no failures before that, which strongly suggests that something changed around Oct 28 and introduced the flakiness. I haven't started looking yet, but I suspect we'll be able to find the cause pretty quickly given that there is a point in time to start from.
Update Nov 16: the root cause was an unrelated change for concurrent search that randomly increased the number of deleted documents and exposed some underlying brittleness in this test: #11233. Diagnosing the root cause was a bit tricky and required diving into the specifics of how the common terms query works, but it was indeed much simpler once the flakiness was correlated to a small date range and then to a specific commit.
Surely there are better tools for visualizing test reports over time, perhaps already built into Jenkins? Also, we don't push that many commits so the sample size on builds after pushes to main isn't that large. Something like a nightly job to run the test suite 10 or 50 or 100 times and create a report on failures would help to quickly surface newly introduced flakiness.
$ ruby ~/flaky-test-finder-push-trigger-main.rb -s 27990 -e 29990
24 org.opensearch.indices.replication.SegmentReplicationIT.testSendCorruptBytesToReplica (28239,28239,28239,28239,28645,28645,28645,28645,28702,28702,28702,28702,28875,28875,28875,28875,28894,28894,28894,28894,28897,28897,28897,28897)
17 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/20_response_filtering/Nodes Stats with response filtering} (28276,28276,28276,28276,28278,28278,28278,28278,28765,28962,28962,28962,28962,28989,28989,28989,28989)
16 org.opensearch.repositories.s3.S3BlobStoreRepositoryTests.testRequestStats (28259,28259,28259,28259,28276,28276,28276,28276,28316,28316,28316,28316,28368,28368,28368,28368)
12 org.opensearch.search.aggregations.metrics.CardinalityWithRequestBreakerIT.testRequestBreaker {p0={"search.concurrent_segment_search.enabled":"true"}} (28051,28184,28251,28481,28502,28576,28727,28765,28766,28797,28841,28894)
9 org.opensearch.cluster.MinimumClusterManagerNodesIT.testThreeNodesNoClusterManagerBlock (28051,28576,28702,28713,28875,28897,29428,29666,29846)
9 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.nodes/10_basic/Test cat nodes output} (28276,28276,28276,28276,28278,28278,28278,28278,28765)
9 org.opensearch.index.shard.RemoteIndexShardTests.classMethod (28716,28716,28897,28897,28966,28966,29666,29666,29666)
8 org.opensearch.search.aggregations.metrics.CardinalityWithRequestBreakerIT.testRequestBreaker {p0={"search.concurrent_segment_search.enabled":"false"}} (28051,28481,28576,28765,28766,28797,28841,28894)
7 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.nodes/10_basic/Additional disk information} (28276,28276,28276,28276,28278,28278,28765)
7 org.opensearch.search.query.SearchQueryIT.testCommonTermsQuery {p0={"search.concurrent_segment_search.enabled":"true"}} (29184,29324,29343,29378,29506,29846,29954)
7 org.opensearch.search.query.SearchQueryIT.testCommonTermsQuery {p0={"search.concurrent_segment_search.enabled":"false"}} (29184,29324,29343,29378,29506,29846,29954)
6 org.opensearch.search.aggregations.metrics.CardinalityWithRequestBreakerIT.classMethod (28797,28797,28797,28841,28841,28841)
6 org.opensearch.cluster.service.MasterServiceTests.testClusterStateBatchedUpdates (28899,28905,28966,28989,28994,29003)
5 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - _all} (28765,28989,28989,28989,28989)
5 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/20_response_filtering/Nodes Stats filtered using both includes and excludes filters} (28278,28278,28278,28278,28989)
5 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/30_discovery/Discovery stats} (28765,28962,28966,28989,28989)
5 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.allocation/10_basic/Node ID} (28276,28276,28276,28276,28278)
4 org.opensearch.cluster.MinimumClusterManagerNodesIT.classMethod (28897,28897,28897,28897)
4 org.opensearch.action.admin.cluster.node.tasks.ResourceAwareTasksTests.testTaskResourceTrackingDuringTaskCancellation (28765,28766,29432,29508)
3 org.opensearch.index.shard.RemoteIndexShardTests.testSegRepSucceedsOnPreviousCopiedFiles (28716,28897,28966)
3 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreDisruptionIT.testCancelReplicationWhileFetchingMetadata (29070,29132,29274)
3 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreDisruptionIT.classMethod (29070,29132,29378)
3 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - blank} (28278,28765,28962)
3 org.opensearch.remotestore.RemoteIndexRecoveryIT.testSnapshotRecovery (28481,29432,29655)
3 org.opensearch.search.SearchWeightedRoutingIT.testMultiGetWithNetworkDisruption_FailOpenEnabled (28502,29561,29666)
3 org.opensearch.indices.replication.SegmentReplicationSuiteIT.testFullRestartDuringReplication (28671,28716,29561)
3 org.opensearch.smoketest.SmokeTestMultiNodeClientYamlTestSuiteIT.test {yaml=pit/10_basic/Delete all} (28702,28875,29132)
3 org.opensearch.search.aggregations.bucket.DiversifiedSamplerIT.testNestedDiversity {p0={"search.concurrent_segment_search.enabled":"true"}} (28706,28727,29343)
3 org.opensearch.search.aggregations.bucket.DiversifiedSamplerIT.testSimpleDiversity {p0={"search.concurrent_segment_search.enabled":"true"}} (28706,28727,29343)
2 org.opensearch.remotestore.RemoteStoreClusterStateRestoreIT.testFullClusterRestoreGlobalMetadata (29595,29655)
2 org.opensearch.index.shard.RemoteIndexShardTests.testRepicaCleansUpOldCommitsWhenReceivingNew (28239,29293)
2 org.opensearch.indices.replication.SegmentReplicationSuiteIT.classMethod (28716,29561)
2 org.opensearch.search.nested.SimpleNestedIT.testSimpleNestedSortingWithNestedFilterMissing {p0={"search.concurrent_segment_search.enabled":"true"}} (28682,29508)
1 org.opensearch.search.profile.query.QueryProfilerTests.testBasic {p0=5} (29044)
1 org.opensearch.repositories.azure.AzureBlobContainerRetriesTests.testReadBlobWithRetries (29132)
1 org.opensearch.remotestore.RemoteStoreStatsIT.testDownloadStatsCorrectnessSinglePrimaryMultipleReplicaShards (29132)
1 org.opensearch.remotestore.RemoteStoreStatsIT.testNonZeroPrimaryStatsOnNewlyCreatedIndexWithZeroDocs (29132)
1 org.opensearch.index.reindex.ReindexBasicTests.testMultipleSources (29177)
1 org.opensearch.index.reindex.ReindexBasicTests.testFiltering (29177)
1 org.opensearch.repositories.azure.AzureBlobContainerRetriesTests.testReadNonexistentBlobThrowsNoSuchFileException (29184)
1 org.opensearch.action.admin.indices.create.RemoteShrinkIndexIT.testCreateShrinkIndex (29279)
1 org.opensearch.action.admin.indices.create.RemoteShrinkIndexIT.classMethod (29279)
1 org.opensearch.discovery.ClusterDisruptionIT.classMethod (29293)
1 org.opensearch.search.SearchWeightedRoutingIT.testSearchAggregationWithNetworkDisruption_FailOpenEnabled (29293)
1 org.opensearch.repositories.azure.AzureBlobContainerRetriesTests.testReadRangeBlobWithRetries (29324)
1 org.opensearch.monitor.fs.FsHealthServiceTests.testFailsHealthOnHungIOBeyondHealthyTimeout (29324)
1 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreDisruptionIT.testCancelReplicationWhileSyncingSegments (29378)
1 org.opensearch.search.query.QueryProfilePhaseTests.testTerminateAfterEarlyTermination {p0=5 p1=org.opensearch.search.query.ConcurrentQueryPhaseSearcher@521ba38f} (29417)
1 org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testCreateSplitIndex (29536)
1 org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testCreateSplitIndexToN (29536)
1 org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testSplitFromOneToN (29536)
1 org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testSplitIndexPrimaryTerm (29536)
1 org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.classMethod (29536)
1 org.opensearch.search.SearchWeightedRoutingIT.testShardRoutingWithNetworkDisruption_FailOpenEnabled (29595)
1 org.opensearch.index.shard.RemoteIndexShardTests.testSegmentReplication_With_EngineClosedConcurrently (29666)
1 org.opensearch.index.shard.IndexShardTests.testCommitLevelRestoreShardFromRemoteStore (29729)
1 org.opensearch.index.translog.RemoteFsTranslogTests.testMetadataFileDeletion (28027)
1 org.opensearch.search.query.QueryProfilePhaseTests.testTerminateAfterEarlyTermination {p0=5 p1=org.opensearch.search.query.ConcurrentQueryPhaseSearcher@1d1c37d5} (29821)
1 org.opensearch.repositories.azure.AzureBlobContainerRetriesTests.testWriteLargeBlob (28051)
1 org.opensearch.search.query.QueryProfilePhaseTests.testTerminateAfterEarlyTermination {p0=5 p1=org.opensearch.search.query.ConcurrentQueryPhaseSearcher@c83ed77} (28521)
1 org.opensearch.search.SearchTimeoutIT.testSimpleTimeout {p0={"search.concurrent_segment_search.enabled":"false"}} (28576)
1 org.opensearch.remotestore.RemoteStoreStatsIT.testDownloadStatsCorrectnessSinglePrimarySingleReplica (28671)
1 org.opensearch.remotestore.multipart.RemoteStoreMultipartIT.testRestoreSnapshotToIndexWithSameNameDifferentUUID (28706)
1 org.opensearch.search.basic.SearchWithRandomIOExceptionsIT.testRandomDirectoryIOExceptions {p0={"search.concurrent_segment_search.enabled":"true"}} (28706)
1 org.opensearch.search.basic.SearchWithRandomIOExceptionsIT.classMethod (28706)
1 org.opensearch.indices.replication.SegmentReplicationSuiteIT.testBasicReplication (28716)
1 org.opensearch.indices.replication.SegmentReplicationSuiteIT.testDeleteIndexWhileReplicating (28716)
1 org.opensearch.remotestore.RemoteStoreClusterStateRestoreIT.testFullClusterStateRestore (28727)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - indexing doc_status} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/50_indexing_pressure/Indexing pressure stats} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - recovery} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/10_basic/Nodes stats level} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/50_indexing_pressure/Indexing pressure memory limit} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - _all include_segment_file_sizes} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - multi} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - indices _all} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/11_indices_metrics/Metric - one} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=nodes.stats/40_store_stats/Store stats} (28765)
1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cat.fielddata/10_basic/Test cat fielddata output} (28765)
1 org.opensearch.test.rest.ClientYamlTestSuiteIT.test {p0=search.aggregation/20_terms/string profiler via global ordinals} (28765)
1 org.opensearch.action.bulk.BulkIntegrationIT.testDeleteIndexWhileIndexing (28797)
1 org.opensearch.action.bulk.BulkIntegrationIT.testBulkWithWriteIndexAndRouting (28797)
1 org.opensearch.action.bulk.BulkIntegrationIT.testDocIdTooLong (28797)
1 org.opensearch.action.bulk.BulkIntegrationIT.testBulkIndexCreatesMapping (28797)
1 org.opensearch.action.bulk.BulkIntegrationIT.testBulkWithGlobalDefaults (28797)
1 org.opensearch.search.functionscore.DecayFunctionScoreIT.classMethod (28813)
1 org.opensearch.snapshots.DedicatedClusterSnapshotRestoreIT.testIndexDeletionDuringSnapshotCreationInQueue (28841)
1 org.opensearch.repositories.azure.AzureBlobStoreRepositoryTests.testMultipleSnapshotAndRollback (28875)
1 org.opensearch.client.PitIT.testDeleteAllAndListAllPits (28899)
1 org.opensearch.repositories.azure.AzureBlobStoreRepositoryTests.testContainerCreationAndDeletion (29044)