Concurrent Searching #1286

**Is your feature request related to a problem? Please describe.**
At least since Apache Lucene 6.x, there has been an experimental low-level API that allows search execution to be parallelized across segments [3]. As of the latest release, Apache Lucene 8.10.1, the API is still marked as experimental (see [1]). Community feedback on the feature has been positive so far (see [2]), and there is a good chance that, for certain kinds of indices, parallelizing the search over segments could bring performance benefits.

[1] https://lucene.apache.org/core/8_10_1/core/org/apache/lucene/search/IndexSearcher.html#search-org.apache.lucene.search.Query-org.apache.lucene.search.CollectorManager-
[2] https://engineeringblog.yelp.com/2021/09/nrtsearch-yelps-fast-scalable-and-cost-effective-search-engine.html
[3] https://blog.mikemccandless.com/2019/10/concurrent-query-execution-in-apache.html

**Describe the solution you'd like**
Since the API is experimental, the essential parts are that the feature should be controlled by a setting and should run on a dedicated, configurable thread pool:
- **"search.allow_concurrent_segment_search"**, default value is **false**
- **"index_searcher"** thread pool (default number of threads == number of cores)
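
A minimal sketch of how such a gate and pool might fit together, using only the JDK. The setting key and the pool name come from the list above; the class `ConcurrentSearchGate`, its methods, and the toy "segments" (plain `int[]` arrays of doc IDs) are hypothetical and do not reflect the actual settings or thread pool infrastructure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: only the setting key and the pool name are taken from the proposal.
class ConcurrentSearchGate {
    static final String SETTING_KEY = "search.allow_concurrent_segment_search";
    static final String POOL_NAME = "index_searcher";

    // The feature is off unless the setting is explicitly "true" (default: false).
    static boolean isEnabled(Map<String, String> settings) {
        return Boolean.parseBoolean(settings.getOrDefault(SETTING_KEY, "false"));
    }

    // Dedicated pool; the default size equals the number of available cores.
    static ExecutorService newIndexSearcherPool() {
        int threads = Runtime.getRuntime().availableProcessors();
        return Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r, POOL_NAME);
            t.setDaemon(true);
            return t;
        });
    }

    // Counts docs across toy segments, sequentially or on the pool depending on the setting.
    static int countHits(List<int[]> segments, Map<String, String> settings, ExecutorService pool) {
        int total = 0;
        if (!isEnabled(settings)) {
            for (int[] segment : segments) total += segment.length;
            return total;
        }
        List<Future<Integer>> futures = new ArrayList<>();
        for (int[] segment : segments) {
            Callable<Integer> task = () -> segment.length;
            futures.add(pool.submit(task));
        }
        try {
            for (Future<Integer> f : futures) total += f.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
        return total;
    }

    public static void main(String[] args) {
        ExecutorService pool = newIndexSearcherPool();
        List<int[]> segments = List.of(new int[3], new int[2], new int[5]);
        System.out.println(countHits(segments, Map.of(SETTING_KEY, "true"), pool));
        pool.shutdown();
    }
}
```

Keeping the sequential path as the default branch means that turning the setting off restores today's behaviour without touching the pool.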

The change, although quite complex, is mostly isolated to `QueryPhase` and `QueryCollectorContext` (and surrounding classes).

**Describe alternatives you've considered**
N/A

**Additional context**
Currently, the search implementation assumes a sequential flow: the results are accumulated by individual collectors (backed by collector contexts) and post-processed at the end. It has to be changed to use `CollectorManager`s and reducers instead to assemble the final query results.
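
The shift from a single accumulating collector to a manager plus a reduce step can be sketched with plain JDK types. The `Manager` interface below is a simplified stand-in modelled on the shape of Lucene's `CollectorManager` (one collector per slice, then one `reduce` over all of them); it is not the Lucene interface itself, and `SliceCollector`, `CountingCollector`, `CountingManager`, and the `int[]` "segments" are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CollectorManagerSketch {
    // Stand-in for a per-slice collector: it only ever sees the docs of one segment.
    interface SliceCollector {
        void collect(int doc);
    }

    // Stand-in for a collector manager: creates one collector per slice and merges them at the end.
    interface Manager<C extends SliceCollector, T> {
        C newCollector();
        T reduce(Collection<C> collectors);
    }

    static final class CountingCollector implements SliceCollector {
        int count;
        @Override public void collect(int doc) { count++; }
    }

    static final class CountingManager implements Manager<CountingCollector, Integer> {
        @Override public CountingCollector newCollector() { return new CountingCollector(); }
        @Override public Integer reduce(Collection<CountingCollector> collectors) {
            int total = 0;
            for (CountingCollector c : collectors) total += c.count;
            return total;
        }
    }

    // Runs one task per segment on the pool, then reduces all per-slice collectors on the caller thread.
    static <C extends SliceCollector, T> T search(List<int[]> segments, Manager<C, T> manager, ExecutorService pool) {
        List<Future<C>> futures = new ArrayList<>();
        for (int[] segment : segments) {
            C collector = manager.newCollector();
            Callable<C> task = () -> {
                for (int doc : segment) collector.collect(doc);
                return collector;
            };
            futures.add(pool.submit(task));
        }
        List<C> finished = new ArrayList<>();
        try {
            for (Future<C> f : futures) finished.add(f.get());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
        return manager.reduce(finished);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<int[]> segments = List.of(new int[] {0, 1, 2}, new int[] {3}, new int[] {4, 5});
        System.out.println(search(segments, new CountingManager(), pool));
        pool.shutdown();
    }
}
```

Because each collector is confined to one task and only read after `Future.get()` returns, no per-collector locking is needed in this sketch; all cross-slice merging happens in `reduce`.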

The impediments: early termination and time-bounded search are **exception driven**. This is difficult to replicate as-is, because in that case the flow is interrupted by the exception and the reducers are not available.
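
To make the impediment concrete, here is a JDK-only sketch that contrasts the two behaviours. With `contain == false` the termination exception escapes the slice task and the merge step is never reached; with `contain == true` each slice catches it, records a flag, and still hands back its partial result. This is one possible direction under stated assumptions, not the approach taken in the code base: `EarlyTerminationException`, `LimitedCollector`, and `SliceResult` are hypothetical names, and the per-slice hit limit is a simplification of a real (global) termination condition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class EarlyTerminationSketch {
    // Hypothetical control-flow exception, standing in for exception-driven termination.
    static final class EarlyTerminationException extends RuntimeException {}

    // Partial result of one slice (or of the merged search), plus whether it was cut short.
    static final class SliceResult {
        final int hits;
        final boolean terminatedEarly;
        SliceResult(int hits, boolean terminatedEarly) {
            this.hits = hits;
            this.terminatedEarly = terminatedEarly;
        }
    }

    // Collector that throws once it is asked to collect beyond its limit.
    static final class LimitedCollector {
        final int maxHits;
        int hits;
        LimitedCollector(int maxHits) { this.maxHits = maxHits; }
        void collect(int doc) {
            if (hits >= maxHits) throw new EarlyTerminationException();
            hits++;
        }
    }

    static SliceResult search(List<int[]> segments, int maxHitsPerSlice, boolean contain, ExecutorService pool) {
        List<Future<SliceResult>> futures = new ArrayList<>();
        for (int[] segment : segments) {
            Callable<SliceResult> task = () -> {
                LimitedCollector collector = new LimitedCollector(maxHitsPerSlice);
                try {
                    for (int doc : segment) collector.collect(doc);
                    return new SliceResult(collector.hits, false);
                } catch (EarlyTerminationException e) {
                    if (!contain) throw e; // exception escapes the task, as in the sequential design
                    return new SliceResult(collector.hits, true); // keep the partial result
                }
            };
            futures.add(pool.submit(task));
        }
        int hits = 0;
        boolean early = false;
        try {
            for (Future<SliceResult> f : futures) { // the "reduce" step of this sketch
                SliceResult r = f.get();
                hits += r.hits;
                early |= r.terminatedEarly;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            // The merge loop above is abandoned: partial results of other slices are lost.
            throw new IllegalStateException("search aborted before reduce", e.getCause());
        }
        return new SliceResult(hits, early);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        SliceResult r = search(List.of(new int[5], new int[2]), 3, true, pool);
        System.out.println(r.hits + " hits, terminatedEarly=" + r.terminatedEarly);
        pool.shutdown();
    }
}
```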

It would make sense to come up with benchmarks that compare sequential and parallel segment search and show when each of them is useful. Once such evidence is collected, the engine itself may provide hints at runtime recommending that the feature be switched on or off (probably on a per-index basis).
