[Feature Request] Set `terminate_after` to `trackTotalHitsUpTo` for speedup in boolean queries with only filter clauses #18510

### Is your feature request related to a problem? Please describe

**Background**

This idea is loosely inspired by Lucene's `ImpactsDISI` + `MaxScoreCache`. On eligible queries, Lucene lets doc id set iterators skip blocks of docs with non-competitive scores once the collector for the query already has `trackTotalHitsUpTo` hits (10k by default). This is acceptable because after `trackTotalHitsUpTo` hits we've given up counting the total number of hits, so there's no reason to check whether non-competitive docs match the query at all. If docs have equal scores, ties are broken by picking the one with the lower doc id, which means later docs can safely be skipped.

This only applies in certain cases (exactly 1 `must` clause which uses impacts, plus conditions on its cost relative to the other clauses), but when it works it can give huge performance improvements, since it lets you skip *most* documents. On a simple 2-clause http_logs query this can be roughly a 90x speedup (2494 -> 28 ms).

Example query:
```
"bool": {
  "filter": {
    "match": {
      "status": "200"
    }
  },
  "must": {
    "match": {
      "request": "images"
    }
  }
}
```

**Idea**

Lucene does **not** skip non-competitive docs in an all-`filter` case, because it uses `ScoreMode.COMPLETE_NO_SCORES` instead of `ScoreMode.TOP_SCORES`. So the above query, but with both clauses being `filter`, can take much longer than 28 ms.

But in a filter clause, all documents are equally competitive, since scores don't matter. As before, we break ties by choosing the doc with the lower doc id, and we don't care about counting exact hits past `trackTotalHitsUpTo`. So, as I understand it, there's no benefit in continuing to iterate past `trackTotalHitsUpTo` matching documents. If we stopped early, we should get speedups similar to those `ImpactsDISI` gives us.

(This doesn't apply if we're doing aggs, pagination, sorting, or other similar things, but for a simple query I think it's correct.)
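To make the argument concrete, here is a toy simulation, not Lucene code. It assumes a single segment scanned in doc id order, a constant score for every match, and a simplified rule that the hit count is reported as `"gte"` once it reaches the threshold. The function name `collect_filter_hits` and its parameters are made up for illustration.

```python
from typing import Callable, List, Tuple


def collect_filter_hits(
    num_docs: int,
    matches: Callable[[int], bool],
    size: int = 10,
    track_total_hits_up_to: int = 10_000,
    stop_early: bool = False,
) -> Tuple[List[int], int, str, int]:
    """Toy model of collecting a filter-only query in doc id order.

    Returns (top_hits, reported_count, relation, docs_visited).
    """
    top_hits: List[int] = []
    counted = 0
    visited = 0
    for doc_id in range(num_docs):
        visited += 1
        if not matches(doc_id):
            continue
        counted += 1
        # All scores are equal, so the first `size` matches by doc id win.
        if len(top_hits) < size:
            top_hits.append(doc_id)
        # Past the threshold the exact count is no longer reported,
        # so nothing collected later can change the response.
        if stop_early and counted >= track_total_hits_up_to:
            break
    reported = min(counted, track_total_hits_up_to)
    # Simplifying assumption: the count is a lower bound once it hits the cap.
    relation = "gte" if counted >= track_total_hits_up_to else "eq"
    return top_hits, reported, relation, visited
```

With 100,000 docs where every even doc id matches and a threshold of 1,000, both modes return the same top hits and the same reported count, but the early-stopping run visits 1,999 docs instead of 100,000.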

I think this speedup would apply to a much broader class of boolean queries than `ImpactsDISI`, especially since we're planning on automatically rewriting certain `must` clauses into `filter` clauses, as described in https://github.com/opensearch-project/OpenSearch/issues/17586.

### Describe the solution you'd like

We already have a mechanism to stop iteration early after a certain number of collected docs: the query param `terminate_after`. This lives in the `SearchContext`. 

Ideally there would be some way for a top-level boolean query to signal to the `SearchContext` to set this value to `trackTotalHitsUpTo` if, after rewriting is done, there are only `filter` clauses. (I think this should also still work if there are `must_not` clauses.) This should be possible at least in theory, since `QueryBuilder` rewriting and conversion to `Query` are completed before we construct the `Collector`, which is the thing that needs to accept `terminate_after`.
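The eligibility check could look roughly like the sketch below. It works on a plain search request body rather than on `SearchContext`, purely to illustrate the conditions. The helper names, the `BLOCKING_KEYS` list, and the handling of `size` and boolean `track_total_hits` values are my assumptions, not existing OpenSearch code.

```python
from copy import deepcopy
from typing import Any, Dict

DEFAULT_TRACK_TOTAL_HITS_UP_TO = 10_000
# Request features that make early termination unsafe (assumed, non-exhaustive).
BLOCKING_KEYS = ("aggs", "aggregations", "sort", "search_after")


def is_filter_only_bool(query: Dict[str, Any]) -> bool:
    """True for a top-level bool query with filter (and optionally must_not) only."""
    bool_query = query.get("bool")
    if not isinstance(bool_query, dict):
        return False
    has_filter = bool(bool_query.get("filter"))
    has_scoring = bool(bool_query.get("must")) or bool(bool_query.get("should"))
    return has_filter and not has_scoring


def with_implied_terminate_after(request: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the request, adding terminate_after when it looks safe."""
    request = deepcopy(request)
    if "terminate_after" in request:
        return request  # respect an explicit user setting
    if any(key in request for key in BLOCKING_KEYS) or request.get("from", 0) > 0:
        return request
    track = request.get("track_total_hits", DEFAULT_TRACK_TOTAL_HITS_UP_TO)
    # True means exact counts are wanted; False is left alone in this sketch.
    if isinstance(track, bool) or track <= 0:
        return request
    if request.get("size", 10) > track:
        return request  # stopping at the threshold would drop requested hits
    if is_filter_only_bool(request.get("query", {})):
        request["terminate_after"] = track
    return request
```

In OpenSearch itself the same decision would be made after query rewriting and before the collector is built, so it would also see clauses that were rewritten from `must` to `filter`.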

I ran a benchmark testing the same all-filter queries with and without `terminate_after` and found some good speedups. Details in the Additional Context section. 

### Related component

Search:Performance

### Describe alternatives you've considered

This could also be implemented at the Lucene level, but it might be trickier, since I think there's no shared `SearchContext`-type object there, and it seems hard to get queries to talk to the parts of the code that build the collectors.

### Additional context

Benchmark results on http_logs are below. I haven't implemented this suggestion yet; I'm just running some all-filter queries in OSB ([branch link](https://github.com/peteralfonsi/opensearch-benchmark-workloads/tree/terminate-after-testing)) and explicitly setting `terminate_after` on some of them. I also ran some where `terminate_after` is 1k instead of 10k (the default `trackTotalHitsUpTo`) to see whether that would have a significantly larger impact.

| Query | Current p50 (ms) | p50 (ms) with terminate_after=10k | Speedup (10k) | p50 (ms) with terminate_after=1k | Speedup (1k) |
|--|--|--|--|--|--|
| `status` matches 200 & `request` matches "images" | 851 | 12.6 | 68x | 10.5 | 81x | 
| `status` matches 404 & `request` matches "images" | 40.7 | 15.5 | 2.6x | 12.4 | 3.3x | 
| `@timestamp` between 6/10-6/13 & `request` matches "images" | 454 | 10.9 | 42x | 10.6 | 43x | 
| `request.raw` in list of top 10 terms & `@timestamp` between 6/10-6/13 | 216 | 9.2 | 23x | 8.1 | 27x | 

We can see the speedup is greatest when the query matches a lot of docs, like status=200.

Using 1k instead of 10k helps a bit, but clearly the main effect comes from enabling `terminate_after` at all.
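For reference, the speedup columns are just the current p50 divided by the p50 with `terminate_after`. A small sketch that recomputes them from the table values (the variable and function names are made up for illustration):

```python
from typing import Dict, Tuple

# (current p50, p50 with terminate_after=10k, p50 with terminate_after=1k), in ms
P50_MS: Dict[str, Tuple[float, float, float]] = {
    "status=200 & request=images": (851, 12.6, 10.5),
    "status=404 & request=images": (40.7, 15.5, 12.4),
    "@timestamp range & request=images": (454, 10.9, 10.6),
    "request.raw top-10 terms & @timestamp range": (216, 9.2, 8.1),
}


def speedup(baseline_ms: float, new_ms: float) -> float:
    """Ratio of baseline latency to new latency (higher is better)."""
    return baseline_ms / new_ms


for name, (current, with_10k, with_1k) in P50_MS.items():
    print(f"{name}: {speedup(current, with_10k):.1f}x (10k), {speedup(current, with_1k):.1f}x (1k)")
```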