[Feature Request] Auto Select Ordinals cardinality collector for high cardinality queries #19260

@anandpatel9998

Description

Is your feature request related to a problem? Please describe

The current cardinality aggregator selects DirectCollector over OrdinalsCollector whenever the memory overhead of OrdinalsCollector, relative to DirectCollector, is too high. Because the check is relative rather than absolute, DirectCollector ends up being selected for high-cardinality aggregation queries, even though it is slower than OrdinalsCollector. This default leads to higher search latency even when the OpenSearch process has enough available memory to use OrdinalsCollector for faster query performance.

Describe the solution you'd like

Ideally, the collector would be chosen by comparing required memory against available memory: if the required memory is less than x% of available memory, use OrdinalsCollector. As far as I understand, OpenSearch does not have a metric for available heap after GC. Since available memory is not known, we can use total memory as a proxy metric and select OrdinalsCollector if the required memory is less than x% of total memory.
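A minimal sketch of the proposed selection rule, to make the threshold concrete. It is written in Python for brevity and is not OpenSearch code: the function names, the one-bit-per-ordinal memory model, and the 1% default for x are all assumptions chosen for illustration.

```python
def ordinals_required_bytes(max_ord: int) -> int:
    """Assumed memory model: one bit per ordinal, rounded up to whole bytes."""
    return (max_ord + 7) // 8


def choose_collector(max_ord: int, total_heap_bytes: int,
                     max_heap_fraction: float = 0.01) -> str:
    """Pick a collector by comparing required memory to a share of total heap.

    max_heap_fraction is the "x%" from the proposal (hypothetical default).
    Total heap is used as a proxy because available heap is not known.
    """
    required = ordinals_required_bytes(max_ord)
    budget = int(total_heap_bytes * max_heap_fraction)
    return "ordinals" if required < budget else "direct"
```

For example, under these assumptions `choose_collector(10_000_000, 8 * 1024**3)` picks `"ordinals"`, because about 1.25 MB of required memory is well under 1% of an 8 GiB heap.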

Related component

Search:Aggregations

Describe alternatives you've considered

As an alternative, an execution hint was added as an input parameter so that a customer can request the OrdinalsCollector explicitly. This has two disadvantages:

  1. The execution hint has to be chosen by the customer.
  2. SQL plugin queries cannot pass such a parameter.
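For reference, a request that passes the hint might look like the sketch below. The parameter name `execution_hint` and the value `"ordinals"` are assumptions based on the description above and should be checked against the cardinality aggregation documentation; the aggregation name `unique_values` and the field `user_id` are placeholders.

```python
import json


def cardinality_request(field, execution_hint=None):
    """Build a search body with a cardinality aggregation.

    The execution_hint key is an assumed parameter name; it is only
    included when the caller supplies a value.
    """
    agg = {"field": field}
    if execution_hint is not None:
        agg["execution_hint"] = execution_hint
    return {"size": 0, "aggs": {"unique_values": {"cardinality": agg}}}


if __name__ == "__main__":
    # Print the body a client would send to the _search endpoint.
    print(json.dumps(cardinality_request("user_id", "ordinals"), indent=2))
```

The point of the first disadvantage is visible here: the caller has to know when `"ordinals"` is safe to request, which is exactly the decision this issue proposes to automate.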

Another alternative is to always use OrdinalsCollector. That is not feasible when the number of buckets is very high: with a bucket count near the default maximum (about 65k), the total memory required by the query may exceed available memory.
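A rough back-of-the-envelope calculation shows how quickly this grows. It assumes each bucket holds its own one-bit-per-ordinal structure and takes the 65k limit as 65,535 buckets; both are simplifying assumptions, and the function name is hypothetical.

```python
def total_ordinals_bytes(max_ord: int, num_buckets: int) -> int:
    """Estimated memory if every bucket tracks max_ord ordinals, one bit each."""
    per_bucket = (max_ord + 7) // 8  # round bits up to whole bytes
    return per_bucket * num_buckets


# 10 million unique terms across 65,535 buckets (assumed figures).
estimate = total_ordinals_bytes(10_000_000, 65_535)
print(f"{estimate / 1e9:.1f} GB")
```

Under these assumptions a single bucket needs about 1.25 MB, but 65,535 buckets need roughly 82 GB, far more than a typical heap, which is why an unconditional switch to OrdinalsCollector is not safe.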

Additional context

No response

Metadata

Assignees

No one assigned

    Labels

    Search:Aggregations
    discuss (Issues intended to help drive brainstorming and decision making)
    enhancement (Enhancement or improvement to existing feature or request)

    Type

    No type

    Projects

    Status

    🆕 New

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests