Description
Is your feature request related to a problem? Please describe.
We are planning to upgrade from Elasticsearch 7.7.1 to OpenSearch 1.2.4. When we compared the performance of the two, cardinality queries on keyword fields were about 50% slower in OpenSearch 1.2.4, so we could not upgrade.
The degradation is present from OpenSearch 1.1.0 onwards.
Below is the code snippet that runs slowly in OpenSearch 1.1.0.
// With OpenSearch 1.0.1: 240 requests/second
// With OpenSearch 1.1.0: 97 requests/second
public static SearchRequestBuilder getSearchRequest1(TransportClient client, String index, String randomValue) {
    // Match every document except the one whose __id.keyword equals randomValue.
    QueryBuilder qb = QueryBuilders.boolQuery().mustNot(QueryBuilders.termQuery("__id.keyword", randomValue));
    // Count the distinct __id.keyword values among the matching documents.
    CardinalityAggregationBuilder agg = AggregationBuilders
        .cardinality("somename")
        .field("__id.keyword");
    return client.prepareSearch(index).setQuery(qb).addAggregation(agg);
}
This degradation is caused by the Lucene upgrade from 8.8.2 to 8.9.0 (commit e153629).
A Lucene developer said that it results from an enhancement in Lucene and that it has to be fixed in OpenSearch.
Lucene ticket that I filed: https://issues.apache.org/jira/browse/LUCENE-10509
Describe the solution you'd like
According to Lucene developer Adrien Grand:
The cardinality aggregation performs value lookups on each document. OpenSearch should change the way cardinality aggregations run to collect matching ordinals into a bitset first, and only look up values once the entire segment has been collected. This should address the performance problem and will likely make the cardinality aggregation faster than it was before Lucene 8.9.
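To make the suggestion concrete, here is a minimal, self-contained sketch of the two collection strategies. It is a toy model, not OpenSearch or Lucene code: the `Segment` class, its `lookupOrd` method and the `lookups` counter are hypothetical stand-ins for a segment's doc values, and a `HashSet` stands in for whatever structure the real aggregation uses to track distinct values.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

class OrdinalCardinalitySketch {

    /** Hypothetical stand-in for one segment's keyword doc values. */
    static final class Segment {
        final String[] dictionary;  // ordinal -> term value (distinct terms of the segment)
        final int[] docOrdinals;    // docId -> ordinal of that document's value
        int lookups = 0;            // number of ordinal -> value resolutions performed

        Segment(String[] dictionary, int[] docOrdinals) {
            this.dictionary = dictionary;
            this.docOrdinals = docOrdinals;
        }

        /** Models the comparatively expensive value lookup for an ordinal. */
        String lookupOrd(int ord) {
            lookups++;
            return dictionary[ord];
        }
    }

    /** Per-document strategy: resolve the value for every matching document. */
    static void collectPerDocument(Segment segment, int[] matchingDocs, Set<String> distinct) {
        for (int doc : matchingDocs) {
            distinct.add(segment.lookupOrd(segment.docOrdinals[doc]));
        }
    }

    /** Deferred strategy: record ordinals in a bitset, resolve each seen ordinal once. */
    static void collectDeferred(Segment segment, int[] matchingDocs, Set<String> distinct) {
        BitSet seenOrds = new BitSet(segment.dictionary.length);
        for (int doc : matchingDocs) {
            seenOrds.set(segment.docOrdinals[doc]);  // cheap: no value lookup here
        }
        // After the whole segment is collected, look up each distinct ordinal once.
        for (int ord = seenOrds.nextSetBit(0); ord >= 0; ord = seenOrds.nextSetBit(ord + 1)) {
            distinct.add(segment.lookupOrd(ord));
        }
    }

    public static void main(String[] args) {
        String[] dictionary = {"a", "b", "c"};
        int[] docOrdinals = {0, 1, 1, 2, 0, 1};
        int[] matchingDocs = {0, 1, 2, 4, 5};  // doc 3 is excluded by the query

        Segment perDoc = new Segment(dictionary, docOrdinals);
        Set<String> perDocValues = new HashSet<>();
        collectPerDocument(perDoc, matchingDocs, perDocValues);

        Segment deferred = new Segment(dictionary, docOrdinals);
        Set<String> deferredValues = new HashSet<>();
        collectDeferred(deferred, matchingDocs, deferredValues);

        System.out.println("per-document: cardinality=" + perDocValues.size() + ", lookups=" + perDoc.lookups);
        System.out.println("deferred:     cardinality=" + deferredValues.size() + ", lookups=" + deferred.lookups);
    }
}
```

In this toy model both strategies report the same cardinality, but the deferred one performs one lookup per distinct ordinal instead of one per matching document. The values still have to be resolved at the end of each segment because ordinals are only meaningful within a single segment.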
Describe alternatives you've considered
None
Additional context