Performance degradation in OpenSearch 1.1.0 due to Lucene 8.9.0 #2820

@rtiruveedulaarkin

Description

Is your feature request related to a problem? Please describe.
We are planning to upgrade from Elasticsearch 7.7.1 to OpenSearch 1.2.4 and have compared the performance of the two. For cardinality aggregations on keyword fields, performance in OpenSearch 1.2.4 is degraded by 50%, so we cannot upgrade to OpenSearch.

The performance degradation is observed from the OpenSearch 1.1.0 release onwards.

Below is the code snippet that runs slower in OpenSearch 1.1.0.

    // with opensearch 1.0.1: 240 requests/second
    // with opensearch 1.1.0:  97 requests/second
    public static SearchRequestBuilder getSearchRequest1(TransportClient client, String index, String randomValue) {
        QueryBuilder qb = QueryBuilders.boolQuery().mustNot(QueryBuilders.termQuery("__id.keyword", randomValue));
        CardinalityAggregationBuilder agg = AggregationBuilders
                .cardinality("somename")
                .field("__id.keyword");
        return client.prepareSearch(index).setQuery(qb).addAggregation(agg);
    }

This degradation is caused by the Lucene upgrade from 8.8.2 to 8.9.0. The commit is e153629.

A Lucene developer said this is a side effect of an enhancement in Lucene and that it has to be fixed in OpenSearch.

Lucene ticket that I have filed: https://issues.apache.org/jira/browse/LUCENE-10509

Describe the solution you'd like

According to Lucene developer Adrien Grand:

The cardinality aggregation performs value lookups on each document. OpenSearch should change the way cardinality aggregations run to collect matching ordinals into a bitset first, and only look up values once the entire segment has been collected. This should address the performance problem and will likely make the cardinality aggregation faster than it was before Lucene 8.9.
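To make the suggestion concrete, here is a minimal, self-contained sketch of the two collection strategies. It is not OpenSearch or Lucene code: `Segment`, `termDict`, `docToOrd`, and `lookupOrd` are hypothetical stand-ins for a segment's ordinal-based doc values, and a plain `HashSet` stands in for whatever distinct-count structure the real aggregation uses. The `lookups` counter only illustrates how many value lookups each strategy performs.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

public class DeferredOrdinalLookup {

    /** Hypothetical stand-in for one segment's ordinal-based doc values. */
    public static final class Segment {
        final String[] termDict; // ordinal -> term, unique within the segment
        final int[] docToOrd;    // docId -> ordinal of that document's value
        long lookups = 0;        // counts (potentially expensive) value lookups

        public Segment(String[] termDict, int[] docToOrd) {
            this.termDict = termDict;
            this.docToOrd = docToOrd;
        }

        String lookupOrd(int ord) {
            lookups++;
            return termDict[ord];
        }
    }

    /** Per-document strategy: resolves the value for every matching document. */
    public static int cardinalityPerDoc(Segment[] segments, int[][] matchingDocs) {
        Set<String> distinct = new HashSet<>();
        for (int s = 0; s < segments.length; s++) {
            for (int doc : matchingDocs[s]) {
                distinct.add(segments[s].lookupOrd(segments[s].docToOrd[doc]));
            }
        }
        return distinct.size();
    }

    /** Deferred strategy: collect ordinals into a bitset, resolve each ordinal once per segment. */
    public static int cardinalityDeferred(Segment[] segments, int[][] matchingDocs) {
        Set<String> distinct = new HashSet<>();
        for (int s = 0; s < segments.length; s++) {
            Segment seg = segments[s];
            BitSet seenOrds = new BitSet(seg.termDict.length);
            for (int doc : matchingDocs[s]) {
                seenOrds.set(seg.docToOrd[doc]); // no value lookup while collecting
            }
            // The segment is fully collected; now look up each distinct ordinal once.
            for (int ord = seenOrds.nextSetBit(0); ord >= 0; ord = seenOrds.nextSetBit(ord + 1)) {
                distinct.add(seg.lookupOrd(ord));
            }
        }
        return distinct.size();
    }

    /** Builds two small example segments; ordinals are per-segment, so "b" appears in both. */
    public static Segment[] exampleSegments() {
        return new Segment[] {
            new Segment(new String[] {"a", "b", "c"}, new int[] {0, 1, 1, 2, 0, 1}),
            new Segment(new String[] {"b", "d"}, new int[] {0, 0, 1, 1})
        };
    }

    public static long totalLookups(Segment[] segments) {
        long total = 0;
        for (Segment seg : segments) {
            total += seg.lookups;
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] allDocs = { {0, 1, 2, 3, 4, 5}, {0, 1, 2, 3} };

        Segment[] perDoc = exampleSegments();
        System.out.println("per-doc:  cardinality=" + cardinalityPerDoc(perDoc, allDocs)
                + " lookups=" + totalLookups(perDoc));

        Segment[] deferred = exampleSegments();
        System.out.println("deferred: cardinality=" + cardinalityDeferred(deferred, allDocs)
                + " lookups=" + totalLookups(deferred));
    }
}
```

Both strategies return the same cardinality, but the deferred one performs one lookup per distinct ordinal in each segment rather than one per matching document. Ordinals are only meaningful within a single segment, which is why the bitset is rebuilt per segment and values still have to be resolved before results from different segments are merged.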

Describe alternatives you've considered
None

Additional context
