Composite aggs seems to sort too slowly with filter queries #70035

@benwtrent

Piggy-backing off of previous work: #28745

During the work in #69970 some troubling performance data has reared its ugly head.

Given the following query:

{"bool":{"filter":[{"term":{"event.dataset":"nginx.access"}}]}}

The following composite agg moves at an almost glacial pace:

"aggs": {
  "buckets": {
    "composite": {
      "size": 1000,
      "sources": [
        {
          "date": {
            "date_histogram": {
              "field": "@timestamp",
              "fixed_interval": "15m"
            }
          }
        },
        {
          "source.address": {
            "terms": {
              "field": "source.address"
            }
          }
        }
      ]
    },
    "aggregations": {
      "@timestamp": {
        "max": {
          "field": "@timestamp"
        }
      }
    }
  }
}
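
For anyone wanting to reproduce this, here is a minimal Python sketch of how the query and the composite agg above fit into one search body, and how each follow-up page is requested. The function name `build_composite_request` and the top-level `"size": 0` are assumptions for illustration; the `after` parameter is the standard composite pagination key, filled from the `after_key` of the previous response. This is not the datafeed's actual code.

```python
from typing import Any, Dict, Optional


def build_composite_request(after_key: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build one page of the search body (hypothetical helper, not datafeed code)."""
    composite: Dict[str, Any] = {
        "size": 1000,
        "sources": [
            {"date": {"date_histogram": {"field": "@timestamp", "fixed_interval": "15m"}}},
            {"source.address": {"terms": {"field": "source.address"}}},
        ],
    }
    if after_key is not None:
        # Resume after the last bucket of the previous page.
        composite["after"] = after_key
    return {
        "size": 0,  # assumption: only buckets are needed, not hits
        "query": {"bool": {"filter": [{"term": {"event.dataset": "nginx.access"}}]}},
        "aggs": {
            "buckets": {
                "composite": composite,
                "aggregations": {"@timestamp": {"max": {"field": "@timestamp"}}},
            }
        },
    }


first_page = build_composite_request()
# The after_key values below are made up purely to show the shape of a follow-up page.
next_page = build_composite_request({"date": 1485907200000, "source.address": "10.0.0.1"})
```

Each search returns at most 1000 composite buckets, so one 15m date bucket with many distinct `source.address` values takes several searches to page through.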

Here are some doc stats:

total_hits: 14479391
cardinality(source.address): 851502
max_timestamp: "2017-03-11T23:59:56.537Z"
min_timestamp: "2017-02-01T00:00:00.189Z"

In datafeeds, we "chunk" the search when scrolling through data. Consequently, we hit every document and make multiple queries. We chunk because sorting by timestamp can be costly when a single search hits many docs.

So, our scrolling datafeed had the following performance:

search_count | 16,649
bucket_count | 935
average_search_time_per_bucket_ms | 81.901
~4.6 ms per search, calculated as (bucket_count * average_search_time_per_bucket_ms) / search_count

Job finished in ~6 minutes

Doing composite agg without chunking:
🐌 🐌 🐌

search_count | 3,795
bucket_count | 935
average_search_time_per_bucket_ms | 2,705.224
~666.5 ms per search

🐌 🐌 🐌
Job finished in 40+ minutes
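
To make the per-search figures easy to re-check, here is a small Python sketch of the arithmetic, using only the numbers reported above. The helper name `ms_per_search` is made up for illustration.

```python
def ms_per_search(bucket_count: int, avg_ms_per_bucket: float, search_count: int) -> float:
    """Average time per search: total search time divided by the number of searches."""
    return (bucket_count * avg_ms_per_bucket) / search_count


# Numbers copied from the two runs reported above.
scroll_ms = ms_per_search(935, 81.901, 16_649)
composite_ms = ms_per_search(935, 2_705.224, 3_795)

print(f"scroll:    {scroll_ms:.1f} ms per search")
print(f"composite: {composite_ms:.1f} ms per search")
print(f"slowdown:  {composite_ms / scroll_ms:.0f}x per search")
```

The composite run makes far fewer searches, but each search is slower by roughly two orders of magnitude.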

It seems to me that the composite agg is doing WAY too much work. I think it may be sorting WAY too many documents given the sources.

As an experiment, I added time-based query chunking in 25264688 ms intervals (roughly 7 hours, calculated based on term cardinality, count, and total time range):
🔥 🔥 🔥

search_count | 4,124
bucket_count | 935
average_search_time_per_bucket_ms | 112.775
~25 ms per search 

🔥 🔥 🔥
Job finished in ~4 minutes
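
A rough Python sketch of what time-based chunking of the filter query could look like, using the interval and the min/max timestamps from the doc stats above. The helper names, and the choice of half-open `gte`/`lt` windows, are assumptions for illustration, not the code used in the experiment. Note that the windows here are not aligned to the 15m histogram buckets, which a real implementation would likely need to handle.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Iterator, Tuple

CHUNK_MS = 25264688                   # interval used in the experiment
MIN_TS = "2017-02-01T00:00:00.189Z"   # min_timestamp from the doc stats
MAX_TS = "2017-03-11T23:59:56.537Z"   # max_timestamp from the doc stats
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(ts: str) -> int:
    """Parse an ISO-8601 UTC timestamp into integer epoch milliseconds."""
    dt = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    return (dt - EPOCH) // timedelta(milliseconds=1)


def time_chunks(start_ms: int, end_ms: int, chunk_ms: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open [gte, lt) windows that together cover [start_ms, end_ms]."""
    lower = start_ms
    while lower <= end_ms:
        upper = min(lower + chunk_ms, end_ms + 1)
        yield lower, upper
        lower = upper


def chunked_query(gte_ms: int, lt_ms: int) -> Dict[str, Any]:
    """Add a @timestamp range filter to the original filter query (hypothetical helper)."""
    return {
        "bool": {
            "filter": [
                {"term": {"event.dataset": "nginx.access"}},
                {"range": {"@timestamp": {"gte": gte_ms, "lt": lt_ms, "format": "epoch_millis"}}},
            ]
        }
    }


chunks = list(time_chunks(to_epoch_ms(MIN_TS), to_epoch_ms(MAX_TS), CHUNK_MS))
queries = [chunked_query(gte, lt) for gte, lt in chunks]
print(f"{len(chunks)} time windows")
```

The composite agg is then paged to completion inside each window before moving to the next, so every search only has to consider the documents in that window.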

Datafeeds (and transforms) will ALWAYS be a filter based query (ignoring scores). These queries are user provided, so they could definitely be anything. But it seems to me that there is still room for improvement in the composite agg.

Labels: >enhancement, Team:Analytics