Closed
Labels
Search:Aggregations, bug (Something isn't working), enhancement (Enhancement or improvement to existing feature or request), v2.16.0 (Issues and PRs related to version 2.16.0)
Description
Describe the bug
If your index has a long field with extreme min and max values, attempting a histogram aggregation on that field with a small interval value crashes the node instance with an OOM.
Related component
Search:Aggregations
To Reproduce
- Use the default docker-compose provided on the OpenSearch site (it's using :latest, which at the time of writing is 2.15.0)
- Add 2 documents:
curl -k -XPUT -u "admin:$OPENSEARCH_INITIAL_ADMIN_PASSWORD" \
'https://localhost:9200/sample-index/_doc/1' \
-H 'Content-Type: application/json' \
-d '{"some_value": 1}'
curl -k -XPUT -u "admin:$OPENSEARCH_INITIAL_ADMIN_PASSWORD" \
'https://localhost:9200/sample-index/_doc/2' \
-H 'Content-Type: application/json' \
  -d '{"some_value": 1234567890}'
- Attempt a histogram with a sufficiently large interval:
curl -k -XGET -u "admin:$OPENSEARCH_INITIAL_ADMIN_PASSWORD" \
'https://localhost:9200/sample-index/_search' \
-H 'Content-Type: application/json' \
  -d '{"size":0, "aggs": { "test": { "histogram": { "field": "some_value", "interval": 300000000 }}}}'
- OpenSearch correctly (I think) returns the buckets:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"aggregations": {
"test": {
"buckets": [
{
"key": 0,
"doc_count": 1
},
{
"key": 300000000,
"doc_count": 0
},
{
"key": 600000000,
"doc_count": 0
},
{
"key": 900000000,
"doc_count": 0
},
{
"key": 1200000000,
"doc_count": 1
}
]
}
}
}
- Change the interval value to 1000:
curl -k -XGET -u "admin:$OPENSEARCH_INITIAL_ADMIN_PASSWORD" \
'https://localhost:9200/sample-index/_search' \
-H 'Content-Type: application/json' \
  -d '{"size":0, "aggs": { "test": { "histogram": { "field": "some_value", "interval": 1000 }}}}'
- OpenSearch correctly responds with:
{
"error": {
"root_cause": [],
"type": "search_phase_execution_exception",
"reason": "",
"phase": "fetch",
"grouped": true,
"failed_shards": [],
"caused_by": {
"type": "too_many_buckets_exception",
"reason": "Trying to create too many buckets. Must be less than or equal to: [65535] but was [1234568]. This limit can be set by changing the [search.max_buckets] cluster level setting.",
"max_buckets": 65535
}
},
"status": 503
}
- Change the interval to 100:
curl -k -XGET -u "admin:$OPENSEARCH_INITIAL_ADMIN_PASSWORD" \
'https://localhost:9200/sample-index/_search' \
-H 'Content-Type: application/json' \
  -d '{"size":0, "aggs": { "test": { "histogram": { "field": "some_value", "interval": 100 }}}}'
- OpenSearch responds with something like:
curl: (56) OpenSSL SSL_read: error:0A000126:SSL routines::unexpected eof while reading, errno 0, because opensearch-node1 just died:
opensearch-node1 | [2024-06-26T12:12:51,906][INFO ][o.o.m.j.JvmGcMonitorService] [opensearch-node1] [gc][1318] overhead, spent [366ms] collecting in the last [1.1s]
opensearch-node1 | java.lang.OutOfMemoryError: Java heap space
opensearch-node1 | Dumping heap to data/java_pid1.hprof ...
opensearch-node1 | Unable to create data/java_pid1.hprof: File exists
opensearch-node1 | [2024-06-26T12:12:52,440][ERROR][o.o.b.OpenSearchUncaughtExceptionHandler] [opensearch-node1] fatal error in thread [opensearch[opensearch-node1][search][T#24]], exiting
opensearch-node1 | java.lang.OutOfMemoryError: Java heap space
opensearch-node1 | at java.base/java.util.Arrays.copyOf(Arrays.java:3482) ~[?:?]
opensearch-node1 | at java.base/java.util.ArrayList.grow(ArrayList.java:237) ~[?:?]
opensearch-node1 | at java.base/java.util.ArrayList.grow(ArrayList.java:244) ~[?:?]
opensearch-node1 | at java.base/java.util.ArrayList.add(ArrayList.java:515) ~[?:?]
opensearch-node1 | at java.base/java.util.ArrayList$ListItr.add(ArrayList.java:1150) ~[?:?]
opensearch-node1 | at org.opensearch.search.aggregations.bucket.histogram.InternalHistogram.addEmptyBuckets(InternalHistogram.java:416) ~[opensearch-2.15.0.jar:2.15.0]
opensearch-node1 | at org.opensearch.search.aggregations.bucket.histogram.InternalHistogram.reduce(InternalHistogram.java:436) ~[opensearch-2.15.0.jar:2.15.0]
opensearch-node1 | at org.opensearch.search.aggregations.InternalAggregations.reduce(InternalAggregations.java:290) ~[opensearch-2.15.0.jar:2.15.0]
opensearch-node1 | at org.opensearch.search.aggregations.InternalAggregations.topLevelReduce(InternalAggregations.java:225) ~[opensearch-2.15.0.jar:2.15.0]
opensearch-node1 | at org.opensearch.action.search.SearchPhaseController.reduceAggs(SearchPhaseController.java:557) ~[opensearch-2.15.0.jar:2.15.0]
opensearch-node1 | at org.opensearch.action.search.SearchPhaseController.reducedQueryPhase(SearchPhaseController.java:528) ~[opensearch-2.15.0.jar:2.15.0]
opensearch-node1 | at org.opensearch.action.search.QueryPhaseResultConsumer.reduce(QueryPhaseResultConsumer.java:153) ~[opensearch-2.15.0.jar:2.15.0]
opensearch-node1 | at org.opensearch.action.search.FetchSearchPhase.innerRun(FetchSearchPhase.java:136) ~[opensearch-2.15.0.jar:2.15.0]
opensearch-node1 | at org.opensearch.action.search.FetchSearchPhase$1.doRun(FetchSearchPhase.java:122) ~[opensearch-2.15.0.jar:2.15.0]
opensearch-node1 | at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) ~[opensearch-2.15.0.jar:2.15.0]
opensearch-node1 | at org.opensearch.threadpool.TaskAwareRunnable.doRun(TaskAwareRunnable.java:78) ~[opensearch-2.15.0.jar:2.15.0]
opensearch-node1 | at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) ~[opensearch-2.15.0.jar:2.15.0]
opensearch-node1 | at org.opensearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:59) ~[opensearch-2.15.0.jar:2.15.0]
opensearch-node1 | at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:941) ~[opensearch-2.15.0.jar:2.15.0]
opensearch-node1 | at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) ~[opensearch-2.15.0.jar:2.15.0]
opensearch-node1 | at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
opensearch-node1 | at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
opensearch-node1 | at java.base/java.lang.Thread.runWith(Thread.java:1596) ~[?:?]
opensearch-node1 | at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
opensearch-node1 | fatal error in thread [opensearch[opensearch-node1][search][T#24]], exiting
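For reference, the bucket counts involved can be estimated from the two documents' min/max values and the interval. A rough sketch of the arithmetic (the estimate happens to match the [1234568] reported in the error above):

```python
import math

def histogram_bucket_count(min_value: int, max_value: int, interval: int) -> int:
    """Estimate how many histogram buckets span [min_value, max_value]."""
    first_key = math.floor(min_value / interval) * interval
    last_key = math.floor(max_value / interval) * interval
    return (last_key - first_key) // interval + 1

# Values from the two indexed documents.
lo, hi = 1, 1234567890

histogram_bucket_count(lo, hi, 300000000)  # 5        -> returned fine
histogram_bucket_count(lo, hi, 1000)       # 1234568  -> rejected by max_buckets
histogram_bucket_count(lo, hi, 100)        # 12345679 -> OOMs during reduce
```

So at interval 100 the reduce phase tries to materialize over 12 million mostly empty buckets in memory, which blows a 512m heap before any limit check runs.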
Expected behavior
I would have expected (or at least liked, if possible) to get the same too_many_buckets_exception instead of a node crash.
Additional Details
Plugins
n/a
Screenshots
n/a
Host/Environment (please complete the following information):
- OS: Ubuntu
- Version 22.04.4
Additional context
- the OpenSearch version is 2.15.0; no changes were made to the docker-compose.yml
Workarounds
- adding "min_doc_count": 1 prevents the crash (and it returns 2 buckets, key: 0 and key: 1234567800); this means clients have to reconstruct the rest of the empty buckets themselves (not always possible for my particular case, sadly)
- changing the heap from 512m to 1024m, for example, prevents the crash for "interval": 100, but it still crashes for "interval": 10
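For the first workaround, the client-side reconstruction could look like the sketch below (a hypothetical helper, assuming integer keys sorted ascending). Note it materializes the same huge list the server refused to build, which is exactly why it isn't always viable:

```python
def fill_empty_buckets(buckets, interval):
    """Reconstruct the empty buckets omitted by "min_doc_count": 1.
    `buckets` is the aggregation's buckets list as returned by OpenSearch."""
    by_key = {b["key"]: b["doc_count"] for b in buckets}
    filled = []
    key = buckets[0]["key"]
    while key <= buckets[-1]["key"]:
        # Buckets absent from the response had no documents.
        filled.append({"key": key, "doc_count": by_key.get(key, 0)})
        key += interval
    return filled
```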