[BUG] Missing inner hits in top hits of an aggregation results since upgrade to 2.13.0 #13467

@martijnbolhuis

Describe the bug

I have a query on the nested field names.full_name, with inner hits enabled on that query. I have also added a terms aggregation on the (non-nested) field list_id, and I'm using a top_hits sub-aggregation to include results per bucket of the aggregation.

In OpenSearch version 2.12.0, the top hits included the inner hits (on names.full_name) but in version 2.13.0 these inner hits are missing.

Related component

Other

To Reproduce

The following script reproduces the problem:

# Delete any existing index, then create the index with its mapping
curl -X DELETE "http://localhost:9200/names-test?pretty"
curl -X PUT "http://localhost:9200/names-test?pretty"  -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "properties": {
      "list_id": {
        "type": "integer"
      },
      "names": {
        "type": "nested",
        "properties": {
          "full_name": {
            "type": "text"
          }
        }
      }
    }
  }
}
'

# Insert documents into the index
curl -X PUT "http://localhost:9200/names-test/_doc/1?refresh&pretty"  -H 'Content-Type: application/json' -d'
{
  "list_id": 1,
  "names": [
    {
      "full_name": "John Doe"
    },
    {
      "full_name": "John Micheal Doe"
    }
  ]
}
'

curl -X PUT "http://localhost:9200/names-test/_doc/2?refresh&pretty"  -H 'Content-Type: application/json' -d'
{
  "list_id": 2,
  "names": [
    {
      "full_name": "Jane Doe"
    },
    {
      "full_name": "Jane Michelle Doe"
    }
  ]
}
'

# Perform a query
curl -X POST "http://localhost:9200/names-test/_search?pretty"  -H 'Content-Type: application/json' -d'
{
  "query": {
    "nested": {
      "path": "names",
      "query": {
        "match": { "names.full_name": "Doe" }
      },
      "inner_hits": {}
    }
  },
  "size": 0,
  "aggs": {
    "lists": {
      "terms": {
        "field": "list_id"
      },
      "aggs": {
        "top_result": {
          "top_hits": {
            "size": 10
          }
        }
      }
    }
  }
}
'

Expected behavior

The following is the expected result, which is what OpenSearch 2.12.0 returns. Each top hit includes inner_hits.

{
  "took" : 20,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "lists" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 1,
          "doc_count" : 1,
          "top_result" : {
            "hits" : {
              "total" : {
                "value" : 1,
                "relation" : "eq"
              },
              "max_score" : 0.10607058,
              "hits" : [
                {
                  "_index" : "names-test",
                  "_id" : "1",
                  "_score" : 0.10607058,
                  "_source" : {
                    "list_id" : 1,
                    "names" : [
                      {
                        "full_name" : "John Doe"
                      },
                      {
                        "full_name" : "John Micheal Doe"
                      }
                    ]
                  },
                  "inner_hits" : {
                    "names" : {
                      "hits" : {
                        "total" : {
                          "value" : 2,
                          "relation" : "eq"
                        },
                        "max_score" : 0.11474907,
                        "hits" : [
                          {
                            "_index" : "names-test",
                            "_id" : "1",
                            "_nested" : {
                              "field" : "names",
                              "offset" : 0
                            },
                            "_score" : 0.11474907,
                            "_source" : {
                              "full_name" : "John Doe"
                            }
                          },
                          {
                            "_index" : "names-test",
                            "_id" : "1",
                            "_nested" : {
                              "field" : "names",
                              "offset" : 1
                            },
                            "_score" : 0.09739208,
                            "_source" : {
                              "full_name" : "John Micheal Doe"
                            }
                          }
                        ]
                      }
                    }
                  }
                }
              ]
            }
          }
        },
        {
          "key" : 2,
          "doc_count" : 1,
          "top_result" : {
            "hits" : {
              "total" : {
                "value" : 1,
                "relation" : "eq"
              },
              "max_score" : 0.10607058,
              "hits" : [
                {
                  "_index" : "names-test",
                  "_id" : "2",
                  "_score" : 0.10607058,
                  "_source" : {
                    "list_id" : 2,
                    "names" : [
                      {
                        "full_name" : "Jane Doe"
                      },
                      {
                        "full_name" : "Jane Michelle Doe"
                      }
                    ]
                  },
                  "inner_hits" : {
                    "names" : {
                      "hits" : {
                        "total" : {
                          "value" : 2,
                          "relation" : "eq"
                        },
                        "max_score" : 0.11474907,
                        "hits" : [
                          {
                            "_index" : "names-test",
                            "_id" : "2",
                            "_nested" : {
                              "field" : "names",
                              "offset" : 0
                            },
                            "_score" : 0.11474907,
                            "_source" : {
                              "full_name" : "Jane Doe"
                            }
                          },
                          {
                            "_index" : "names-test",
                            "_id" : "2",
                            "_nested" : {
                              "field" : "names",
                              "offset" : 1
                            },
                            "_score" : 0.09739208,
                            "_source" : {
                              "full_name" : "Jane Michelle Doe"
                            }
                          }
                        ]
                      }
                    }
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
}

The actual result in 2.13.0 is the following, in which the top hits are missing inner_hits:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "lists" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 1,
          "doc_count" : 1,
          "top_result" : {
            "hits" : {
              "total" : {
                "value" : 1,
                "relation" : "eq"
              },
              "max_score" : 0.10607058,
              "hits" : [
                {
                  "_index" : "names-test",
                  "_id" : "1",
                  "_score" : 0.10607058,
                  "_source" : {
                    "list_id" : 1,
                    "names" : [
                      {
                        "full_name" : "John Doe"
                      },
                      {
                        "full_name" : "John Micheal Doe"
                      }
                    ]
                  }
                }
              ]
            }
          }
        },
        {
          "key" : 2,
          "doc_count" : 1,
          "top_result" : {
            "hits" : {
              "total" : {
                "value" : 1,
                "relation" : "eq"
              },
              "max_score" : 0.10607058,
              "hits" : [
                {
                  "_index" : "names-test",
                  "_id" : "2",
                  "_score" : 0.10607058,
                  "_source" : {
                    "list_id" : 2,
                    "names" : [
                      {
                        "full_name" : "Jane Doe"
                      },
                      {
                        "full_name" : "Jane Michelle Doe"
                      }
                    ]
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
}
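
To check a response for this without comparing the JSON by eye, a small helper along these lines could be used. This is only a sketch: the function name top_hits_missing_inner_hits is made up for illustration, and the default aggregation names lists and top_result are simply the ones used in the query above.

```python
def top_hits_missing_inner_hits(response, terms_agg="lists", top_hits_agg="top_result"):
    """Return (bucket key, document _id) pairs whose top hit carries no inner_hits."""
    missing = []
    buckets = response.get("aggregations", {}).get(terms_agg, {}).get("buckets", [])
    for bucket in buckets:
        # Each bucket holds a top_hits result shaped like a regular hits section.
        for hit in bucket.get(top_hits_agg, {}).get("hits", {}).get("hits", []):
            if "inner_hits" not in hit:
                missing.append((bucket["key"], hit["_id"]))
    return missing


# Trimmed-down copy of the 2.13.0 response above (first bucket only).
actual_2_13 = {
    "aggregations": {
        "lists": {
            "buckets": [
                {
                    "key": 1,
                    "doc_count": 1,
                    "top_result": {
                        "hits": {
                            "hits": [
                                {"_index": "names-test", "_id": "1", "_score": 0.10607058}
                            ]
                        }
                    },
                }
            ]
        }
    }
}

print(top_hits_missing_inner_hits(actual_2_13))  # prints [(1, '1')]
```

An empty list would mean every top hit in every bucket has inner_hits, which is what the 2.12.0 response gives.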

Additional Details

Plugins
No plugins (default OpenSearch installation)

Host/Environment (please complete the following information):

  • OS: Arch Linux
  • Version: latest

I'm running OpenSearch from the official Docker image: https://hub.docker.com/layers/opensearchproject/opensearch/2.13.0/images/sha256-00f052502297cbc599af34b93605e1eb485438f0e9670dc8d82a4976da7d3feb?context=explore

Additional context

I set "size": 0 in the main query because I'm not interested in the "regular" hits, only in the aggregated top hits. If I change this to, for example, "size": 100, the "regular" hits do include the inner hits, so that part works as expected. But even with this change, the top hits still do not include the inner hits.
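
The two variants described here differ only in the top-level size. As a rough sketch, both request bodies could be generated like this and sent to the _search endpoint for comparison; build_search_body is a hypothetical helper name, and the body just mirrors the query from the reproduction script.

```python
import json


def build_search_body(size):
    """Build the search body from the reproduction script with a given top-level size."""
    return {
        "query": {
            "nested": {
                "path": "names",
                "query": {"match": {"names.full_name": "Doe"}},
                "inner_hits": {},
            }
        },
        # 0 suppresses the regular hits; 100 returns them alongside the aggregation.
        "size": size,
        "aggs": {
            "lists": {
                "terms": {"field": "list_id"},
                "aggs": {"top_result": {"top_hits": {"size": 10}}},
            }
        },
    }


# Print the "size": 100 variant so it can be passed to curl with -d.
print(json.dumps(build_search_body(100), indent=2))
```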
