Background
For a long time Elasticsearch has been very permissive about JSON documents and has made no distinction between a single value and an array of values in a field. This permissive approach has several downsides:
- Client code and scripts are made more complex. To be robust, code must be written to handle both single-valued fields and arrays of values.
- Kibana does some strange things. For example, Kibana will happily try to "AND" multiple values selected from a bar chart or pie chart, which never makes sense for values taken from a single-valued field. This produces no matches because no document can be OS:ios and OS:android simultaneously.
- Administrators cannot easily "lock down" the mapping. Custom ingest scripts are required to prevent multi-valued documents being added (and ingest scripts can still be circumvented by clients sending documents?).
All of the above is unfortunate because the majority of fields in common use are single-valued. A weblog's fields are a good example (timestamp, IP, OS, user agent, URL, referrer, country etc. are all single values).
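The first two downsides can be made concrete with a small client-side sketch. It is written in Python purely for illustration; the helper names `as_list` and `matches_all_terms` and the sample documents are hypothetical and are not part of any Elasticsearch client.

```python
from typing import Any, Dict, List


def as_list(value: Any) -> List[Any]:
    """Normalize a field that may hold nothing, one value, or an array."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def matches_all_terms(doc: Dict[str, Any], field: str, terms: List[Any]) -> bool:
    """Emulate ANDing several term filters on the same field."""
    values = as_list(doc.get(field))
    return all(term in values for term in terms)


# Made-up sample documents: the same kind of data in two different shapes.
docs = [
    {"OS": "ios"},        # plain single value
    {"OS": ["android"]},  # single value wrapped in an array
]

# Robust client code has to route every read through as_list().
os_values = [as_list(d.get("OS")) for d in docs]

# ANDing two values of a single-valued field can never match a document.
both = [d for d in docs if matches_all_terms(d, "OS", ["ios", "android"])]
```

If a client could learn up front that OS is single-valued, the normalization step would be unnecessary and the AND filter could be refused before it is sent.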
Proposed changes
The solution is a two-pronged approach:
Enforcement: for new indices we can give administrators the option of rejecting documents with multiple values.
Reporting: for both new and old indices we can report whether the index contains only documents with single values.
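The reporting prong can be pictured with a minimal sketch that scans documents and records, per field, whether every document holds at most one value. This is an illustration in Python, not how field caps computes the answer; `value_count`, `single_valued_report` and the sample weblog documents are all assumptions made for the example. It also assumes that a one-element array counts as a single value.

```python
from typing import Any, Dict, Iterable


def value_count(value: Any) -> int:
    """Number of values a field holds in one document."""
    if value is None:
        return 0
    if isinstance(value, list):
        return len(value)
    return 1


def single_valued_report(docs: Iterable[Dict[str, Any]]) -> Dict[str, bool]:
    """Map each field to True if no document holds more than one value for it."""
    report: Dict[str, bool] = {}
    for doc in docs:
        for field, value in doc.items():
            still_single = value_count(value) <= 1
            report[field] = report.get(field, True) and still_single
    return report


# Made-up weblog-style documents; only "tags" is ever multi-valued.
weblogs = [
    {"ip": "10.0.0.1", "OS": "ios", "tags": ["mobile"]},
    {"ip": "10.0.0.2", "OS": "android", "tags": ["mobile", "beta"]},
]
report = single_valued_report(weblogs)
```

A consumer such as a dashboard could read a report like this and avoid offering an AND across values of any field reported as single-valued.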
- Add is_single_valued flag to field caps output which indicates if all documents have single values for a field. Field caps api - report back if fields are single-valued or not. #80730
- Add boolean allowsMultipleValues() method to FieldMapper and remove existing validation code in single-valued fields that is slow. The DocumentParser class should instead assume responsibility for checking that single-valued fields don't receive multiple values.
- Add allow_multiple_values flag to field mappings that can reject documents presenting arrays. New field mapping flag - allow_multiple_values #80289
- [...] allow_multiple_values field mapping is set and we know this is enforced at ingest time
- [...] allow_multiple_values is set to false (using NumericDocValuesField instead of SortedNumericDocValuesField and SortedDocValuesField instead of SortedSetDocValuesField)
- [...] is_single_valued feedback in field-caps (e.g. not ANDing values from this field in filter pills). Mention of related progress here
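The division of responsibility described above, where each mapper only declares whether it allows multiple values and the document parser performs the check in one place, can be sketched as follows. This is a language-neutral illustration in Python, not Elasticsearch's Java code: the classes below merely borrow the names FieldMapper and DocumentParser, and the choice to reject any array (even a one-element one) is an assumption based on the wording "reject documents presenting arrays".

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class FieldMapper:
    """Hypothetical stand-in for a per-field mapper."""
    name: str
    allow_multiple_values: bool = True  # permissive default, as today

    def allows_multiple_values(self) -> bool:
        return self.allow_multiple_values


class DocumentParser:
    """Hypothetical parser that owns the single-value check for every field."""

    def __init__(self, mappers: Dict[str, FieldMapper]) -> None:
        self.mappers = mappers

    def parse(self, doc: Dict[str, Any]) -> Dict[str, Any]:
        for field, value in doc.items():
            mapper = self.mappers.get(field)
            if mapper is None:
                continue  # unmapped fields are out of scope for this sketch
            # Assumption: any array is rejected when the flag is false.
            if isinstance(value, list) and not mapper.allows_multiple_values():
                raise ValueError(f"field [{field}] does not allow multiple values")
        return doc


parser = DocumentParser({
    "OS": FieldMapper("OS", allow_multiple_values=False),
    "tags": FieldMapper("tags"),
})

accepted = parser.parse({"OS": "ios", "tags": ["a", "b"]})

try:
    parser.parse({"OS": ["ios", "android"]})
    rejected = False
except ValueError:
    rejected = True
```

Keeping the check in the parser means individual mappers need no validation code of their own, which is the motivation given for removing the existing slow per-field validation.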