In #357 we added a flag to enable TSDB, which sets index.mode: "time_series". This mode requires a routing_path; when none is configured, Elasticsearch uses the configured keyword dimensions.
When collecting data from unknown sources, the dimensions may not be known beforehand, so it is not possible to rely on the default behavior. In these cases, wildcards in the routing path, together with dynamic mappings, can be used to configure the dimensions.
For example, for Prometheus, an index template could be configured with the following settings to use prometheus.labels.* both as routing path and as dimensions:
{
  ...
  "template": {
    "settings": {
      "index.mode": "time_series",
      "index.routing_path": [
        "prometheus.labels.*"
      ]
    },
    "mappings": {
      "dynamic_templates": [
        {
          "labels": {
            "path_match": "prometheus.labels.*",
            "mapping": {
              "type": "keyword",
              "time_series_dimension": true
            }
          }
        },
        ...
      ]
    }
  }
}
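To illustrate which incoming fields a wildcard such as prometheus.labels.* would select, here is a minimal Python sketch. It uses the standard library's fnmatch as a stand-in for Elasticsearch's own pattern matching, which is an assumption: the two are not guaranteed to behave identically for every pattern. The field names and the matching_fields helper are hypothetical.

```python
from fnmatch import fnmatchcase

# Wildcard patterns, as they would appear in index.routing_path.
ROUTING_PATH = ["prometheus.labels.*"]


def matching_fields(field_names, patterns):
    """Return the field names matched by at least one wildcard pattern."""
    return [
        name
        for name in field_names
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


# Hypothetical field names seen in documents from an unknown source.
incoming = [
    "prometheus.labels.instance",
    "prometheus.labels.job",
    "prometheus.metrics.up",
]

matched = matching_fields(incoming, ROUTING_PATH)
print(matched)
```

Only the label fields are selected; the metric field is left out of the routing path and is not mapped as a dimension.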
To support this, the package spec needs to expose some way to configure the routing path. There are some options for this:
- Option 1: Configure it directly as an Elasticsearch setting. This would allow any value for this setting, but would complicate validation of fields.
  elasticsearch:
    index_mode: "time_series"
    routing_path: ["prometheus.labels.*"]
- Option 2: Configure it as a setting in the field definition. This would be less flexible, but safer for validations.
  - name: prometheus.labels.*
    type: keyword
    routing_path: true
    dimension: true
- Option 3: Implement both previous options. This would be the most flexible option, but also the least safe regarding validations.
- Option 4: Don't add an explicit setting for the routing_path, and configure all dimensions as routing paths. This is what Elasticsearch would do in any case, but we would also include the dynamic mappings with their wildcards.
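As a rough sketch of Option 4, the routing path could be derived from the field definitions by collecting every keyword field marked as a dimension, wildcard names included. The list-of-dicts shape below mirrors the field definition shown in Option 2, but it and the routing_path_from_dimensions helper are assumptions for illustration, not an existing API.

```python
def routing_path_from_dimensions(fields):
    """Collect the names of keyword fields flagged as dimensions."""
    return [
        field["name"]
        for field in fields
        if field.get("dimension") and field.get("type") == "keyword"
    ]


# Hypothetical field definitions, as they might be loaded from a fields file.
fields = [
    {"name": "host.name", "type": "keyword", "dimension": True},
    {"name": "prometheus.labels.*", "type": "keyword", "dimension": True},
    {"name": "prometheus.metrics.value", "type": "double"},
]

derived = routing_path_from_dimensions(fields)
print(derived)
```

With this approach no extra setting is needed in the spec, at the cost of not being able to route on a subset of the dimensions.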
If one of options 1-3 is chosen, we also need to decide what to do with the other dimensions when routing_path is configured. Without a configured routing path, keyword dimensions are used as routing_path; but if a routing path is added, should the other dimensions also be used as routing path? There are two options:
- Option 1: If a routing path is configured, use only the configured fields, and not the dimension fields.
- Option 2: If a routing path is configured, combine it with the rest of the dimension fields.
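The difference between these two options can be sketched in a few lines of Python. The resolve_routing_path function and its combine parameter are hypothetical names used only to compare the behaviors; in both cases the fallback to keyword dimensions applies when no routing path is configured.

```python
def resolve_routing_path(explicit, dimensions, combine):
    """Return the effective routing path.

    explicit:   routing paths configured by the package (may be empty).
    dimensions: names of the keyword dimension fields.
    combine:    False for Option 1 (explicit only), True for Option 2 (merged).
    """
    if not explicit:
        # Default behavior: fall back to the keyword dimensions.
        return list(dimensions)
    if not combine:
        return list(explicit)
    merged = list(explicit)
    for name in dimensions:
        if name not in merged:  # avoid duplicates, keep order stable
            merged.append(name)
    return merged


explicit = ["prometheus.labels.*"]
dimensions = ["host.name", "prometheus.labels.*"]

only_explicit = resolve_routing_path(explicit, dimensions, combine=False)
combined = resolve_routing_path(explicit, dimensions, combine=True)
fallback = resolve_routing_path([], dimensions, combine=False)
print(only_explicit, combined, fallback)
```

Under Option 1 the host.name dimension would not take part in routing, while under Option 2 it would be appended to the configured paths.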