[Change Proposal] Support customization of routing_path #424

@jsoriano

In #357 we added a flag to enable TSDB, which sets index.mode: "time_series". This mode requires a routing_path; when none is configured, Elasticsearch uses the configured keyword dimensions.

When collecting data from unknown sources, the dimensions may not be known beforehand, so the default behavior cannot be relied on. In these cases, wildcards in the routing path, combined with dynamic mappings, can be used to configure the dimensions.

For example, for Prometheus, an index template could use the following settings so that prometheus.labels.* serves both as the routing path and as dimensions:

{
  ...
  "template": {
    "settings": {
      "index.mode": "time_series",
      "index.routing_path": [
        "prometheus.labels.*"
      ]
    },
    "mappings": {
      "dynamic_templates": [
        {
          "labels": {
            "path_match": "prometheus.labels.*",
            "mapping": {
              "type": "keyword",
              "time_series_dimension": true
            }
          }
        },
        ...
      ]
    }
  }
}

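As a rough illustration of what the wildcard selects, the sketch below uses Python's fnmatch as a stand-in for Elasticsearch's pattern matching. That substitution is an assumption (the real matching rules may differ in details), and the field names and the matching_fields helper are made up for the example.

```python
from fnmatch import fnmatchcase


def matching_fields(pattern, field_names):
    """Return the field names selected by a glob-style pattern.

    fnmatchcase is only an approximation of how Elasticsearch
    evaluates wildcards in routing_path and path_match.
    """
    return [name for name in field_names if fnmatchcase(name, pattern)]


# Hypothetical fields seen in documents from a Prometheus source.
fields = [
    "prometheus.labels.job",
    "prometheus.labels.instance",
    "prometheus.metrics.up",
    "host.name",
]

routed = matching_fields("prometheus.labels.*", fields)
print(routed)
```

Only the two label fields are selected, so they would be the ones mapped as dimensions and used for routing.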
To support this, the package spec needs to expose a way to configure the routing path. There are several options:

  • Option 1: Configure it directly as an Elasticsearch setting. This would allow any value for the setting, but would complicate validation of fields.
    elasticsearch:
      index_mode: "time_series"
      routing_path: ["prometheus.labels.*"]
    
  • Option 2: Configure it as a setting in the field definition. This would be less flexible, but easier to validate.
    - name: prometheus.labels.*
      type: keyword
      routing_path: true
      dimension: true
    
  • Option 3: Implement both previous options. This would be the most flexible option, but the least safe regarding validation.
  • Option 4: Don't add an explicit setting for the routing_path, and configure all dimensions as routing paths. This is what Elasticsearch would do in any case, but we would also include the dimensions defined through dynamic mappings, with their wildcards.
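
A minimal sketch of what Option 4 could look like on the tooling side, assuming field definitions are available as flat dictionaries with name, type and dimension keys. The function name routing_path_from_dimensions and the sample fields are hypothetical, and the restriction to keyword fields follows the default behavior described above.

```python
def routing_path_from_dimensions(fields):
    """Collect keyword dimension fields, wildcards included, as routing paths.

    Assumes a flat list of field definitions; nested definitions are
    not handled in this sketch.
    """
    return [
        field["name"]
        for field in fields
        if field.get("dimension") and field.get("type") == "keyword"
    ]


# Hypothetical field definitions from a package.
fields = [
    {"name": "host.name", "type": "keyword", "dimension": True},
    {"name": "prometheus.labels.*", "type": "keyword", "dimension": True},
    {"name": "prometheus.metrics.value", "type": "double"},
    {"name": "service.port", "type": "long", "dimension": True},
]

routing_path = routing_path_from_dimensions(fields)
print(routing_path)
```

The wildcard entry is passed through unchanged, which is the part that the Elasticsearch default would not cover on its own.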

If one of options 1-3 is chosen, we also need to decide what to do with other dimensions when routing_path is configured. Without a configured routing path, keyword dimensions are used as routing_path. If a routing path is added, should the other dimensions also be used as routing paths? There are two options:

  • Option 1: If a routing path is configured, use only the configured fields, and not the dimension fields.
  • Option 2: If a routing path is configured, combine it with the rest of the dimension fields.
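
The difference between these two behaviors can be sketched as follows. The function resolve_routing_path and its combine flag are hypothetical names used only for illustration; they are not part of the package spec.

```python
def resolve_routing_path(explicit, dimensions, combine):
    """Resolve the effective routing_path.

    explicit:   routing paths configured explicitly (may be empty)
    dimensions: names of the keyword dimension fields
    combine:    False -> use only the explicit paths (first option)
                True  -> explicit paths plus remaining dimensions (second option)
    """
    if not explicit:
        # No explicit routing path: fall back to the keyword dimensions.
        return list(dimensions)
    if not combine:
        return list(explicit)
    merged = list(explicit)
    for name in dimensions:
        if name not in merged:
            merged.append(name)
    return merged


explicit = ["prometheus.labels.*"]
dimensions = ["host.name", "prometheus.labels.*"]

only_explicit = resolve_routing_path(explicit, dimensions, combine=False)
combined = resolve_routing_path(explicit, dimensions, combine=True)
fallback = resolve_routing_path([], dimensions, combine=False)
print(only_explicit, combined, fallback)
```

With the first behavior host.name is dropped from routing, while the second keeps it alongside the wildcard.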

Labels: Team:Ecosystem (Label for the Packages Ecosystem team), discuss (Issue needs discussion)
