[Feature Request] Star tree validations #15491

@bharath-techie

Description

Is your feature request related to a problem? Please describe

We are adding the following validations (blocks) when a user configures a star tree index:

  • Unsigned long is currently not supported, because it needs special comparator logic that the star tree does not handle yet. For more, see: [Star tree] Handle 'unsigned long' as part of star tree #15231

  • We need to limit the maximum number of base metrics in the 2.17 experimental release.

  • Documents with array values: the star tree index currently cannot handle such documents, since the 'star' property is not satisfied.

  • Since flush is heavy for a star tree index, limiting the maximum number of documents during flush will help with indexing throughput.

Example

Dimension fields
Timestamp
Status

Metric fields
Size

Document 1:
{"Timestamp": 1999, "status": [200, 300], "size": 1000}

Expected results of queries on the above document:
1. Count of size => 1
2. Sum of size => 1000
3. Count of size where timestamp = 1999 => 1
4. Sum of size where timestamp = 1999 => 1000

Star tree index (the array value of status produces one row per element):
          Dimensions            |  Metrics
DocId    Timestamp     Status     Sum(Size)    Count(Size)     Correct ?
1         1999          200        1000         1               Yes
2         1999          300        1000         1               Yes
3         *             200        1000         1               Yes
4         *             300        1000         1               Yes
5         1999          *          2000         2               NO
6         *             *          2000         2               NO
    
Running the queries against the above star tree documents:

1. Count of size
   Since there are no filters, we will query for * , * ==> "Doc ID 6"
   Answer =   2
   Expected = 1

2. Sum of size
   Since there are no filters, we will query for * , * ==> "Doc ID 6"
   Answer =   2000
   Expected = 1000
   

3. Count of size where timestamp = 1999
   We will query for 1999 , * ==> "Doc ID 5"
   Answer =   2
   Expected = 1

4. Sum of size where timestamp = 1999
   We will query for 1999 , * ==> "Doc ID 5"
   Answer =   2000
   Expected = 1000
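
The double counting in the example can be reproduced with a small simulation. This is only an illustrative sketch, not the actual star tree builder: the function names (`flatten`, `build_star_tree`), the lowercase field names, and the in-memory dictionary layout are all assumptions made for the example. It expands the array-valued dimension into one row per element and then aggregates every concrete-or-star combination of dimensions.

```python
from itertools import product

STAR = "*"  # placeholder for the "all values" (star) entry of a dimension


def flatten(doc, dimensions):
    """Expand a document into one row per combination of dimension values.

    A dimension holding a list (array value) contributes one row per element.
    """
    value_lists = [
        doc[d] if isinstance(doc[d], list) else [doc[d]] for d in dimensions
    ]
    return [
        dict(zip(dimensions, combo), size=doc["size"])
        for combo in product(*value_lists)
    ]


def build_star_tree(docs, dimensions):
    """Map each (dimension values or star) key to (sum(size), count(size))."""
    index = {}
    for doc in docs:
        for row in flatten(doc, dimensions):
            # Each dimension is either kept as-is or replaced by the star.
            for mask in product([False, True], repeat=len(dimensions)):
                key = tuple(
                    STAR if starred else row[d]
                    for d, starred in zip(dimensions, mask)
                )
                total, count = index.get(key, (0, 0))
                index[key] = (total + row["size"], count + 1)
    return index


dimensions = ["timestamp", "status"]
docs = [{"timestamp": 1999, "status": [200, 300], "size": 1000}]
index = build_star_tree(docs, dimensions)

for key, (total, count) in sorted(index.items(), key=str):
    print(key, "sum =", total, "count =", count)
```

The entries for `(1999, "*")` and `("*", "*")` aggregate both expanded rows of the single source document, which is why the sum and count come out doubled.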

Describe the solution you'd like

  • We will block unsigned long fields from being used as star tree dimensions or metrics in the star tree mapping.
  • We will block documents with array values for indices that have a star tree index enabled [ block during bulk / indexing ].
  • We will limit the maximum number of base metrics to 100 in the 2.17 experimental release.
  • We will limit index.translog.flush_threshold_size to a maximum of 512 mb; the maximum can be configured via another final setting, indices.composite_index.translog.max_flush_threshold_size.
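
The four checks above could look roughly like the following sketch. It is not the OpenSearch implementation: every name here (`StarTreeValidationError`, `validate_mapping`, `validate_document`, `validate_flush_threshold`) is hypothetical, a base metric is assumed to be a (field, statistic) pair, and the array check is assumed to apply only to fields that are part of the star tree. The limits of 100 base metrics and 512 mb are taken from the list above.

```python
MAX_BASE_METRICS = 100                          # limit proposed for 2.17
MAX_FLUSH_THRESHOLD_BYTES = 512 * 1024 * 1024   # 512 mb cap on flush threshold


class StarTreeValidationError(ValueError):
    """Hypothetical error type raised when a star tree validation fails."""


def validate_mapping(field_types, dimensions, metrics):
    """Check the star tree mapping.

    field_types: dict of field name -> mapping type (e.g. "long").
    dimensions:  list of dimension field names.
    metrics:     list of (field, statistic) pairs, one per base metric
                 (an assumption about what counts as a base metric).
    """
    fields = list(dimensions) + [field for field, _stat in metrics]
    for field in fields:
        if field_types.get(field) == "unsigned_long":
            raise StarTreeValidationError(
                f"unsigned_long field [{field}] is not supported in star tree"
            )
    if len(metrics) > MAX_BASE_METRICS:
        raise StarTreeValidationError(
            f"at most {MAX_BASE_METRICS} base metrics are allowed, "
            f"got {len(metrics)}"
        )


def validate_document(doc, star_tree_fields):
    """Reject a document that has an array value in a star tree field."""
    for field in star_tree_fields:
        if isinstance(doc.get(field), (list, tuple)):
            raise StarTreeValidationError(
                f"array values are not supported for star tree field [{field}]"
            )


def validate_flush_threshold(requested_bytes,
                             max_bytes=MAX_FLUSH_THRESHOLD_BYTES):
    """Reject a flush threshold size above the configured maximum."""
    if requested_bytes > max_bytes:
        raise StarTreeValidationError(
            f"flush threshold {requested_bytes} exceeds maximum {max_bytes}"
        )
```

With these helpers, the document from the example, `{"Timestamp": 1999, "status": [200, 300], "size": 1000}`, would be rejected by `validate_document` when `status` is a star tree dimension.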

Related component

Indexing:Performance

Describe alternatives you've considered

No response

Additional context

No response
