Skip to content

Implement logic for storing fields that are neither dimensions nor metrics (aka tags)#87929

Merged
salvatore-campagna merged 94 commits intoelastic:mainfrom
salvatore-campagna:feature/downsampling-labels
Jul 25, 2022
Merged

Implement logic for storing fields that are neither dimensions nor metrics (aka tags)#87929
salvatore-campagna merged 94 commits intoelastic:mainfrom
salvatore-campagna:feature/downsampling-labels

Conversation

@salvatore-campagna
Copy link
Copy Markdown
Contributor

We want labels to be collected storing just the latest value
when aggregating documents in a rollup index. This means
labels will behave similarly to counter metrics.

@salvatore-campagna salvatore-campagna marked this pull request as draft June 22, 2022 17:45
@salvatore-campagna salvatore-campagna added :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Jun 22, 2022
@salvatore-campagna
Copy link
Copy Markdown
Contributor Author

salvatore-campagna commented Jun 23, 2022

elasticsearch-ci/part-2 is failing at the moment for a mapping issue involving IPv4 addresses which are handled as keywords. The following line return format.format(values.nextValue()); is failing because the byte array includes non-UTF8 characters like, for instance, '0xFF' which are used to encode an IPv4 and IPv6 address. To be more precise the RawDocValueFormat format is used when formatting the value instead of using the appropriate IpDocValueFormat. As a result this is a mapping mistake. NOTE: if the type of the field is set to keyword instead of ip the test is green.

@sethmlarson sethmlarson added the Team:Clients Meta label for clients team label Jul 19, 2022
@elasticsearchmachine elasticsearchmachine removed the Team:Clients Meta label for clients team label Jul 19, 2022
@sethmlarson sethmlarson added the Team:Clients Meta label for clients team label Jul 19, 2022
@elasticsearchmachine elasticsearchmachine removed the Team:Clients Meta label for clients team label Jul 19, 2022
@sethmlarson sethmlarson added the Team:Clients Meta label for clients team label Jul 20, 2022
@elasticsearchmachine elasticsearchmachine removed the Team:Clients Meta label for clients team label Jul 20, 2022
@sethmlarson sethmlarson added the Team:Clients Meta label for clients team label Jul 20, 2022
@elasticsearchmachine elasticsearchmachine removed the Team:Clients Meta label for clients team label Jul 20, 2022
@salvatore-campagna
Copy link
Copy Markdown
Contributor Author

@elasticmachine run elasticsearch-ci/part-2

Copy link
Copy Markdown
Contributor

@csoulios csoulios left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating on this.
I left some more comments.

- skip:
version: " - 8.2.99"
version: " - 8.3.99"
reason: tsdb indexing changed in 8.3.0
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reason should be something like "rollup: labels support added in 8.4.0"

if (validTypes.contains(value.getClass()) == false) {
throw new IllegalArgumentException(
"Expected [" + VALID_TYPES + "] for field [" + field + "], " + "got [" + value.getClass() + "]"
"Expected [" + VALID_METRIC_TYPES + "] for field [" + field + "], " + "got [" + value.getClass() + "]"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should VALID_METRIC_TYPES be validTypes?

final Object[] value = fieldValues.apply(docValueCount);
if (metricFieldProducers.containsKey(field)) {
// TODO: missing support for array metrics
collectMetric(field, value[0]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the TODO: missing support for array metrics comment here. Is this something you plan to add in this PR or in a following PR?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, because I think that needs to decide first how do we handle that which I don't have any idea at the moment.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should work in the same way as aggregations work on multi-value fields.

final MappedFieldType fieldType = mapperService.mappingLookup().getFieldType(field);
return fieldType != null
&& (timestampField.equals(field) == false)
&& (fieldType.isAggregatable())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why must a label field be aggregatable?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is basically to avoid fields of type text which is not aggregatable...while it would work with fields of type keyword which are aggregatable.

Copy link
Copy Markdown
Contributor Author

@salvatore-campagna salvatore-campagna Jul 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remove that condition the tests result in an errors saying something like "you need to use aggregatable fields or use <some API I don't remember> to support non-aggregatable fields

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool thanks for the explanation. I had missed that

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually means that we cannot have a text field as label?

@salvatore-campagna
Copy link
Copy Markdown
Contributor Author

@elasticmachine update branch

@elasticsearchmachine elasticsearchmachine changed the base branch from master to main July 22, 2022 23:06
@salvatore-campagna
Copy link
Copy Markdown
Contributor Author

@elasticmachine update branch

@elasticmachine
Copy link
Copy Markdown
Collaborator

merge conflict between base and head

Copy link
Copy Markdown
Contributor

@csoulios csoulios left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Thank you for iterating on this

@salvatore-campagna
Copy link
Copy Markdown
Contributor Author

LGTM!

Thank you for iterating on this

Are we still waiting for something before being able to merge or can I proceed?

@csoulios
Copy link
Copy Markdown
Contributor

Are we still waiting for something before being able to merge or can I proceed?

Please go ahead and merge it.

We have identified 2 TODO tasks that can be implemented in follow-up PRs:

  1. text fields as labels
  2. Computing metric aggs on array fields

@salvatore-campagna
Copy link
Copy Markdown
Contributor Author

Are we still waiting for something before being able to merge or can I proceed?

Please go ahead and merge it.

We have identified 2 TODO tasks that can be implemented in follow-up PRs:

  1. text fields as labels
  2. Computing metric aggs on array fields

I will create two tasks in #74660 to track them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>non-issue :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data :StorageEngine/TSDB You know, for Metrics Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants