Update track-shared-logsdb-mode component template for elastic/logs and elastic/security tracks #1097

Merged
martijnvg merged 11 commits into elastic:master from martijnvg:use_large_binary_blocks_serverless
Mar 31, 2026
Conversation

@martijnvg
Member

@martijnvg martijnvg commented Mar 19, 2026

  1. Removed the dependency of other track params on the index_mode track param, which I think was never intended and is undocumented. The track params used in these files are valid outside index_mode.
  2. Allow index.use_time_series_doc_values_format_large_binary_block_size also when serverless_operator == true, so that we can see the effect in serverless.
  3. Added the use_time_series_doc_values_format track param.
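Item 2 amounts to widening the template guard roughly as follows (an illustrative Jinja sketch, not the actual diff; `use_large_binary_blocks` and `large_binary_block_size` are hypothetical param names):

```jinja
{# Sketch: emit the setting both when explicitly enabled and in serverless operator mode #}
{% if use_large_binary_blocks or serverless_operator %}
  ,"use_time_series_doc_values_format_large_binary_block_size": {{ large_binary_block_size | tojson }}
{% endif %}
```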

@martijnvg
Member Author

martijnvg commented Mar 19, 2026

I verified locally that this works by running:

  • esrally race --track-path=/Users/mvg/dev/code/rally-tracks/elastic/security --preserve-install --on-error=abort --kill-running-processes --pipeline=benchmark-only --target-host=localhost:9200 --track-params="wait_for_status:yellow" --test-mode
  • esrally race --track-path=/Users/mvg/dev/code/rally-tracks/elastic/logs --preserve-install --on-error=abort --kill-running-processes --challenge=logging-querying --track-params="wait_for_status:yellow, bulk_start_date:2020-01-01, bulk_end_date:2020-01-02, raw_data_volume_per_day:10GB, max_generated_corpus_size:4GB, max_total_download_gb:4, number_of_replicas:0, number_of_shards:1" --pipeline=benchmark-only --target-host=localhost:9200 --test-mode

{% if index_mode %}
"index": {
"mode": {{ index_mode | tojson }}
{% if use_doc_values_skipper | default(true) %}
Member Author


Moving this setting to the top, which avoids the if conditions that add a , in front of each setting. The current logic always prints the mapping.use_doc_values_skipper index setting.

Member


OK, that makes sense - but be aware that you would not be able to use any other index settings if use_doc_values_skipper were set to false. You could extend the endif to the end of the index settings, so that invalid JSON couldn't be generated. I wonder, though, whether we should investigate (outside of this PR) if there's a way we can make this easier - we have similar issues in other tracks, though we tend to get around it by making the final item in the index settings one that is always printed, so that you can include a comma at the end of the other lines -> https://github.com/elastic/rally-tracks/blob/master/github_archive/index-template.json#L38

Member Author


I think I need to update this again; use_doc_values_skipper isn't a public setting in serverless.
I will add logic to conditionally add a ,, like in the other file.
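The conditional-comma pitfall under discussion can be illustrated outside Jinja with a small stdlib sketch (a simplification of the template logic, not the actual rendering code; the setting names mirror the snippet above):

```python
import json

def render(index_mode, use_doc_values_skipper):
    """Build the "index" settings object line by line, joining with commas,
    so a disabled optional setting never leaves a dangling comma behind."""
    lines = ['"mode": %s' % json.dumps(index_mode)]
    if use_doc_values_skipper:
        lines.append('"mapping.use_doc_values_skipper": true')
    return '{"index": {%s}}' % ", ".join(lines)

# Both variants stay valid JSON because the commas come from the join,
# not from each conditional line.
print(json.loads(render("logsdb", True)))
print(json.loads(render("logsdb", False)))
```

The same idea applies inside the template: as long as exactly one setting is unconditionally printed last (or commas are produced by a join), no combination of track params can produce invalid JSON.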

@martijnvg martijnvg changed the title Allow index.use_time_series_doc_values_format_large_binary_block_size when serverless_operator == true Update track-shared-logsdb-mode component template for elastic/logs and elastic/security track.s Mar 19, 2026
@martijnvg martijnvg requested a review from gareth-ellis March 19, 2026 16:02
Member

@gareth-ellis gareth-ellis left a comment


LGTM - it would be nice if we could find a way to avoid requiring certain parameters to be set in a certain way to ensure we end up with valid JSON. There's some CI issues we need to resolve, too; I'll take a quick look and see if I can work out what's going wrong.

@martijnvg
Member Author

@gareth-ellis I think I fixed the invalid JSON issues. However, I don't understand why the current CI jobs have failed.

For example:


self = <it_tracks_serverless.test_logs.TestLogs object at 0x7c16c1b92990>, operator = True, rally = <pytest_rally.rally.Rally object at 0x7c16c1bf06e0>
project_config = ServerlessProjectConfig(target_host='rally-tracks-it-serverless-3775-ba1c72.es.eu-west-1.aws.qa.elastic.cloud:443', us...er.json'), operator_client_options_file=local('/tmp/pytest-of-buildkite-agent/pytest-0/client-options0/operator.json'))
 
def test_logs_default(self, operator, rally, project_config: ServerlessProjectConfig):
ret = rally.race(
track="elastic/logs",
challenge="logging-indexing",
track_params="number_of_replicas:1",
client_options=project_config.get_client_options_file(operator),
target_hosts=project_config.target_host,
)
>       assert ret == 0
E       assert 64 == 0
 
it_tracks_serverless/test_logs.py:68: AssertionError

@gareth-ellis
Member

You need to scroll up slightly.

An example:

[ERROR] Cannot race. Error in load generator [0]
2026-03-24 08:09:07 UTC | Cannot run task [create-all-component-templates]: Request returned an error. Error type: api, Description: illegal_argument_exception ({'error': {'root_cause': [{'type': 'illegal_argument_exception', 'reason': 'unknown setting [index.use_time_series_doc_values_format_large_binary_block_size] did you mean [index.use_time_series_doc_values_format_large_block_size]?'}], 'type': 'illegal_argument_exception', 'reason': 'unknown setting [index.use_time_series_doc_values_format_large_binary_block_size] did you mean [index.use_time_series_doc_values_format_large_block_size]?'}, 'status': 400}), HTTP Status: 400

@martijnvg
Member Author

Thanks for pointing this out.

So it looks like this setting is unknown in serverless. How does this test verify these templates?
The only limitation is that the setting is currently behind a feature flag. Is, for some reason, a released serverless version started instead of a snapshot?

@martijnvg
Member Author

@gareth-ellis I've removed the or serverless_operator == true condition again. There are two other improvements in this PR that are still worth getting in.

Member

@gareth-ellis gareth-ellis left a comment


LGTM

@martijnvg
Member Author

Hey @gareth-ellis, the 3.10 and 3.13 compat PR CI jobs keep failing with:

Error:  Cannot race. Error in load generator [0]
	Cannot run task [compression-stats]: Request returned an error. Error type: transport, Description: network connection timed out

Do you have an idea why this is happening?

@gareth-ellis
Member

gareth-ellis commented Mar 26, 2026

It seems to be compression-stats is timing out:

INFO     pytest_rally.rally:rally.py:147 Running command: [esrally race --track="elastic/logs" --challenge="logging-indexing" --track-repository="/home/runner/work/rally-tracks/rally-tracks" --track-revision="de6985c574edc26dcf751baa9b7241c4c8f16d8d" --configuration-name="pytest" --enable-assertions --kill-running-processes --on-error="abort" --pipeline="benchmark-only" --target-hosts="127.0.0.1:19200" --test-mode --track-params="number_of_replicas:0"]
FAILED

Specifically this step:
https://github.com/elastic/rally-tracks/blob/master/elastic/logs/challenges/logging-indexing.json#L26

We probably have a 10 second timeout, could it be the changes in this PR have made that slower? I'll try and reproduce locally and should be able to see what is actually happening

from logs:

2026-03-25 19:52:04,131 ActorAddr-(T|:41181)/PID:4399 esrally.driver.driver INFO Worker[0] executing tasks: ['compression-stats']
2026-03-25 19:53:36,600 ActorAddr-(T|:41181)/PID:4399 elastic_transport.node_pool WARNING Node <RallyAiohttpHttpNode(http://127.0.0.1:19200)> has failed for 1 times in a row, putting on 1 second timeout
2026-03-25 19:53:36,601 ActorAddr-(T|:41181)/PID:4399 esrally.driver.driver ERROR Could not execute schedule
Traceback (most recent call last):

  File "/home/runner/.local/share/hatch/env/virtual/rally-tracks/jBbSvtJB/it/lib/python3.10/site-packages/esrally/driver/driver.py", line 1940, in __call__
    total_ops, total_ops_unit, request_meta_data = await execute_single(runner, self.es, params, self.on_error)

  File "/home/runner/.local/share/hatch/env/virtual/rally-tracks/jBbSvtJB/it/lib/python3.10/site-packages/esrally/driver/driver.py", line 2154, in execute_single
    raise exceptions.RallyAssertionError(msg)

esrally.exceptions.RallyAssertionError: Request returned an error. Error type: transport, Description: network connection timed out

2026-03-25 19:53:36,601 ActorAddr-(T|:41181)/PID:4399 esrally.driver.driver INFO Worker[0] finished executing tasks ['compression-stats'] in 92.470416 seconds
2026-03-25 19:53:36,827 ActorAddr-(T|:41181)/PID:4399 esrally.driver.driver ERROR Worker[0] has detected a benchmark failure. Notifying master...
Traceback (most recent call last):

  File "/opt/hostedtoolcache/Python/3.10.20/x64/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)

  File "/home/runner/.local/share/hatch/env/virtual/rally-tracks/jBbSvtJB/it/lib/python3.10/site-packages/esrally/driver/driver.py", line 1785, in __call__
    loop.run_until_complete(self.run())

  File "/opt/hostedtoolcache/Python/3.10.20/x64/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()

  File "/home/runner/.local/share/hatch/env/virtual/rally-tracks/jBbSvtJB/it/lib/python3.10/site-packages/esrally/driver/driver.py", line 1837, in run
    _ = await asyncio.gather(*awaitables)

  File "/home/runner/.local/share/hatch/env/virtual/rally-tracks/jBbSvtJB/it/lib/python3.10/site-packages/esrally/driver/driver.py", line 2002, in __call__
    raise exceptions.RallyError(f"Cannot run task [{self.task}]: {e}") from None

esrally.exceptions.RallyError: Cannot run task [compression-stats]: Request returned an error. Error type: transport, Description: network connection timed out

2026-03-25 19:53:36,828 ActorAddr-(T|:41007)/PID:4377 esrally.driver.driver ERROR Main driver received a fatal exception from a load generator. Shutting down.

The equivalent from a run on master:

2026-03-25 14:36:59,43 ActorAddr-(T|:44733)/PID:4782 esrally.driver.driver INFO Creating time-period based schedule with [None] distribution for [compression-stats] with a warmup period of [0] seconds and a time period of [None] seconds.
2026-03-25 14:36:59,43 ActorAddr-(T|:44733)/PID:4782 esrally.client.factory INFO Creating ES client connected to [{'host': '127.0.0.1', 'port': 19200}] with options [{'timeout': 60}]
2026-03-25 14:36:59,44 ActorAddr-(T|:44733)/PID:4782 esrally.driver.driver INFO Worker[0] executing tasks: ['compression-stats']
2026-03-25 14:37:50,122 ActorAddr-(T|:44733)/PID:4782 esrally.driver.driver INFO Worker[0] finished executing tasks ['compression-stats'] in 51.077770 seconds

It suggests that probably this PR (or something else) has caused the compression stats to take a bit longer - so we now go over the timeout. We were at 51 seconds before for the entire task - that should be three calls I believe.

@martijnvg
Member Author

We probably have a 10 second timeout, could it be the changes in this PR have made that slower?

The main change is that, without index modes, other track params can be used as well. But those track params do have to be enabled, and even if that is the case, I don't see how that would cause timeouts - maybe I'm missing something here. What is compression-stats actually doing?

@gareth-ellis
Member

@martijnvg
Member Author

martijnvg commented Mar 26, 2026

Thanks, looking at the python method, a few ES APIs are being invoked. But the error doesn't say which API invocation times out.

Also, I think the compression-stats task is often excluded from benchmark runs?

@gareth-ellis
Member

I reran locally, from master, then from your branch, then from your branch with a longer timeout:

Master:
2026-03-26 09:05:28,610 ActorAddr-(T|:63819)/PID:70845 elastic_transport.transport INFO POST https://127.0.0.1:9200/logs-k8-application.log-default/_search [status:200 duration:0.344s]
2026-03-26 09:05:29,158 ActorAddr-(T|:63819)/PID:70845 elastic_transport.transport INFO POST https://127.0.0.1:9200/logs-k8-application.log-default/_search [status:200 duration:0.272s]

60s timeout:
2026-03-26 09:14:19,46 ActorAddr-(T|:52366)/PID:76722 elastic_transport.transport INFO POST https://127.0.0.1:9200/logs-k8-application.log-default/_search [status:N/A duration:60.646s]

240s timeout:
2026-03-26 09:37:22,535 ActorAddr-(T|:58252)/PID:82257 elastic_transport.transport INFO POST https://127.0.0.1:9200/logs-k8-application.log-default/_search [status:200 duration:105.167s]
2026-03-26 09:40:45,63 ActorAddr-(T|:58252)/PID:82257 elastic_transport.transport INFO POST https://127.0.0.1:9200/logs-k8-application.log-default/_search [status:200 duration:101.032s]
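For anyone wanting to reproduce these reruns, Rally's client-side request timeout can be raised via --client-options. A sketch of such an invocation (assumes esrally is installed and a cluster is listening on 127.0.0.1:19200; adjust to taste):

```shell
# Raise the client request timeout (in seconds) so the slow _search calls
# from compression-stats can complete instead of timing out.
esrally race --track="elastic/logs" --challenge="logging-indexing" \
  --pipeline=benchmark-only --target-hosts=127.0.0.1:19200 --test-mode \
  --track-params="number_of_replicas:0" \
  --client-options="timeout:240"
```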

@martijnvg
Member Author

Thanks @gareth-ellis, then there must be something wrong here :)

I don't know what these rally PR jobs do. Would you be able to share how you reproduced this? I'm curious what the exact search request is and with what settings we run. And does this run against the main branch of Elasticsearch, or are we running against an older ES version here?

@martijnvg martijnvg changed the title Update track-shared-logsdb-mode component template for elastic/logs and elastic/security track.s Update track-shared-logsdb-mode component template for elastic/logs and elastic/security tracks Mar 27, 2026
@martijnvg
Member Author

The performance issue was fixed via elastic/elasticsearch#145077

@martijnvg martijnvg merged commit 7fa6019 into elastic:master Mar 31, 2026
15 checks passed
@esbenchmachine esbenchmachine added the backport pending Awaiting backport to stable release branch label Mar 31, 2026
@esbenchmachine
Collaborator

@martijnvg
A backport is pending for this PR.
Apply all the labels that correspond to Elasticsearch minor versions expected to work with this PR, but select only from the available ones.
If intended for future releases, apply label for next minor

When a vX.Y label is added, a new pull request will be automatically created, unless merge conflicts are detected or if the label supplied points to the next Elasticsearch minor version. If successful, a link to the newly opened backport PR will be provided in a comment.

In case of merge conflicts during backporting, create the backport PR manually following the steps from README:
Final steps to complete the backporting process:

  1. Ensure the correct version labels exist in this PR.
  2. Ensure each backport pull request is labeled with backport.
  3. Review and merge each backport pull request into the appropriate version branch.
  4. Remove backport pending label from this PR once all backport PRs are merged.

Thank you!

