
Log the estimate of batch metrics memory consumption #18916

Open
andsel wants to merge 10 commits into elastic:main from andsel:feature/estimate_batch_metrics_memory_consumption_and_log

@andsel
Contributor

@andsel andsel commented Mar 30, 2026

Release notes

For each pipeline that has structured batch metrics enabled, log a line reporting the estimated memory consumed to collect that data.

What does this PR do?

  • Updates all the classes that use Histogram to expose, or sum up, the memory consumed by those internal structures.
  • Updates pipeline startup to print a log line with that information if batch metrics are enabled.
  • Updates all the existing classes that start a pipeline to disable batch metrics when the flow metrics are not fully initialized.

Why is it important/What is the impact to the user?

Gives the user direct information about how much memory the batch flow histograms consume. With this information, the user can choose to disable batch metrics globally and enable them only for individual pipelines if the memory consumption is too high for their setup.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files (and/or docker env variables)
  • I have added tests that prove my fix is effective or that my feature works

Author's Checklist

  • [ ]

How to test this PR locally

Run Logstash and check that it prints a line like the following, for each pipeline:

Pipeline `main`batch metrics estimated memory occupation: 4925440 bytes

Related issues

Use cases

Screenshots

Logs

[2026-03-30T17:28:45,791][INFO ][org.logstash.execution.AbstractPipelineExt] Pipeline `main`batch metrics estimated memory occupation: 4925440 bytes
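To check the log line above from a script or a test, something like the following sketch could extract the reported value. The class and method names are made up for illustration, and the pattern is written against the sample line only; it tolerates the missing space after the pipeline id that the sample shows.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Logstash.
final class BatchMetricsLogLineSketch {

    // \s* after the closing backtick accepts the sample line, which has no space there.
    private static final Pattern LINE = Pattern.compile(
        "Pipeline `([^`]+)`\\s*batch metrics estimated memory occupation: (\\d+) bytes");

    /** Returns the byte count reported for the given pipeline, if the line matches. */
    static OptionalLong estimatedBytes(String logLine, String pipelineId) {
        Matcher m = LINE.matcher(logLine);
        if (m.find() && m.group(1).equals(pipelineId)) {
            return OptionalLong.of(Long.parseLong(m.group(2)));
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        String line = "[2026-03-30T17:28:45,791][INFO ][org.logstash.execution.AbstractPipelineExt] "
            + "Pipeline `main`batch metrics estimated memory occupation: 4925440 bytes";
        long bytes = estimatedBytes(line, "main").orElseThrow();
        System.out.printf("%d bytes is about %.2f MiB%n", bytes, bytes / (1024.0 * 1024.0));
    }
}
```

For the sample value, 4925440 bytes works out to roughly 4.7 MiB for the `main` pipeline.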

@andsel andsel self-assigned this Mar 30, 2026
@github-actions
Contributor

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)
  • run exhaustive tests : Run the exhaustive tests Buildkite pipeline.

@mergify
Contributor

mergify bot commented Mar 30, 2026

This pull request does not have a backport label. Could you fix it @andsel? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit.
  • If no backport is necessary, please add the backport-skip label

  final Class<USER_METRIC> type = metricFactory.getType();
  if (!type.isAssignableFrom(result.getJavaClass())) {
- LOGGER.warn("UserMetric type mismatch for %s (expected: %s, received: %s); " +
+ LOGGER.warn("UserMetric type mismatch for {} (expected: {}, received: {}); " +
Contributor Author

Note for reviewer
I fixed this while checking for other errors; if preferred, I can split it into a separate PR.
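The reason the placeholder style matters can be shown with a toy substituter. This sketch assumes `LOGGER` is a logger that fills `{}` anchors with its arguments, as the switch to `{}` suggests; the `substituteBraces` method below is a stand-in written for illustration and is not Log4j's implementation.

```java
// Hypothetical demo class; substituteBraces only imitates {}-style substitution.
final class PlaceholderStyleSketch {

    /** Replaces each "{}" anchor, left to right, with the next argument. */
    static String substituteBraces(String pattern, Object... args) {
        StringBuilder out = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int at = pattern.indexOf("{}", from);
            if (at < 0) {
                break; // no anchors left: remaining arguments are ignored
            }
            out.append(pattern, from, at).append(arg);
            from = at + 2;
        }
        return out.append(pattern.substring(from)).toString();
    }

    public static void main(String[] args) {
        // printf-style markers are not anchors, so they are printed literally.
        System.out.println(substituteBraces(
            "type mismatch for %s (expected: %s)", "my_metric", "Counter"));
        // {} anchors are filled with the arguments.
        System.out.println(substituteBraces(
            "type mismatch for {} (expected: {})", "my_metric", "Counter"));
    }
}
```

Under that assumption, the old message would have been logged with literal `%s` markers and without the metric name or types.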

andsel added 2 commits March 31, 2026 17:46
…ons classes. This is needed to avoid collecting such data on pipelines that are not yet fully initialized.
Comment on lines +235 to 238
pipeline_workers_setting = pipeline_settings_obj.get_setting("pipeline.workers")
allow(pipeline_workers_setting).to receive(:default).and_return(worker_thread_count)

pipeline_settings.each {|k, v| pipeline_settings_obj.set(k, v) }
Contributor Author

Note for reviewer

This is needed because the above mocking works on the original SETTINGS instance and not on the clone.

  input { dummy_input {} }
  filter {
- #{" nil_flushing_filter {}\n" * 2000}
+ #{" nil_flushing_filter {}\n" * 2500}
Contributor Author

Note for reviewer
With 2000 filters the test was flaky when run on its own, so I raised the limit slightly.

andsel added 2 commits April 2, 2026 12:30
… metrics. This simplifies the configuration: instead of also setting 'pipeline.batch.metrics.sampling_mode' to disabled when 'metric.collect' is set to false, just set 'metric.collect' to false and the batch metrics consumption log is not printed.
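The gating described in this commit message can be sketched as a single predicate. This is only an illustration under stated assumptions: the class and method names are hypothetical, settings are modelled as a plain map, a missing key is treated as "off", and `enabled_mode_placeholder` stands for any sampling mode other than `disabled`.

```java
import java.util.Map;

// Hypothetical sketch of the decision, not the code in this PR.
final class BatchMetricsLogGateSketch {

    /** True only when metrics are collected and batch metrics sampling is not disabled. */
    static boolean shouldLogEstimate(Map<String, Object> settings) {
        // 'metric.collect' set to false is enough to suppress the log line.
        if (!Boolean.TRUE.equals(settings.get("metric.collect"))) {
            return false;
        }
        Object mode = settings.get("pipeline.batch.metrics.sampling_mode");
        return mode != null && !"disabled".equals(mode.toString());
    }

    public static void main(String[] args) {
        Map<String, Object> collectOff = Map.of(
            "metric.collect", false,
            "pipeline.batch.metrics.sampling_mode", "enabled_mode_placeholder");
        Map<String, Object> collectOn = Map.of(
            "metric.collect", true,
            "pipeline.batch.metrics.sampling_mode", "enabled_mode_placeholder");
        System.out.println(shouldLogEstimate(collectOff));
        System.out.println(shouldLogEstimate(collectOn));
    }
}
```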
@elasticmachine

💚 Build Succeeded

History

cc @andsel

@andsel andsel changed the title from "Feature/estimate batch metrics memory consumption and log" to "Log the estimate of batch metrics memory consumption" Apr 2, 2026
@andsel andsel marked this pull request as ready for review April 2, 2026 13:34
Development

Successfully merging this pull request may close these issues.

Expose the size of batch metrics histograms, to be aware of the memory consumption for pipeline.

2 participants