
[9.4](backport #50137) filestream: Fix shutdown logic and improve benchmark #50184

Open

mergify[bot] wants to merge 2 commits into 9.4 from mergify/bp/9.4/pr-50137

Conversation

@mergify
Contributor

mergify Bot commented Apr 17, 2026

Proposed commit message

Fix filestream benchmark correctness and shutdown, add high-file-count sub-benchmarks

**Shutdown fix (`notifyObserver`):**

During shutdown the watcher goroutine that drains `notifyChan` exits before
harvesters finish. The old blocking send in `notifyObserver` stalled every
closing harvester until the task group's 1-minute timeout expired. Replace the
blocking send with a `select` on `canceler.Done()` so harvesters unblock
immediately when the input is cancelled.

**Benchmark fixes (correctness):**

The inode-mode benchmarks were silently broken: `file_identity` defaulted to
`fingerprint` even though `prospector.scanner.fingerprint.enabled` was `false`,
so every file received the same empty-fingerprint identity and only one
harvester was started out of N files. Explicitly set `file_identity.native` /
`file_identity.fingerprint` to match the scanner mode so each file gets its own
identity and harvester.
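For illustration, the two consistent pairings look roughly like this (a config sketch using the filestream options named above; all other input settings omitted):

```yaml
# inode mode: native identity, fingerprinting off
- type: filestream
  file_identity.native: ~
  prospector.scanner.fingerprint.enabled: false

# fingerprint mode: identity matches the scanner
- type: filestream
  file_identity.fingerprint: ~
  prospector.scanner.fingerprint.enabled: true
```

The bug was mixing the rows: fingerprint identity with the fingerprint scanner disabled.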

**Benchmark fixes (hangs / timeouts):**

Without `close.reader.on_eof: true` harvesters waited for more data after EOF,
preventing the pipeline from closing until the 60-second `task.Group.Stop`
timeout expired. Combined with a 1-second `check_interval` this made multi-file
benchmarks extremely slow. Set `close.reader.on_eof: true` and lower
`check_interval` to 100 ms so harvesters close promptly.
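As a config fragment, the benchmark settings described above would look like this (sketch; other input options omitted):

```yaml
- type: filestream
  # close each harvester as soon as its file is fully read,
  # instead of waiting for more data after EOF
  close.reader.on_eof: true
  # rescan quickly so the benchmark doesn't idle between scans
  prospector.scanner.check_interval: 100ms
```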

**Benchmark refactoring:**

- Consolidate duplicated single-file / multi-file / inode / fingerprint
  sub-benchmarks into a table-driven loop.
- Add 1 000-file and 10 000-file fingerprint sub-benchmarks to stress per-file
  overhead (logger cloning, reader pipeline setup, fingerprint I/O).
- Replace deprecated logging setup with local loggers.
- Only buffer the event channel when events are actually collected, avoiding a
  10 000-slot channel allocation in benchmarks that discard events.
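The table-driven shape can be sketched as follows. The case names, counts, and the empty `runBench` body are illustrative stand-ins, not the PR's actual benchmark code; `testing.Benchmark` is used here only so the sketch runs outside `go test`.

```go
package main

import (
	"fmt"
	"testing"
)

// One row per former copy-pasted sub-benchmark.
type benchCase struct {
	name     string
	identity string // "native" (inode mode) or "fingerprint"
	files    int
}

// runBench stands in for the real benchmark body: create `files` log files,
// run the filestream input with the matching file_identity, consume all
// events, then shut the input down.
func runBench(b *testing.B, identity string, files int) {
	_ = identity
	_ = files
}

func main() {
	cases := []benchCase{
		{"single-file-inode", "native", 1},
		{"multi-file-inode", "native", 100},
		{"multi-file-fingerprint", "fingerprint", 100},
		{"1000-files-fingerprint", "fingerprint", 1000},
		{"10000-files-fingerprint", "fingerprint", 10000},
	}
	for _, c := range cases {
		res := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				runBench(b, c.identity, c.files)
			}
		})
		fmt.Printf("%s: ran %v\n", c.name, res.N > 0)
	}
}
```

Adding a new file count or identity mode becomes a one-line table entry instead of another copied benchmark function.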

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works. Where relevant, I have used the stresstest.sh script to run them under stress conditions and race detector to verify their stability.
  • I have added an entry in ./changelog/fragments using the changelog tool.

How to test this PR locally

```shell
cd filebeat
go test -bench BenchmarkFilestream -benchtime 1x ./input/filestream/...
```

This is an automatic backport of pull request #50137 done by [Mergify](https://mergify.com).

(cherry picked from commit 2977528)
@mergify mergify Bot added the backport label Apr 17, 2026
@mergify mergify Bot requested a review from a team as a code owner April 17, 2026 09:58
@mergify mergify Bot removed the request for review from a team April 17, 2026 09:58
@mergify mergify Bot added the backport label Apr 17, 2026
@mergify mergify Bot requested review from faec and orestisfl April 17, 2026 09:58
@botelastic botelastic Bot added the needs_team Indicates that the issue/PR needs a Team:* label label Apr 17, 2026
@github-actions
Contributor

🤖 GitHub comments

Just comment with:

  • `run docs-build`: Re-trigger the docs validation. (use unformatted text in the comment!)

@github-actions github-actions Bot added bug Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team skip-changelog labels Apr 17, 2026
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@botelastic botelastic Bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Apr 17, 2026
@mergify
Contributor Author

mergify Bot commented Apr 20, 2026

This pull request has not been merged yet. Could you please review and merge it @orestisfl? 🙏

2 participants