Rebatch after filters when using pipeline.ordered#11710
Rebatch after filters when using pipeline.ordered#11710colinsurprenant merged 1 commit intoelastic:masterfrom
Conversation
d46a33d to
09a7059
Compare
|
I performed the same benchmarks as in #11524 (comment) (but on different hardware, but it's not really important since relative numbers are interesting here). Also, I noticed variations of ~5-10% between runs with the same parameters which I attribute to CPU throttling. Config1Java Execution
Ruby Execution
Config2Java Execution
Ruby Execution
|
yaauie
left a comment
There was a problem hiding this comment.
First pass review: looks like we are headed in the right direction. The intermediate vertex is a clever solution that allows us to keep the graph validation logic, which wouldn't be possible if we ended up with two graphs.
I've left a number of comments and nitpicks, more discussion points than hard requests. The refactors you've brought in give us some opportunity to go a small step further in a few places to provide additional clarity and clearer abstractions.
logstash-core/src/main/java/org/logstash/execution/WorkerLoop.java
Outdated
Show resolved
Hide resolved
logstash-core/src/main/java/org/logstash/config/ir/graph/SeparatorVertex.java
Show resolved
Hide resolved
logstash-core/src/main/java/org/logstash/config/ir/graph/SeparatorVertex.java
Outdated
Show resolved
Hide resolved
logstash-core/src/main/java/org/logstash/config/ir/CompiledPipeline.java
Outdated
Show resolved
Hide resolved
logstash-core/src/main/java/org/logstash/execution/WorkerLoop.java
Outdated
Show resolved
Hide resolved
andsel
left a comment
There was a problem hiding this comment.
Nice job @colinsurprenant I've added some nits (probably almost all on personal taste) and a note on a possible recursion (IDEA doesn't show me as it, and I've not tried locally, but I suspect it)
Do you plan to unit ad tests, or the existing give us good confidence?
logstash-core/src/main/java/org/logstash/config/ir/CompiledPipeline.java
Outdated
Show resolved
Hide resolved
logstash-core/src/main/java/org/logstash/config/ir/CompiledPipeline.java
Outdated
Show resolved
Hide resolved
logstash-core/src/main/java/org/logstash/config/ir/CompiledPipeline.java
Outdated
Show resolved
Hide resolved
logstash-core/src/main/java/org/logstash/config/ir/CompiledPipeline.java
Outdated
Show resolved
Hide resolved
logstash-core/src/main/java/org/logstash/config/ir/compiler/DatasetCompiler.java
Outdated
Show resolved
Hide resolved
logstash-core/src/main/java/org/logstash/config/ir/graph/SeparatorVertex.java
Show resolved
Hide resolved
|
@andsel @yaauie I think we are ready for a another round of review - as per my comments
|
yaauie
left a comment
There was a problem hiding this comment.
I think this is looking good, and concentrates the complexity in appropriate places.
I've added a couple nitpicks, praises, and one suggestion to squash a build warning that fails the build.
logstash-core/src/main/java/org/logstash/config/ir/CompiledPipeline.java
Show resolved
Hide resolved
logstash-core/src/main/java/org/logstash/config/ir/CompiledPipeline.java
Show resolved
Hide resolved
logstash-core/src/main/java/org/logstash/config/ir/compiler/DatasetCompiler.java
Show resolved
Hide resolved
logstash-core/src/main/java/org/logstash/execution/WorkerLoop.java
Outdated
Show resolved
Hide resolved
logstash-core/src/main/java/org/logstash/config/ir/CompiledPipeline.java
Outdated
Show resolved
Hide resolved
|
Only a really minor comment, LGTM |
yaauie
left a comment
There was a problem hiding this comment.
LGTM with the deferrals noted above.
1816614 to
ba3ca0f
Compare
Fixes #11550
Relates to #11175
WIP TODOs
filtersExecution+lastFilterlogicDescription
When using
pipeline.orderedthe batch events are sent 1 by 1 down the filters and output. This PR rebatches after the filters so that full batches can be sent to the outputs. It does that by separating the filters and output execution with the introduction of a separator vertex in the pipeline graph between the filters and outputs and compiling both sections separately and in the worker_loop calling both compute separately which provides the possibility to rebatch after the filters.