Conversation
Force-pushed from eaede77 to 906b9dc.
inqueue left a comment:
Thank you for this. I left some minor suggestions.
Co-authored-by: Jason Bryan <jbryan@elastic.co>
Co-authored-by: Brad Deam <54515790+b-deam@users.noreply.github.com>
gbanasiak left a comment:
Many thanks! This helps in building the Rally mental map.
I went through PrepareBenchmark so far. A few remarks, mostly a wish list for later:
- For consistency, `TrackPreparator` should be renamed to `TrackPreparationActor` in the charts, as that's the name of the class, I think.
- It would be great to include a 10,000-foot description of what each component is doing (see the sketch after this list), e.g. 1) `TrackPreparationActor` loads a track and track plugins (custom extensions such as parameter sources and runners), gets a list of track processors (the `TrackProcessor` interface) from `TrackProcessorRegistry` and stores them in a queue. It then sends each processor for execution to `TaskExecutionActor`. 2) `TaskExecutionActor` executes each processor in a background task and checks its status periodically. Once the task completes, it reports its readiness with `WorkComplete`. I'm not sure what level of detail makes sense here: too much is just a repetition of the code, too little forces us to go through the flow again if we revisit this after enough time.
- It would be nice to explain how individual components grow as the scale of the system increases, e.g. the set of `TrackPreparationActor`s grows with the number of load driver hosts, while the set of `TaskExecutionActor`s can grow with the number of cores declared per load driver.
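To make that flow concrete, here is a minimal, hypothetical sketch using Thespian-style actors. Only `TrackPreparationActor`, `TaskExecutionActor`, `WorkComplete` and `TrackProcessorRegistry` are names taken from the comment above; the `ExecuteProcessor` message, the `"prepare"` trigger and the inline processors are invented for illustration, and the processor runs synchronously here instead of in a polled background task as described for Rally.

```python
# Hypothetical sketch, not Rally's actual code: only the class/message names
# TrackPreparationActor, TaskExecutionActor and WorkComplete follow the comment above.
from thespian.actors import Actor, ActorSystem


class WorkComplete:
    """Sent back by TaskExecutionActor once a processor has finished."""


class ExecuteProcessor:
    """Carries one track processor (here just a callable) to the executor."""
    def __init__(self, processor):
        self.processor = processor


class TaskExecutionActor(Actor):
    def receiveMessage(self, msg, sender):
        if isinstance(msg, ExecuteProcessor):
            # Rally runs the processor in a background task and polls its status;
            # this sketch runs it synchronously for brevity.
            msg.processor()
            self.send(sender, WorkComplete())


class TrackPreparationActor(Actor):
    def receiveMessage(self, msg, sender):
        if msg == "prepare":
            # Stand-ins for the processors obtained from TrackProcessorRegistry.
            self.pending = [lambda: print("default processor"),
                            lambda: print("corpus preparation")]
            self.executor = self.createActor(TaskExecutionActor)
            self.send(self.executor, ExecuteProcessor(self.pending.pop(0)))
        elif isinstance(msg, WorkComplete) and self.pending:
            # Feed the next queued processor to the executor.
            self.send(self.executor, ExecuteProcessor(self.pending.pop(0)))


if __name__ == "__main__":
    system = ActorSystem()  # default single-process actor system base
    try:
        system.tell(system.createActor(TrackPreparationActor), "prepare")
    finally:
        system.shutdown()
```

In the real system, one such executor would exist per configured core on each load driver host, which is where the scaling remark in the last bullet comes from.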
docs/architecture/actor_system.md (Outdated)

> At its heart, Rally is a distributed system. It has been designed that way to
> allow using multiple load drivers in the same benchmark, to ensure that Rally
> is never a bottleneck. In the vast majority of cases, using a powerful load
Another reason (the first one, IIRC) is that distributing Rally also provides an easy way to handle multiple target hosts and ensure that they all build/install/start the required ES version via the race subcommand (this is, for example, the mode used on the Hetzner nightly benchmarks).
Later on we got dedicated esrally build/install/start/stop subcommands (which we use e.g. with cloud-based benchmarks), and these are decoupled from the actor system.
The current formulation isn't correct then; however, my understanding is that this is achievable without a full-fledged actor system, while we could easily end up re-implementing Thespian badly to support multiple load drivers.
> however my understanding is that this is achievable without a full-fledged actor system, while we could easily end up re-implementing Thespian badly to support multiple load drivers.
I mostly agree. I don't think supporting more than one load driver without the actor system would necessarily result in re-implementing Thespian, but as always, the devil is in the details.
I fixed this.
dliappis left a comment:
This LGTM -- so great to see these internal aspects of Rally documented.
One thing I wanted to ask is how come this isn't living under the official docs, e.g. in the Reference Documentation or Additional Information sections? (Is it because Mermaid isn't available?)
It should be possible to use Mermaid with Sphinx, but it would require an additional package, see https://github.com/mgaitan/sphinxcontrib-mermaid. That's a work in progress. If/when this matures, we can include this in the official documentation.
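For reference, a minimal sketch of what enabling it could look like, assuming the sphinxcontrib-mermaid package is installed (e.g. `pip install sphinxcontrib-mermaid`); diagrams would then be written with its `.. mermaid::` directive in the reST sources. The snippet below is illustrative, not Rally's actual docs/conf.py:

```python
# docs/conf.py -- hypothetical snippet; the real conf.py lists other extensions too
extensions = [
    # ...existing Sphinx extensions...
    "sphinxcontrib.mermaid",
]
```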
You can see this live at https://github.com/pquentin/rally/blob/document-actor-system/docs/architecture/actor_system.md.