[Telemetry] Improve isTaskManagerReady checks for telemetry#52446
[Telemetry] Improve isTaskManagerReady checks for telemetry#52446chrisronline wants to merge 1 commit intoelastic:masterfrom
Conversation
| async function isTaskManagerReady(server: Server) { | ||
| return (await getLatestTaskState(server)) !== null; | ||
| const result = await getLatestTaskState(server); | ||
| const runs = get(result, '[0].state.runs', 0); |
There was a problem hiding this comment.
You have this same line in a separate file, const runs = get(result, '[0].state.runs', 0);, and it seems specific enough to warrant a helper function with a comment. For example, I'm not entirely sure what it's doing, why, or whether or not it's correct, so a function + comment would be helpful.
isTaskManagerReady and runs being greater than 0 seem like orthogonal things to me.
There was a problem hiding this comment.
That's fair, and honestly, this is an open discussion about a right fix for this.
isTaskManagerReady could be renamed to something like isReadyToCollectTelemetry and that can return what we really care about - if we can now query the task manager and it will return us the telemetry data. We don't want to query the task manager too soon (before the task actually ran), but this idea of too soon is something I just recently observed in testing. Perhaps this is not the root issue?
|
Tests will continue to fail until #52834 is merged |
💔 Build FailedHistory
To update your PR or re-run it, just comment with: |
Related to https://github.com/elastic/elastic-stack-testing/pull/451
In a few plugins, we are using the task manager as a way to asynchronously fetch telemetry data. In order for telemetry collection to know to wait for this asynchronous action, there is an
isReadycheck on each usage collector that returns true/false. All of these plugins use a naive check to see if the task manager is running, but this is apparently not enough.It's a race condition, but there are time where
isReady()is returningtrue, but the task manager hasn't actually ran the task to fetch the telemetry data.I think we can avoid this by not returning
truefromisReady()until the task has actually been ran.As a quick side note, I discovered #52439 while working on this which affects maps telemetry