fix(ai): fix combined orchestrator capacity management #3389
ad-astra-video merged 10 commits into master
core/ai_orchestrator.go
	if !hasCapacity {
		return false, nil
	} else {
nit: since we already return on the line above, we can remove the else and the extra indentation.
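A minimal sketch of what the nit suggests, assuming a simplified function: `checkCapacity` is a hypothetical stand-in, not the actual function in core/ai_orchestrator.go. Once the guard clause returns, the remaining logic can sit at the outer indentation level with no else.

```go
package main

import "fmt"

// checkCapacity is a hypothetical stand-in for the reviewed function.
// The guard clause returns early, so no else branch is needed.
func checkCapacity(hasCapacity bool) (bool, error) {
	if !hasCapacity {
		return false, nil
	}
	// The happy path continues here without extra indentation.
	return true, nil
}

func main() {
	ok, err := checkCapacity(false)
	fmt.Println(ok, err)

	ok, err = checkCapacity(true)
	fmt.Println(ok, err)
}
```

Behavior is unchanged; only the nesting is reduced.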
core/ai_orchestrator.go
	if pipeline == "live-video-to-video" {
		orch.node.ReleaseAICapability(pipeline, modelID)
		close(releaseCapacity)
		return true, nil
	}
I think we can just remove this because we already track capacity with the containers map. @victorges could you check?
We don't have any call to ReserveAICapability for live video, so this call to ReleaseAICapability would cause the Capacity value to always increase and never decrease.
In my PR I was planning to just set the Capacity field based on the number of entries in the containers map for live video.
I reworked the updates in this function to keep the same functionality for live-video-to-video when using a local ai-worker. This allowed removing the short-circuit code noted above.
@ad-astra-video could you check the compile errors? https://github.com/livepeer/go-livepeer/actions/runs/14384181138/job/40335296309?pr=3389
0a9227d to c27a376
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@                 Coverage Diff                 @@
##              master       #3389         +/-   ##
===================================================
+ Coverage   30.83933%   30.88542%   +0.04609%
===================================================
  Files            154         154
  Lines          45977       46012         +35
===================================================
+ Hits           14179       14211         +32
- Misses         30973       30975          +2
- Partials         825         826          +1
... and 2 files with indirect coverage changes
I fixed the issue specific to this PR. Looks like the runners are having other issues now tho.
After re-running a few times all actions are now passing.
What does this pull request do? Explain your changes. (required)
Fixes capacity management for combined Orchestrator/AI workers using external containers.
Specific updates (required)
How did you test each of these updates (required)
Built go-livepeer and sent enough requests to use all capacity, for both combined and separate ai-workers.
Does this pull request close any open issues?
Checklist:
make runs successfully
./test.sh pass