ai/live: Remote signer implementation for tickets #3822
Conversation
@j0sh thanks for the draft pull request. I think this is very cleanly implemented, and I don’t have any direct concerns from my side.
```go
// Forward payment + segment credentials to orchestrator
url := segmentInfo.sess.OrchestratorInfo.Transcoder
```
@j0sh Would it make sense to put this logic in a shared helper since it’s used by both signers, or is it better to wait given that both signers are still in flux?
> shared helper since it’s used by both signers
Maybe I am missing something really obvious, but what is this other signer?
Ah sorry, I meant the signing a gateway does when not using a remote signer. However, I do remember your goal of not interfering with existing code for now, and the code change is minimal. Just something I was wondering about.
Oh yes I see now.
There is additional structural similarity here: the non-remote livePaymentSender can probably be split into two pieces, similar to how the remotePaymentSender API is split between RequestPayment (to retrieve / generate the tickets) and SendPayment (to actually send the tickets). This would clean up a bit of lingering cruft around job initialization (see the PR description).
When we do this, we can also move to using a shared implementation to send HTTP payments. But yes let's hold that thought until after these changes are merged, so we can verify one thing at a time.
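To make the split described above concrete, here is a minimal sketch of what separating ticket retrieval from ticket sending could look like. All type and interface names below are hypothetical, not the real go-livepeer types; only the `RequestPayment`/`SendPayment` method split mirrors the remotePaymentSender API mentioned in this thread.

```go
package main

import "fmt"

// signedTicket is a stand-in for the real PM ticket type.
type signedTicket struct {
	Recipient string
	Nonce     uint32
}

// paymentRequester retrieves or generates tickets without sending them.
type paymentRequester interface {
	RequestPayment(sessionID string) ([]signedTicket, error)
}

// paymentSender actually transmits previously generated tickets.
type paymentSender interface {
	SendPayment(tickets []signedTicket, orchURL string) error
}

// localSigner is a toy implementation showing how the two halves compose;
// the non-remote livePaymentSender could be split along the same seam.
type localSigner struct{ nonce uint32 }

func (s *localSigner) RequestPayment(sessionID string) ([]signedTicket, error) {
	s.nonce++
	return []signedTicket{{Recipient: sessionID, Nonce: s.nonce}}, nil
}

func (s *localSigner) SendPayment(tickets []signedTicket, orchURL string) error {
	fmt.Printf("sending %d ticket(s) to %s\n", len(tickets), orchURL)
	return nil
}

// Compile-time checks that localSigner satisfies both halves.
var (
	_ paymentRequester = (*localSigner)(nil)
	_ paymentSender    = (*localSigner)(nil)
)

func main() {
	s := &localSigner{}
	tickets, _ := s.RequestPayment("sess-1")
	_ = s.SendPayment(tickets, "https://orch.example:8935")
}
```

The design point is simply that generation and transmission become independently reusable, which is what would allow a shared HTTP payment-sending implementation later.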
```go
ManifestID string

// Number of pixels to generate a ticket for. Required if `type` is not set.
InPixels int64 `json:"inPixels"`
```
@j0sh should we move away from pixel-based pricing to a more general pricing model that also supports BYOC's per-second compute, or wait until the payment clearinghouse is validated? I see you already added a type field below to extend this to other job types later.
For what it's worth, live-video-to-video also uses per-second payments, with a fixed size "base unit" of 720p@30fps. Avoiding any changes to the current live-video-to-video pricing model was an explicit goal here, so I would not change anything about the pricing model right now.
Different job types can be supported later on, either by adding a new field to this struct or setting the type field to a specific value.
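As a small illustration of the per-second model mentioned above: with a fixed "base unit" of 720p@30fps, one second of video corresponds to a fixed pixel count, so per-second payments can still be expressed in pixels. The constants below are just the standard 720p dimensions; this is a sketch, not the actual go-livepeer code.

```go
package main

import "fmt"

// One base unit: 720p at 30 fps, expressed as pixels per second.
const (
	baseWidth  = 1280
	baseHeight = 720
	baseFPS    = 30
)

// pixelsPerSecond returns the pixel count of one base unit of video.
func pixelsPerSecond() int64 {
	return int64(baseWidth * baseHeight * baseFPS)
}

func main() {
	// A 10-second payment interval covers 10 base units of pixels.
	fmt.Println(pixelsPerSecond() * 10)
}
```

This is why the existing pixel-based `InPixels` field can carry per-second pricing unchanged: seconds convert to pixels at a fixed rate.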
```go
	return
}

// Generate segment credentials with an empty segment
```
@j0sh should we move this down to where it is used?
server/remote_signer.go
```go
// Compute required fee
fee := calculateFee(pixels, priceInfo)
```
@j0sh similar to the comment above. I think it would be nice to make this function more general and turn it into a shared helper by changing pixels to units, so we can share logic between BYOC and live-video-to-video.
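A rough sketch of what the generalized helper could look like, assuming "units" means pixels for live-video-to-video and seconds of compute for BYOC. The `PriceInfo` field names here are illustrative, not the real go-livepeer struct.

```go
package main

import (
	"fmt"
	"math/big"
)

// PriceInfo is a hypothetical stand-in for the pricing struct:
// a wei price quoted per some number of units.
type PriceInfo struct {
	PricePerUnit  int64 // wei charged per pricing unit
	UnitsPerPrice int64 // units covered by one PricePerUnit (e.g. pixels per unit)
}

// calculateFee computes units * price / unitsPerPrice as an exact rational,
// generalized over "units" so the same helper serves both job types.
func calculateFee(units int64, price PriceInfo) *big.Rat {
	num := new(big.Int).Mul(big.NewInt(units), big.NewInt(price.PricePerUnit))
	return new(big.Rat).SetFrac(num, big.NewInt(price.UnitsPerPrice))
}

func main() {
	// e.g. 1000 units at 3 wei per 10 units -> 300 wei
	fmt.Println(calculateFee(1000, PriceInfo{PricePerUnit: 3, UnitsPerPrice: 10}).RatString())
}
```

Using `big.Rat` avoids intermediate rounding; callers can round up to whole wei at the end.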
@j0sh, I reviewed this pull request and left a few minor comments. Similar to #3791, the overall approach and logic here make sense to me. With that in mind, I think this can be merged once you and @ad-astra-video are comfortable that the implementation leaves room for future extension to BYOC use cases, along with adding tests and completing E2E validation.

If we set aside BYOC batch legacy payments for now, both systems already rely on streaming-based payments, although BYOC streaming and Live Video-to-Video currently use separate payment endpoints. Based on a quick review (and please correct me if I’m mistaken, @ad-astra-video, @j0sh), the remaining differences seem to be:
Given this, my current recommendation would be to gradually converge the streaming paths by:
Although it’s still early and we haven’t had many discussions yet, I do like the stateless remote signer approach taken in this pull request. Keeping the signer lean and predictable, without persistent state or coordination requirements, feels like the right tradeoff for now. It reduces operational complexity, avoids shared databases, and preserves flexibility for gateways and orchestrators. This also aligns well with future verifiable compute work and potential dispute mechanisms, while keeping the orchestrator as the final enforcer. I think this recommendation is similar to my recommendation on the BYOC streaming updates.
**rickstaa** left a comment
Similar to #3791, I’ve approved the technical implementation since I don’t see any major concerns outside the comments listed above. This should allow us to start testing with the INC infrastructure first. You can coordinate with @ad-astra-video on follow-up PRs to support BYOC, add tests, and perform end-to-end verification.
This PR is the first part of the remote signing feature, implementing the GetOrchestratorInfo request. See the design background for additional motivation and design detail around remote signers.

Remote signing support for Live AI (live-video-to-video) consists of two parts:

- GetOrchestratorInfo (this PR)
- PM ticket retrieval (ai/live: Remote signer implementation for tickets #3822)

This PR adds a new mode to go-livepeer: `-remoteSigner`. The remote signer exposes an HTTP POST endpoint at `/sign-orchestrator-info`. It currently does only one thing: produce a signature for use in the OrchestratorInfo gRPC call. The response includes the signer's Ethereum address as well as the signature itself. Like the other mode flags (gateway, orchestrator, redeemer, etc.), the remote signer cannot be combined with other modes.

The gateway adds a new `-remoteSignerUrl` flag which specifies the base address of the remote signer to use (host:port). When configured, the gateway pre-fetches the remote signature for OrchestratorInfo at start-up time and caches it, so subsequent calls to OrchestratorInfo do not incur additional remote calls.

One might note that this OrchestratorInfo signing scheme doesn't actually accomplish much at all: the signature is effectively static, not scoped to any one orchestrator, and never expires. This is a legacy aspect of the OrchestratorInfo flow that we have to accommodate for now; it is a vestigial corner of the codebase, but one that we've judged internally to be harmless.
Codecov Report

❌ Patch coverage is
Additional details and impacted files

```
@@             Coverage Diff             @@
##            master       #3822        +/-  ##
===================================================
+ Coverage   32.19428%   32.46227%   +0.26799%
===================================================
  Files           170         170
  Lines         41321       41673        +352
===================================================
+ Hits          13303       13528        +225
- Misses        27016       27124        +108
- Partials       1002        1021         +19
```

... and 3 files with indirect coverage changes. Continue to review the full report in Codecov by Sentry.
This makes the orchestrator return the expected pricing with updated OrchestratorInfo after payment requests.
If nil caps are passed in to the remote signer, the orch does not return a PaymentResult-wrapped OrchestratorInfo after payment for the remote signer to use. (This is OK; the same OrchestratorInfo can be reused until it expires.) Since the payment response is effectively optional, let's not mandate that it be wrapped in a PaymentResult. This also makes it somewhat easier for clients to use the OrchestratorInfo data directly from GetOrchestrator without additional protobuf shenanigans.
Enable remote signers to check orchestrator prices against configured max price limits (-maxPricePerUnit and -maxPricePerCapability flags). This prevents remote signers from generating payments to orchestrators that exceed acceptable pricing thresholds.

Changes:
- Add HTTPStatusPriceExceeded (481) response code for price rejections
- Validate orchestrator price in GenerateLivePayment() after session setup
- Add test coverage for price validation and rejection scenarios

The price check supports capability-specific pricing and returns HTTP 481 when an orchestrator's price exceeds the configured maximum, allowing the gateway to select a different orchestrator.
Usually there are constraints in place alongside the capabilities in order to set a per-capability price. However, if no constraints are set, there are no per-capability prices, so return the global max price instead. This hasn't come up in prod (and fixes a crash) so shouldn't be a breaking change.
This aligns closer to how the current lv2v code works, which captures the price info at the beginning of the session and reuses it for subsequent payments.
…a/remote-signer-tickets
A couple of small behavioral changes since the original draft PR:
TODO
This PR completes the remote signing feature, allowing gateways to retrieve PM tickets for Live AI (live-video-to-video) without requiring any on-chain connectivity or possession of an Ethereum signing key. See the design background for additional motivation and design detail around remote signers. Refer to #3791 for instructions on how to enable this feature.
Retrieving tickets is mostly done via implementing the LivePaymentSender interface with a new implementation: `remotePaymentSender` in `live_payment.go`. The LivePaymentSender implementation (signer or non-signer) is also initialized earlier in the process, before an orchestrator is requested, and stored in the LiveParams struct. This is so the gateway can send an upfront payment to the orchestrator using remote signers. Processing remote payment signing requests happens in the `remote_signer.go` file.

When a job first starts, the gateway sends an upfront payment to the orchestrator encoded in the initial request header. To support this, the API for the `remotePaymentSender` also offers a standalone `RequestPayment` method to retrieve signed tickets without sending them. The non-remote signer does not have a clean, singular method to retrieve tickets; at some point we may codify this behind a proper interface and clean up this bit, but that can come later to avoid introducing additional concepts to an already involved PR.

### Remote Signing Protocol
Refer to the design document for context behind the design of the protocol. Here is some more detail on that:
```mermaid
sequenceDiagram
    participant O as Orchestrator
    participant G as Gateway
    participant S as Signer
    %% Initial session setup
    G->>S: getOrchInfoSig()
    S-->>G: gatewaySig
    G->>O: getOrchInfo(gatewaySig)
    O-->>G: ticketParams₀
    %% First signing call (no prior signer state)
    Note over S: state is null → create fresh signer state
    G->>S: signTicket(state=null, ticketParams₀)
    S-->>G: signedTicket₀, signerState₀
    G->>O: pay(signedTicket₀)
    O-->>G: ticketParams₁
    G->>S: signTicket(signerState₀, ticketParams₁)
    S-->>G: signedTicket₁, signerState₁
    %% Subsequent calls (k = 1..N)
    loop For each k = 1..N
        Note over S: NB: ticketParamsₖ reusable between<br>calls as long as it is valid but not<br>signedTicketₖ or signerStateₖ
        G->>S: signTicket(signerStateₖ₋₁, ticketParamsₖ)
        S-->>G: signedTicketₖ, signerStateₖ
        G->>O: pay(signedTicketₖ)
        O-->>G: ticketParamsₖ₊₁
    end
```

### PM Changes
All the changes here are used only by the remote signer, so the impact on the existing code is minimal.
The `Sender` interface adds two new methods: a `StartSessionWithNonce` constructor and a `Nonce` accessor. The nonce is a (mostly internal) PM construct that allows multiple tickets to be generated using the same set of PM parameters. For ordinary signers, the `Sender` persists for the duration of the session, so the nonce would stay internal and be incremented as necessary. However, since remote signers are stateless, the nonce needs to be extracted and set with each signing call, and that is what we do here.
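The round-tripped nonce can be sketched like this. The `signerState` shape and `signTicket` signature are hypothetical simplifications of the flow described above, where the gateway carries the state the stateless signer hands back on every call.

```go
package main

import "fmt"

// signerState is an assumed wire shape: whatever the remote signer needs
// returned to it on the next call, since it keeps no state of its own.
type signerState struct {
	SessionID string
	Nonce     uint32
}

// signTicket sketches one stateless signing call: restore a session from
// the incoming state (or start fresh when state is nil), sign, and hand
// the advanced nonce back to the gateway.
func signTicket(state *signerState, ticketParams string) (string, signerState) {
	if state == nil {
		// No prior state: create a fresh signer session,
		// cf. the StartSessionWithNonce constructor.
		state = &signerState{SessionID: "sess-0", Nonce: 0}
	}
	// Placeholder "signature" showing which inputs bind the ticket.
	sig := fmt.Sprintf("sig(%s,%s,%d)", state.SessionID, ticketParams, state.Nonce)
	return sig, signerState{SessionID: state.SessionID, Nonce: state.Nonce + 1}
}

func main() {
	sig0, st0 := signTicket(nil, "params0")
	sig1, st1 := signTicket(&st0, "params1")
	fmt.Println(sig0, st0.Nonce, sig1, st1.Nonce)
}
```

This mirrors the sequence diagram above: each `signTicket` call consumes the previous `signerState` and emits the next one, while `ticketParams` can be reused as long as it remains valid.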
The `Balance` struct has a new `Reserve()` method to zero out the current balance. This addition makes the `Balance` more closely mirror the API of the nested `AddressBalances()` list. (Otherwise I would have chosen a better name than "Reserve" for zeroing out a balance.)

Note that the `Balances` object itself hides quite a bit of nested global accounting that we don't strictly need here, and it would be much neater to avoid it in favor of strictly request-local accounting. However, that would make the rest of the implementation more complex, since the `BroadcastSession` works on the global `Balances` and most of the payment helper functions themselves take a `BroadcastSession`... so here we go.

There is also a small change in starter.go to initialize more PM and Ethereum scaffolding (watchers etc.) when the node starts up in remote signer mode.