feat(sdk): external workflow handle#1187
Conversation
91eb358 to
f9a3379
Compare
0ad8c27 to
5a05ba7
Compare
|
@claude review |
| /// Get a handle to an external workflow for sending signals or requesting cancellation. | ||
| pub fn external_workflow( | ||
| &self, | ||
| opts: impl Into<SignalWorkflowOptions>, | ||
| ) -> impl CancellableFuture<SignalExternalWfResult> { | ||
| let options: SignalWorkflowOptions = opts.into(); | ||
| let target = sig_we::Target::WorkflowExecution(NamespacedWorkflowExecution { | ||
| workflow_id: impl Into<String>, | ||
| run_id: Option<String>, | ||
| ) -> ExternalWorkflowHandle { | ||
| ExternalWorkflowHandle { | ||
| workflow_id: workflow_id.into(), | ||
| run_id, | ||
| namespace: self.base.inner.namespace.clone(), | ||
| workflow_id: options.workflow_id, | ||
| run_id: options.run_id.unwrap_or_default(), | ||
| }); | ||
| self.base.clone().send_signal_wf(target, options.signal) | ||
| base_ctx: self.base.clone(), | ||
| } | ||
| } |
There was a problem hiding this comment.
🔴 The new external_workflow() API silently removes cross-namespace cancellation support that existed in the old cancel_external() method -- callers have no way to target a workflow in a different namespace. Additionally, the sends_cancel_canned test was written to validate cross-namespace behavior (namespace: "some_namespace") but now silently tests same-namespace behavior because its assertions only check CommandType and never validate the namespace field in command attributes, so the regression goes completely undetected.
Extended reasoning...
Root cause -- no namespace parameter on ExternalWorkflowHandle:
The old cancel_external() method on SyncWorkflowContext accepted a NamespacedWorkflowExecution struct with an explicit namespace field, allowing workflows to cancel targets in other namespaces. The replacement external_workflow() method (crates/sdk/src/workflow_context.rs lines 812-824) hardcodes namespace: self.base.inner.namespace.clone() when constructing ExternalWorkflowHandle, and ExternalWorkflowHandle::cancel() propagates that hardcoded namespace into the RequestCancelExternalWorkflowExecution command. There is no constructor, builder method, or parameter to override the namespace.
Why the test does not catch this:
The sends_cancel_canned test in cancel_external.rs was explicitly designed to exercise cross-namespace cancellation. Its canned history is built with namespace: "some_namespace" (a deliberately different namespace). But the new CancelSenderCanned workflow calls ctx.external_workflow("fake_wid", Some("fake_rid".into())), which sends the cancel command with the workflow's own namespace rather than "some_namespace". The test still passes because its assertion only checks CommandType::RequestCancelExternalWorkflowExecution -- it never inspects the workflow_execution field of the command attributes to verify the namespace. Mock matching is by sequence number, not command content, so the mismatch is invisible.
Why this is an unintentional regression:
The PR description explicitly documents one intentional removal: header support on signals. It says nothing about removing cross-namespace cancellation support. Cross-namespace signaling was never supported (old signal_workflow also hardcoded the workflow's own namespace), but cross-namespace cancellation WAS supported and actively tested -- making this a silent behavioral regression, not an intentional design decision. One verifier refutation argued that the test issue is merely a duplicate symptom of the API issue; while both share the same root cause, the test validity issue stands independently: even if the API regression is accepted as intentional, the canned history with namespace: "some_namespace" is now misleading and should be updated regardless.
Impact:
Any workflow that previously called cancel_external() targeting a NamespacedWorkflowExecution with a different namespace will silently send the cancel to the wrong namespace after migrating to external_workflow(). The cancel command will be emitted without error, but the target workflow in the other namespace will not receive it. This is a silent correctness issue for multi-namespace Temporal deployments.
Step-by-step proof:
- Workflow A runs in namespace
"default"and previously called:
ctx.cancel_external(NamespacedWorkflowExecution { namespace: "other_ns", workflow_id: "target", run_id: "" }, "reason".to_string()) - After migration to ExternalWorkflowHandle:
let h = ctx.external_workflow("target", None); h.cancel(Some("reason".into())).await; external_workflow()setsExternalWorkflowHandle { namespace: "default", ... }(own namespace)cancel()emitsRequestCancelExternalWorkflowExecutionwithnamespace: "default", workflow_id: "target"- The server receives a cancel for
"target"in"default", not"other_ns"-- the cancel is silently misdirected and the target workflow is unaffected.
How to fix:
If cross-namespace cancellation is intentionally removed (to match other SDKs), the PR description should document this explicitly and the sends_cancel_canned canned history should be updated to use the workflow's own namespace, with assertions that verify namespace content in command attributes so future regressions are caught. If cross-namespace targeting should be preserved, external_workflow() needs an optional namespace parameter defaulting to the current workflow's namespace, and the test should assert the namespace field in command attributes.
There was a problem hiding this comment.
This feature has been disabled in Temporal Cloud and is disabled by default in OSS (PR). Since this feature is being deprecated/disabled going forward, it was removed, but unsure if correct.
What was changed
Adds a
ExternalWorkflowHandlethat can be used to send signals or cancel external workflows from the workflow context.This handle and it's methods replace
cancel_externalandsignal_workflow.Why?
This matches other SDKs where one first obtains an external workflow handle. The new
ExternalWorkflowHandle::signalmethod also now matches how one sends signals to a workflow from a client.Note this PR removes the ability to set headers on signals from the workflow context. This matches other SDKs. This functionality will be handled by interceptors once those are supported.
Checklist
Closes N/A
How was this tested:
Added a few tests, but mostly updated existing ones to use the new external workflow handle instead of the previous methods.
Any docs updates needed?
N/A