Skip to content

refactor(ops): relay UPDATE streaming task-per-tx (slice C) — #1454 phase 5#3937

Merged
iduartgomez merged 3 commits into
mainfrom
phase-5-update-slice-c
Apr 22, 2026
Merged

refactor(ops): relay UPDATE streaming task-per-tx (slice C) — #1454 phase 5#3937
iduartgomez merged 3 commits into
mainfrom
phase-5-update-slice-c

Conversation

@iduartgomez
Copy link
Copy Markdown
Collaborator

Problem

Issue #1454 Phase 5 migrates relay operations to the task-per-transaction
driver pattern. Slice A (PR #3910) migrated non-streaming UPDATE relays
(UpdateMsg::RequestUpdate, UpdateMsg::BroadcastTo). Streaming
variants (RequestUpdateStreaming, BroadcastToStreaming) and the
deprecated Broadcasting wire variant were left on the legacy
state-machine path — meaning the last DashMap-resident relay UPDATE
amplifier surface (pre-registered UpdateOp + load_or_init/push)
remained uncovered by the slice A dedup gate.

Solution

Add task-per-transaction drivers for both streaming variants in
operations/update/op_ctx_task.rs:

  • start_relay_request_update_streaming — claims inbound stream via
    orphan_stream_registry().claim_or_wait, assembles, deserializes
    UpdateStreamingPayload, applies locally via update_contract,
    emits update_success telemetry. Fire-and-forget: propagation is
    via BroadcastStateChange after local apply.
  • start_relay_broadcast_to_streaming — same stream-claim + assemble
    flow, plus pre-merge dedup cache check (amplifier mitigation), WASM
    merge, update_broadcast_applied telemetry, proactive summary
    spawn on success, and the legacy failure classifier
    (log_broadcast_to_streaming_failure + conditional
    try_auto_fetch_contract / send_summary_back_on_rejection).

Key simplification vs PUT slice B: UPDATE streaming relays are
fire-and-forget. No downstream pipe_stream, no upstream reply, no
request-response bookkeeping.

Dispatch gate in node.rs routes both variants under the same
source_addr.is_some() AND !has_update_op(id) gate as slice A.
Deprecated UpdateMsg::Broadcasting stays on the legacy no-op
handler; pre-registered UpdateOps (GC / originator-loopback) still
fall through to legacy.

Shared counter RELAY_UPDATE_DEDUP_REJECTS and shared tx set
active_relay_update_txs ensure dedup is unified across slice A +
slice C; separate RELAY_UPDATE_STREAMING_* counters give operators
independent visibility during the ramp.

Testing

  • 50 UPDATE unit tests pass (including 9 new slice C structural pin
    tests + 3 updated node.rs dispatch pin tests).
  • Full cargo test -p freenet --lib passes (2462 passed, 16
    ignored).
  • cargo clippy -p freenet --all-targets -- -D warnings clean.
  • cargo fmt clean.

Structural pin tests cover: driver existence, dedup-gate usage (both
drivers increment RELAY_UPDATE_DEDUP_REJECTS), orphan-stream claim
via claim_or_wait, fire-and-forget guarantee (no send_and_await,
no pipe_stream), pre-merge dedup ordering (BroadcastToStreaming
calls broadcast_dedup_cache.check_and_insert before
update_contract), failure classification, proactive summary spawn,
and RAII guard cleanup of streaming-specific counters
(RELAY_UPDATE_STREAMING_INFLIGHT,
RELAY_UPDATE_STREAMING_COMPLETED_TOTAL).

Linux-only multi-loopback streaming end-to-end tests remain covered
by existing /freenet:linux-test infrastructure.

Fixes

Partial progress on #1454 Phase 5. After this merges, only the
deprecated UpdateMsg::Broadcasting wire variant + legacy streaming
arms in update.rs:871-1366 remain — Phase 6 wire cleanup will
delete both.

…hase 5

Add task-per-transaction drivers for `UpdateMsg::RequestUpdateStreaming`
and `UpdateMsg::BroadcastToStreaming` in
`operations/update/op_ctx_task.rs` and extend the UPDATE dispatch gate
in `node.rs` to route fresh inbound streaming relays through them.

Why: streaming UPDATE relays stayed on the legacy state-machine path
after slice A. Bringing them onto task-per-tx restores parity with the
slice A dedup gate, eliminates the last DashMap-resident relay UPDATE
path, and closes the amplifier surface that the non-streaming drivers
already avoid.

Scope:

- New drivers + RAII guard + streaming-specific counters
  (`RELAY_UPDATE_STREAMING_*`) separate from slice A for observability
  ramp, but sharing `active_relay_update_txs` (unified tx dedup) and
  `RELAY_UPDATE_DEDUP_REJECTS` (unified dedup counter).
- Both drivers are fire-and-forget: claim stream via
  `orphan_stream_registry().claim_or_wait`, assemble, deserialize
  payload, apply via `update_contract`, rely on `BroadcastStateChange`
  for propagation. No `pipe_stream`, no `send_and_await`.
- `BroadcastToStreaming` driver preserves the legacy failure
  classification (`log_broadcast_to_streaming_failure` + conditional
  `try_auto_fetch_contract` / `send_summary_back_on_rejection`), the
  pre-merge dedup cache check, and the proactive summary notification
  on successful state change.
- `log_broadcast_to_streaming_failure` promoted from private to
  `pub(crate)` so the driver can reuse the shared classifier.
- Dispatch gate in `node.rs` routes `RequestUpdateStreaming` and
  `BroadcastToStreaming` to the new drivers under the same
  `source_addr.is_some()` AND `!has_update_op(id)` gate as slice A.
- Deprecated `UpdateMsg::Broadcasting` stays on the legacy no-op
  handler; pre-registered `UpdateOp`s (GC / originator-loopback) still
  fall through to legacy.

Structural pin tests cover: driver existence, dedup-gate usage,
orphan-stream claim, fire-and-forget guarantee, pre-merge dedup
ordering, failure classification, proactive summary spawn, and RAII
guard cleanup of streaming-specific counters. Existing slice A pin
test `slice_a_drivers_do_not_touch_streaming_variants` re-scoped to
the slice A fns only; the legacy-path-only dispatch pin test is
replaced with `update_branch_dispatches_streaming_drivers_but_not_broadcasting`.

Refs #1454 phase 5 follow-up slice C.
Update operations.md and crates/core/CLAUDE.md to reflect that streaming
UPDATE relay variants now run on the task-per-tx path alongside PUT
streaming. Only the deprecated `UpdateMsg::Broadcasting` wire variant
remains on the legacy no-op handler (scheduled for Phase 6 wire
cleanup).
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 22, 2026

Rule Review: No issues found

Rules checked: git-workflow.md, code-style.md, testing.md, operations.md
Files reviewed: 5 (operations.md, crates/core/CLAUDE.md, crates/core/src/node.rs, crates/core/src/operations/update.rs, crates/core/src/operations/update/op_ctx_task.rs)

Warnings

None.

Info

  • crates/core/src/operations/update/op_ctx_task.rs (test slice_c_drivers_use_tx_dedup_gate) — the dedup gate check scans a hard-coded 1500-byte window from each entry-point function's start. If the preamble (dedup-reject log block) grows past that bound, the assert would pass while scanning dead zone. Low practical risk since both preambles are under 700 bytes today, but a start..dedup_check_pos approach similar to the ordering test would be more robust. (rule: code-style.md — test correctness)

Checklist against applicable rules:

  • No .unwrap() in production code — new production paths use match/? throughout. ✓
  • Fire-and-forget spawns — all GlobalExecutor::spawn calls are for short-lived per-tx work (relay drivers, send_summary_back_on_rejection, send_proactive_summary_notification), matching the established slice A/B pattern. Not lifetime tasks; BackgroundTaskMonitor not required. ✓
  • Push-before-send (task-per-tx flavour) — drivers are fire-and-forget and never call send_and_await; the slice_c_drivers_are_fire_and_forget pin test confirms this. ✓
  • Dedup gate — both dispatch arms in node.rs are guarded by !op_manager.has_update_op(id), pinned by update_streaming_arms_gate_on_has_update_op. ✓
  • Stale documentationoperations.md and crates/core/CLAUDE.md are both updated in-PR to reflect slice C's migration. ✓
  • Regression test (refactor: PR) — not a fix: PR; structural pin test requirement does not apply. ✓
  • No new biased; select, no retry loops, no unbounded channels, no Instant::now() — none introduced. ✓

Rule review against .claude/rules/. WARNING findings block merge.

@iduartgomez iduartgomez enabled auto-merge April 22, 2026 14:55
@iduartgomez iduartgomez added this pull request to the merge queue Apr 22, 2026
@iduartgomez iduartgomez removed this pull request from the merge queue due to a manual request Apr 22, 2026
Follow-up on PR #3937 review:

1. Pin: streaming dispatch arms (`RequestUpdateStreaming`,
   `BroadcastToStreaming`) must be gated on `!has_update_op(id)`.
   Guards the negative case that was only covered by positive pins
   previously — prevents slice C drivers from stealing pre-registered
   UpdateOps away from the legacy state machine (GC /
   originator-loopback paths still need `load_or_init`/`push`).

2. Document why tx dedup is held across `claim_or_wait`'s timeout
   window. Attacker with bogus `stream_id` can park one tx entry per
   bogus tx for up to `STREAM_CLAIM_TIMEOUT` (60s). Releasing earlier
   would allow duplicate metadata messages to spawn parallel drivers
   on the same tx, amplifying work and telemetry. Bounded, per-tx,
   and visible via `RELAY_UPDATE_STREAMING_INFLIGHT`.

3. Pin: `log_broadcast_to_streaming_failure` must remain `pub(crate)`.
   Slice C driver reuses this classifier for
   `try_auto_fetch_contract` / `send_summary_back_on_rejection`
   branching; re-privatising (e.g. during Phase 6 cleanup) would
   break the streaming broadcast failure path silently.

No behavioural change.
@iduartgomez iduartgomez added this pull request to the merge queue Apr 22, 2026
Merged via the queue into main with commit 61b60e3 Apr 22, 2026
12 checks passed
@iduartgomez iduartgomez deleted the phase-5-update-slice-c branch April 22, 2026 16:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant