fix: capture context slot consistently in getSignatureStatuses RPC #406

Merged
MicaiahReid merged 1 commit into solana-foundation:main from serejke:fix-context-of-get-signature-statuses
Nov 13, 2025
@serejke
Contributor

@serejke serejke commented Nov 12, 2025

The getSignatureStatuses RPC endpoint had a race condition that caused
inconsistent API responses when querying transaction statuses. The context
slot was captured after each signature lookup in a loop, rather than once
at the beginning of the method call.

## Problem Discovery

This bug was discovered during high-throughput transaction processing, while:
- Sending many transactions to the network at once
- Calling getSignatureStatuses from a background confirmation process
- Observing that some responses returned null with a high context slot
- Finding afterwards that the transaction had been included in an earlier slot

This created an inconsistency where the API would report "transaction not
found" with context slot X, but the transaction actually existed in slot Y
where Y < X. This violates the semantic contract that if a transaction is
not found at slot X, it should not exist in any slot <= X.
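
The symptom above can be written down as a small check. This is only a sketch: the `Observation` struct and `shows_symptom` function are hypothetical names introduced here for illustration, not part of the codebase. It flags exactly the case described in the report, a "not found" response whose context slot X is higher than the slot Y in which the transaction actually landed.

```rust
/// Hypothetical record of one getSignatureStatuses result for one signature,
/// combined with what was later learned about the transaction.
struct Observation {
    /// Context slot X returned alongside the response.
    context_slot: u64,
    /// Whether the response contained a status (true) or null (false).
    reported_found: bool,
    /// Slot Y in which the transaction was later found to have landed, if any.
    landed_slot: Option<u64>,
}

/// Returns true for the symptom described above: the response said
/// "not found" at slot X, yet the transaction landed in a slot Y < X.
fn shows_symptom(obs: &Observation) -> bool {
    match obs.landed_slot {
        Some(landed) => !obs.reported_found && landed < obs.context_slot,
        None => false,
    }
}
```

A background confirmation process could run a check like this over its logged responses to detect the inconsistency after the fact.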

## Root Cause

The bug was in the implementation of get_signature_statuses() at
crates/core/src/rpc/full.rs:1468-1486. The method captured the context
slot inside the signature lookup loop:

```rust
// Starts at 0 and is only updated inside the loop.
let mut last_latest_absolute_slot = 0;
for signature in signatures.into_iter() {
    let res = svm_locker.get_transaction(...).await?;
    // Read after the lookup: the slot may have advanced during the .await above.
    last_latest_absolute_slot = svm_locker.get_latest_absolute_slot();
    responses.push(res.map_some_transaction_status());
}
```

The RwLock on SurfnetSvm is held only during individual operations, not
across the entire RPC call. During the .await points in the async loop,
other tasks can acquire the write lock and advance the slot (e.g., by
confirming blocks or processing new transactions).

This created a race condition where:
1. Look up the transaction at slot 100 → not found (perhaps due to timing)
2. The async await point allows other tasks to run
3. Another task acquires the write lock and advances the slot to 101
4. get_latest_absolute_slot() returns 101
5. The response reports "not found" with context slot 101
6. But the transaction actually exists in slot 100
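
The six steps can be replayed deterministically without any async runtime. The sketch below uses a hypothetical `MockSvm` stand-in, not the real SurfnetSvm, and assumes one possible interleaving: while the RPC task is suspended, a writer lands the transaction in the current slot and then advances the slot.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the shared SVM state (not the real SurfnetSvm).
struct MockSvm {
    latest_slot: u64,
    /// Signature -> slot in which the transaction landed.
    landed: HashMap<String, u64>,
}

impl MockSvm {
    /// Returns the slot the transaction landed in, or None if unknown.
    fn get_transaction(&self, sig: &str) -> Option<u64> {
        self.landed.get(sig).copied()
    }

    /// Assumed behaviour of a concurrent writer: land the transaction in
    /// the current slot, then advance to the next slot.
    fn land_and_advance(&mut self, sig: &str) {
        self.landed.insert(sig.to_string(), self.latest_slot);
        self.latest_slot += 1;
    }
}

/// Replays the old ordering by hand and returns (status, context slot).
fn replay_old_ordering(svm: &mut MockSvm, sig: &str) -> (Option<u64>, u64) {
    let status = svm.get_transaction(sig); // step 1: not found
    svm.land_and_advance(sig); // steps 2-3: another task runs at the await point
    let context_slot = svm.latest_slot; // step 4: slot is read too late
    (status, context_slot) // step 5: "not found" paired with the newer slot
}
```

Starting from slot 100, the replay reports "not found" with context slot 101, while a later lookup shows the transaction in slot 100, which is step 6.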

## Solution

Capture the context slot once at the beginning of the method, before any
signature lookups. This ensures all responses in a batch share the same
snapshot view of the blockchain state, even if the slot advances during
processing:

```rust
// Read the slot once, before any lookup can yield to other tasks.
let context_slot = svm_locker.get_latest_absolute_slot();
for signature in signatures.into_iter() {
    let res = svm_locker.get_transaction(...).await?;
    responses.push(res.map_some_transaction_status());
}
```
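
For readers who want to experiment with the ordering outside the codebase, here is a self-contained, synchronous sketch. `MockSvm`, `MockLocker`, and the `between_lookups` hook are hypothetical names introduced for illustration; the hook stands in for whatever other tasks might do at an await point. Only the method names `get_latest_absolute_slot` and `get_transaction` mirror the snippet above, and their signatures here are assumptions.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical stand-in for the SVM state.
struct MockSvm {
    latest_slot: u64,
    landed: HashMap<String, u64>,
}

/// Hypothetical locker: the lock is held per operation, not per RPC call.
struct MockLocker(RwLock<MockSvm>);

impl MockLocker {
    fn get_latest_absolute_slot(&self) -> u64 {
        self.0.read().unwrap().latest_slot
    }

    fn get_transaction(&self, sig: &str) -> Option<u64> {
        self.0.read().unwrap().landed.get(sig).copied()
    }
}

/// Fixed ordering: the context slot is read once, before any lookup.
/// `between_lookups` simulates other tasks running between lookups.
fn get_signature_statuses(
    locker: &MockLocker,
    signatures: &[&str],
    mut between_lookups: impl FnMut(),
) -> (u64, Vec<Option<u64>>) {
    let context_slot = locker.get_latest_absolute_slot();
    let mut responses = Vec::with_capacity(signatures.len());
    for sig in signatures {
        responses.push(locker.get_transaction(sig));
        between_lookups();
    }
    (context_slot, responses)
}
```

Passing a hook that takes the write lock and bumps `latest_slot` shows that the returned context slot stays at its initial value no matter how far the slot advances during the batch.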

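With this ordering, every response in a batch is reported against a single
context slot, and that slot is never newer than the state any lookup ran
against, so a "not found" result can no longer be paired with a slot that
advanced after the lookup.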

## Testing

Enhanced test_get_signature_statuses to verify:
- Context slot is captured at the beginning of the call
- Context slot is never 0 when the SVM has advanced
- All signatures in a batch share the same context slot

The test fails with the old implementation (context slot = 0 for empty
queries) and passes with the fix (context slot = 123).
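
The empty-query case is easy to reproduce in isolation. The two functions below are a reduced sketch with hypothetical names, not the project's test: they keep only the slot bookkeeping of the old and new orderings and take the latest slot as a plain argument.

```rust
/// Old ordering, reduced to its slot bookkeeping: the value is only
/// updated inside the loop, so an empty batch leaves it at 0.
fn context_slot_old(latest_slot: u64, signatures: &[&str]) -> u64 {
    let mut last_latest_absolute_slot = 0;
    for _signature in signatures {
        last_latest_absolute_slot = latest_slot;
    }
    last_latest_absolute_slot
}

/// New ordering: the slot is captured up front, independent of the batch.
fn context_slot_new(latest_slot: u64, _signatures: &[&str]) -> u64 {
    latest_slot
}
```

With a latest slot of 123 and no signatures, the old variant yields 0 and the new one yields 123, matching the failure and success values quoted above.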

## Impact

This fix ensures getSignatureStatuses provides consistent, predictable
responses during high-throughput transaction processing and parallel
confirmation workflows.
@MicaiahReid
Collaborator

Thank you for the thoughtful explanation and for the fix, @serejke!

@MicaiahReid MicaiahReid merged commit 66b328f into solana-foundation:main Nov 13, 2025
3 checks passed
@serejke serejke deleted the fix-context-of-get-signature-statuses branch November 14, 2025 09:56