Skip to content

Re-attempt surreal link additions up to 5 times if they hit read-write conflicts#640

Merged
lucksus merged 4 commits intodevfrom
re-attempt-surreal-link-addition-on-read-write-conflicts
Dec 3, 2025
Merged

Re-attempt surreal link additions up to 5 times if they hit read-write conflicts#640
lucksus merged 4 commits intodevfrom
re-attempt-surreal-link-addition-on-read-write-conflicts

Conversation

@jhweir
Copy link
Contributor

@jhweir jhweir commented Dec 3, 2025

Retry SurrealDB Link Addition on Read/Write Conflicts

Summary

This PR improves the reliability of link additions to SurrealDB within the PerspectiveInstance by implementing a retry mechanism for transient read/write conflicts. When a Failed to commit transaction due to a read or write conflict error is encountered, the system will now retry the addition up to 5 times with exponential backoff.

Motivation

SurrealDB transactions can occasionally fail due to concurrent read/write conflicts, especially under high load or rapid updates. Previously, such failures would result in lost link additions and warning logs. This change ensures that temporary conflicts do not cause data loss, improving robustness and consistency.

Changes

  • Added a retry loop to update_surreal_cache for link additions.
  • On conflict error, retries up to 5 times, increasing the delay between attempts.
  • Logs a warning only if all retries fail.

Impact

  • Reduces risk of missing links due to transient SurrealDB transaction conflicts.
  • Improves data consistency for perspectives under concurrent operations.

Summary by CodeRabbit

  • Bug Fixes
    • Improved reliability of link updates by adding automatic retries with backoff for transient database conflicts. Removals are now attempted before additions, and both paths share the same retry behavior, reducing failed operations and improving overall stability and resilience.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 3, 2025

Warning

Rate limit exceeded

@jhweir has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 22 minutes and 58 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between f1ef50f and 1617d5b.

📒 Files selected for processing (1)
  • rust-executor/src/perspectives/perspective_instance.rs (2 hunks)

Walkthrough

Introduced an internal async helper retry_surreal_op in rust-executor/src/perspectives/perspective_instance.rs and replaced direct SurrealDB link mutations in update_surreal_cache with calls to that helper. Removals are performed before additions; both paths share the same retry/backoff strategy for transient "read/write conflict" errors (up to 5 attempts, 100ms×attempt backoff).

Changes

Cohort / File(s) Summary
Retry wrapper + update flow change
rust-executor/src/perspectives/perspective_instance.rs
Added retry_surreal_op async helper to retry SurrealDB operations on transient "read/write conflict" errors (max 5 attempts, sleep 100ms × attempt). Replaced direct remove_link/add_link calls in update_surreal_cache with the new helper. Enforced removals-before-additions ordering; unified retry logic and preserved existing logging on non-transient failures or exhausted retries.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Review retry_surreal_op implementation for correct backoff and attempt logic.
  • Verify string matching for detecting transient "read/write conflict" errors and that other error paths are handled consistently.
  • Confirm removals-before-additions ordering is correctly implemented and that logging/context is preserved.

Poem

🐇 I hopped on code with careful paws,
I watched the links avoid the claws,
Five tiny waits, a patient tap,
Removals first—then bridge the gap. 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the main change: implementing retry logic for SurrealDB link operations on read-write conflicts, which aligns with the core objective of the pull request.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
rust-executor/src/perspectives/perspective_instance.rs (1)

1686-1705: Retry logic is sound; consider extracting the error message string and clarifying per-attempt logging.

The loop correctly handles SurrealDB read-write conflicts: it performs up to 5 retries (6 total attempts) with linear backoff delays of 100/200/300/400/500ms, breaking on success, non-conflict errors, or exhausted retries. This aligns with SurrealDB's transaction semantics and is an appropriate response to transient conflicts.

Two refinements worth considering:

  • Extract the hardcoded error message to a const to avoid duplication and make it easier to maintain if the error string changes:

    const SURREAL_RW_CONFLICT: &str = "Failed to commit transaction due to a read or write conflict";
    if msg.contains(SURREAL_RW_CONFLICT) && attempts < max_attempts {
  • Include the attempt count in the warning log so failures are easier to diagnose:

    log::warn!(
        "Failed to add link to SurrealDB for perspective {} after {} attempts: {:?}",
        uuid,
        attempts + 1,
        e
    );

If you'd like, you can refactor to a for-loop for clearer iteration semantics, but the current while-loop is functionally correct.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00513dc and d6663b1.

📒 Files selected for processing (1)
  • rust-executor/src/perspectives/perspective_instance.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: lucksus
Repo: coasys/ad4m PR: 619
File: rust-executor/src/perspectives/perspective_instance.rs:639-641
Timestamp: 2025-09-03T15:16:29.623Z
Learning: In the AD4M codebase, database operations like `add_many_links` during critical synchronization processes (such as `diff_from_link_language`) should use `.expect()` rather than error logging, as failures indicate severe system issues where fail-fast behavior is preferred over continuing with potentially corrupted state.
📚 Learning: 2025-09-03T15:16:29.623Z
Learnt from: lucksus
Repo: coasys/ad4m PR: 619
File: rust-executor/src/perspectives/perspective_instance.rs:639-641
Timestamp: 2025-09-03T15:16:29.623Z
Learning: In the AD4M codebase, database operations like `add_many_links` during critical synchronization processes (such as `diff_from_link_language`) should use `.expect()` rather than error logging, as failures indicate severe system issues where fail-fast behavior is preferred over continuing with potentially corrupted state.

Applied to files:

  • rust-executor/src/perspectives/perspective_instance.rs

lucksus
lucksus previously approved these changes Dec 3, 2025
Copy link
Member

@lucksus lucksus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would need the same for the link removals above, no?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
rust-executor/src/perspectives/perspective_instance.rs (1)

1622-1695: Refine retry_surreal_op: unused param, backoff semantics, and string‑match fragility

  • link is never used; either log it (e.g., source/predicate/target for easier debugging) or rename to _link to avoid warnings.
  • The backoff is linear (100ms * attempts), while the PR description calls it “exponential backoff”. Either adjust the delay to be truly exponential (e.g., base * 2^attempt) or update the description/comments so they match the actual behavior.
  • Conflict detection depends on matching the exact SurrealDB error text via msg.contains("Failed to commit transaction due to a read or write conflict"). This is brittle to library version/message changes; please double‑check this string against the SurrealDB version(s) you target and consider isolating it as a constant or leveraging a more structured error variant if available.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 883b12d and f1ef50f.

📒 Files selected for processing (1)
  • rust-executor/src/perspectives/perspective_instance.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: lucksus
Repo: coasys/ad4m PR: 619
File: rust-executor/src/perspectives/perspective_instance.rs:639-641
Timestamp: 2025-09-03T15:16:29.623Z
Learning: In the AD4M codebase, database operations like `add_many_links` during critical synchronization processes (such as `diff_from_link_language`) should use `.expect()` rather than error logging, as failures indicate severe system issues where fail-fast behavior is preferred over continuing with potentially corrupted state.
📚 Learning: 2025-09-03T15:16:29.623Z
Learnt from: lucksus
Repo: coasys/ad4m PR: 619
File: rust-executor/src/perspectives/perspective_instance.rs:639-641
Timestamp: 2025-09-03T15:16:29.623Z
Learning: In the AD4M codebase, database operations like `add_many_links` during critical synchronization processes (such as `diff_from_link_language`) should use `.expect()` rather than error logging, as failures indicate severe system issues where fail-fast behavior is preferred over continuing with potentially corrupted state.

Applied to files:

  • rust-executor/src/perspectives/perspective_instance.rs

@lucksus lucksus merged commit 078be9d into dev Dec 3, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants