Switch SurrealDB from In-Memory to File-Based RocksDB Storage#652
data-coasys merged 26 commits into dev from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Adds per-perspective SurrealDB persistence with an optional RocksDB backend, a one-time idempotent migration from Rusqlite to SurrealDB, and migration tracking and cleanup APIs; rewrites the perspective read/write paths to use SurrealDB.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Perspectives as Perspectives Module
    participant Rusqlite as Ad4mDb (Rusqlite)
    participant Migration as Migration Module
    participant Surreal as SurrealDB Service
    participant Rocks as RocksDB
    Perspectives->>Surreal: new(namespace, db, data_path)
    Surreal->>Rocks: Initialize RocksDB (if data_path)
    Perspectives->>Migration: migrate_links_from_rusqlite_to_surrealdb(perspective_uuid)
    Migration->>Rusqlite: is_perspective_migrated(perspective_uuid)?
    alt already migrated
        Migration-->>Perspectives: return (idempotent)
    else
        Migration->>Rusqlite: load all links(perspective_uuid)
        loop per link
            Migration->>Migration: convert to DecoratedLinkExpression
            Migration->>Surreal: add link -> persist
            Surreal->>Rocks: persist edge (if RocksDB)
        end
        Migration->>Rusqlite: mark_perspective_as_migrated(perspective_uuid)
        Migration->>Rusqlite: delete_all_links_for_perspective(perspective_uuid)
        Migration-->>Perspectives: return success
    end
```
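The migration flow in the diagram above can be sketched as a minimal, dependency-free Rust outline. Note this is a simplified stand-in, not the PR's implementation: `MigrationStore` and `migrate_perspective` are hypothetical names standing in for `Ad4mDb` and the SurrealDB service, and links are plain strings instead of `DecoratedLinkExpression`.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the Rusqlite-backed migration tracker.
#[derive(Default)]
struct MigrationStore {
    migrated: HashSet<String>,           // perspectives already migrated
    links: HashMap<String, Vec<String>>, // perspective uuid -> legacy links
}

/// One-time, idempotent migration of a perspective's links.
/// Returns the number of links copied into the new store.
fn migrate_perspective(
    store: &mut MigrationStore,
    new_store: &mut Vec<String>,
    uuid: &str,
) -> usize {
    // Idempotency: skip perspectives that were already migrated.
    if store.migrated.contains(uuid) {
        return 0;
    }
    let links = store.links.remove(uuid).unwrap_or_default();
    let count = links.len();
    // Persist every link in the new backend...
    new_store.extend(links);
    // ...and only then mark the perspective as migrated.
    store.migrated.insert(uuid.to_string());
    count
}

fn main() {
    let mut store = MigrationStore::default();
    store
        .links
        .insert("p1".into(), vec!["a->b".into(), "b->c".into()]);
    let mut new_store = Vec::new();
    assert_eq!(migrate_perspective(&mut store, &mut new_store, "p1"), 2);
    // A second run is a no-op thanks to the migrated marker.
    assert_eq!(migrate_perspective(&mut store, &mut new_store, "p1"), 0);
}
```

Marking the perspective only after all links were copied is what makes the "already migrated" fast path in the diagram safe to take.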
```mermaid
sequenceDiagram
    participant Client
    participant PI as PerspectiveInstance
    participant Surreal as SurrealDB Service
    participant Rocks as RocksDB
    Client->>PI: add_link_expression(link)
    PI->>PI: build DecoratedPerspectiveDiff
    PI->>PI: persist_link_diff(diff)
    PI->>Surreal: remove_links(matching old)
    Surreal->>Rocks: delete edges
    PI->>Surreal: add DecoratedLinkExpressions
    Surreal->>Rocks: insert edges
    PI->>PI: update Prolog/pubsub
    PI-->>Client: return result
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
🚥 Pre-merge checks: 3 passed
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust-executor/src/perspectives/perspective_instance.rs (1)
2425-2457: Propagate SurrealDB write failures to callers.
`persist_link_diff`/`retry_surreal_op` currently log warnings and silently continue on non-retriable errors. With SurrealDB as the primary store, this risks desynchronization between persistence and Prolog/pubsub state. Change both to return `Result` and propagate failures. Most callsites (`remove_link`, `add_link_expression`, `add_links`, `link_mutations`, `update_link`, `commit_pending_diffs`) already return `Result<_, AnyError>` and can propagate with `?`. For `diff_from_link_language`, which returns `()`, use `.await.expect()` to fail fast on critical synchronization operations.

🛠️ Suggested fix
```diff
-async fn retry_surreal_op<F, Fut>(op: F, uuid: &str, op_name: &str)
+async fn retry_surreal_op<F, Fut>(
+    op: F,
+    uuid: &str,
+    op_name: &str,
+) -> Result<(), anyhow::Error>
 where
     F: Fn() -> Fut,
     Fut: std::future::Future<Output = Result<(), anyhow::Error>>,
 {
     let mut attempts = 0;
     let max_attempts = 5;
     loop {
         match op().await {
-            Ok(_) => break,
+            Ok(_) => return Ok(()),
             Err(e) => {
                 let msg = format!("{}", e);
                 if msg.contains("Failed to commit transaction due to a read or write conflict")
                     && attempts < max_attempts
                 {
                     attempts += 1;
                     tokio::time::sleep(std::time::Duration::from_millis(100 * attempts)).await;
                     continue;
                 } else {
                     log::warn!(
                         "Failed to {} link in SurrealDB for perspective {}: {:?}",
                         op_name,
                         uuid,
                         e
                     );
-                    break;
+                    return Err(e);
                 }
             }
         }
     }
 }

-pub(crate) async fn persist_link_diff(&self, diff: &DecoratedPerspectiveDiff) {
+pub(crate) async fn persist_link_diff(
+    &self,
+    diff: &DecoratedPerspectiveDiff,
+) -> Result<(), AnyError> {
     // Get UUID
     let uuid = {
         let persisted_guard = self.persisted.lock().await;
         persisted_guard.uuid.clone()
     };
@@
     for removal in &diff.removals {
         Self::retry_surreal_op(
             || self.surreal_service.remove_link(&uuid, removal),
             &uuid,
             "remove",
         )
-        .await;
+        .await?;
     }

     // Additions after
     for addition in &diff.additions {
         Self::retry_surreal_op(
             || self.surreal_service.add_link(&uuid, addition),
             &uuid,
             "add",
         )
-        .await;
+        .await?;
     }
+    Ok(())
 }
```

Then update callsites:
- Functions returning `Result`: change `.await;` to `.await?;`
- `diff_from_link_language`: change `.await;` to `.await.expect("SurrealDB link persistence failed");`
🤖 Fix all issues with AI agents
In `@rust-executor/MIGRATION_REMOVAL_GUIDE.md`:
- Around line 112-113: Update the heading text "Can optionally be cleaned up:"
in MIGRATION_REMOVAL_GUIDE.md to read "The following can optionally be cleaned
up:" so it becomes a complete sentence; locate the existing heading string (the
line containing "Can optionally be cleaned up:") and replace it with the
proposed wording, leaving the subsequent list (e.g., "- `src/db.rs` - Migration
tracking methods (optional to remove)") unchanged.
In `@rust-executor/src/perspectives/migration.rs`:
- Around line 97-119: The migration currently marks the perspective as migrated
and proceeds to delete links even when some Link migrations failed (tracked by
error_count), risking data loss; update the logic in the function that runs the
loop (the block using migrated_count and error_count and calling
Ad4mDb::with_global_instance(...).mark_perspective_as_migrated) to check
error_count after the migration loop and if error_count > 0 return an Err (e.g.,
Err(format!(...)) converted to the function's error type) instead of calling
Ad4mDb::with_global_instance(...).mark_perspective_as_migrated; alternatively,
if you prefer partial success, only delete links you recorded as successfully
migrated (use migrated_count and whatever collection tracks successful link IDs)
and only mark the perspective as migrated when error_count == 0, referencing
error_count, migrated_count, and
Ad4mDb::with_global_instance/mark_perspective_as_migrated to locate where to
change the control flow.
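The control-flow change this prompt asks for can be sketched as a small, self-contained Rust function. The name `finalize_migration` is hypothetical; in the real code the same gate would sit after the migration loop in `migration.rs`, before `mark_perspective_as_migrated` and the link deletion.

```rust
/// Hypothetical migration gate: only report success (and thereby allow
/// marking the perspective as migrated and deleting the legacy rows)
/// when every link migrated cleanly.
fn finalize_migration(migrated_count: usize, error_count: usize) -> Result<usize, String> {
    if error_count > 0 {
        // Leave the legacy Rusqlite data untouched so the migration
        // can be retried on the next startup.
        return Err(format!(
            "{} of {} links failed to migrate; keeping Rusqlite data",
            error_count,
            migrated_count + error_count
        ));
    }
    // Safe to mark migrated and delete the legacy links now.
    Ok(migrated_count)
}

fn main() {
    assert_eq!(finalize_migration(10, 0), Ok(10));
    assert!(finalize_migration(8, 2).is_err());
}
```

The key point is that the error path returns before any destructive step runs, which is what prevents the data-loss scenario described above.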
In `@rust-executor/src/perspectives/mod.rs`:
- Around line 441-457: The code currently maps instance.links into
DecoratedLinkExpression always using LinkStatus::Local; instead preserve each
link's original status by reading the link's status field
(LinkExpression.status) and passing that through to DecoratedLinkExpression (or
convert Option<LinkStatus> into the expected status enum), falling back to
LinkStatus::Local only when the original status is None; update the mapping in
the instance.links.into_iter().map(...) call (where
DecoratedLinkExpression::from((link, LinkStatus::Local)) is used) so it
constructs the decorated link with the actual link.status instead of hardcoding
LinkStatus::Local before calling perspective.persist_link_diff(&diff).await.
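The status-preserving fallback the prompt describes can be shown with simplified stand-ins for the real types (`LinkStatus` and `LinkExpression` here are stripped down to the one field that matters):

```rust
#[derive(Clone, Debug, PartialEq)]
enum LinkStatus {
    Shared,
    Local,
}

struct LinkExpression {
    status: Option<LinkStatus>,
}

/// Preserve the link's original status, falling back to Local
/// only when no status was recorded.
fn effective_status(link: &LinkExpression) -> LinkStatus {
    link.status.clone().unwrap_or(LinkStatus::Local)
}

fn main() {
    let shared = LinkExpression { status: Some(LinkStatus::Shared) };
    let unknown = LinkExpression { status: None };
    assert_eq!(effective_status(&shared), LinkStatus::Shared);
    assert_eq!(effective_status(&unknown), LinkStatus::Local);
}
```

Hardcoding `LinkStatus::Local` in the map closure loses the `Shared` marker on import; `unwrap_or` keeps it while still covering the `None` case.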
- Around line 56-66: The current read-then-release check on PERSPECTIVES allows
a TOCTOU race where multiple tasks calling initialize_from_db can each pass the
contains_key check and create duplicate SurrealDB services; change this to an
atomic operation by taking a write lock and inserting a placeholder entry (e.g.,
a "pending" marker) for handle_clone.uuid before spawning the async
initialization, or perform an entry-or-init pattern that checks and inserts
under the same write lock so only one initializer proceeds; ensure the eventual
successful initialization replaces the placeholder and failures remove it so
initialize_from_db (and the GraphQL import mutation) cannot start duplicate
initializations.
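The entry-or-init pattern suggested here can be reduced to a small sketch: the check and the placeholder insert happen under one lock, so only one caller ever "wins" a given uuid. `try_claim` and the `bool` placeholder are hypothetical simplifications of the PERSPECTIVES map.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Check-and-insert under a single lock: because the contains_key check
/// and the placeholder insert are atomic with respect to other callers,
/// the TOCTOU window between "check" and "insert" disappears.
fn try_claim(registry: &Arc<Mutex<HashMap<String, bool>>>, uuid: &str) -> bool {
    let mut map = registry.lock().unwrap();
    if map.contains_key(uuid) {
        return false; // someone else already initialized, or is initializing
    }
    map.insert(uuid.to_string(), false); // false = pending placeholder
    true
}

fn main() {
    let registry = Arc::new(Mutex::new(HashMap::new()));
    assert!(try_claim(&registry, "p1"));  // first caller wins
    assert!(!try_claim(&registry, "p1")); // second caller backs off
}
```

In the real code the winner would replace the placeholder with the initialized service on success and remove it on failure, so a failed initialization can be retried.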
In `@rust-executor/src/perspectives/perspective_instance.rs`:
- Around line 878-891: The SurrealDB lookup is too broad: update the lookup to
honor the full unique constraint (source, target, predicate, author, timestamp)
instead of only (source, target, predicate). Modify surreal_service::get_link
(or the callsites remove_link, update_link, add_shared_links) so it accepts/uses
author and timestamp from the existing LinkExpression/old_link/link parameters
and queries/matches all five fields (link_unique_idx) or filters fetched results
by author+timestamp before returning; ensure get_link no longer returns an
arbitrary LIMIT 1 row when multiple authors/timestamps exist and that callers
(e.g., the code creating decorated_link/LinkExpression) rely on the corrected
lookup.
In `@rust-executor/src/surreal_service/mod.rs`:
- Around line 532-576: The error handling in the add_link flow currently treats
any error string containing "index" as a benign unique-constraint issue; update
the match branch that inspects error_string (the Err(e) arm where result is
matched) to remove the check for error_string.contains("index") so only checks
for "unique", "duplicate", and "already exists" remain; this ensures genuine
index/schema errors are propagated instead of being swallowed while keeping
idempotent handling for duplicate-link errors.
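The narrowed error classification can be sketched as a pure function. The error message strings in the assertions are illustrative, not actual SurrealDB output.

```rust
/// Classify an error string: only genuine duplicate-key wording is
/// treated as benign. Notably there is no `contains("index")` check,
/// so real index/schema errors propagate instead of being swallowed.
fn is_duplicate_link_error(error_string: &str) -> bool {
    let msg = error_string.to_lowercase();
    msg.contains("unique") || msg.contains("duplicate") || msg.contains("already exists")
}

fn main() {
    // Illustrative messages only.
    assert!(is_duplicate_link_error("Duplicate entry for unique constraint"));
    assert!(is_duplicate_link_error("record already exists"));
    // A genuine index build failure must NOT be swallowed:
    assert!(!is_duplicate_link_error("failed to build index on field source"));
}
```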
🧹 Nitpick comments (2)
rust-executor/src/perspectives/mod.rs (1)
80-93: Migration failure logging is appropriate, but consider user visibility. Migration failures are logged at error level but the perspective still initializes. This is defensive, but users won't see these errors in a UI. Consider whether a more visible notification mechanism is needed for production.
rust-executor/src/surreal_service/mod.rs (1)
756-904: Consider DRYing the SurrealLink deserialization path. The JSON unwrap + SurrealLink mapping is repeated across the getters; a small helper would reduce drift.
- MIGRATION_REMOVAL_GUIDE.md: Complete sentence in heading
- migration.rs: Only mark as migrated when error_count == 0 (prevents data loss)
- surreal_service/mod.rs: Remove overly broad 'index' error check (more precise error handling)

Addresses CodeRabbit actionable comments on PR #652
Instead of hardcoding LinkStatus::Local, the code now reads link.status and uses it (falling back to Local if None). This preserves the original link status during import operations. Addresses CodeRabbit actionable comment on PR #652
- retry_surreal_op now returns Result and propagates errors
- persist_link_diff now returns Result instead of silently swallowing errors
- Updated all callsites:
  - Functions returning Result: use .await? to propagate
  - Functions returning (): use .await.expect() to fail fast
- Critical synchronization operations now fail loudly instead of silently

Addresses CodeRabbit actionable comment #1 on PR #652
Updated get_link to accept and use author and timestamp parameters:
- get_link now takes optional author and timestamp
- When provided, queries using all 5 unique fields (source, target, predicate, author, timestamp)
- When not provided, falls back to the 3-field lookup for backward compatibility
- Updated all callsites to pass author and timestamp from LinkExpression

This prevents returning arbitrary links when multiple authors/timestamps exist for the same source/predicate/target combination.

Addresses CodeRabbit actionable comment #6 on PR #652
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@rust-executor/src/surreal_service/mod.rs`:
- Around line 792-829: The get_link implementation only filters by
source/target/predicate but the unique key includes author and timestamp, so
modify the get_link signature to accept author: &str and timestamp: &str (or
appropriate types), update the SQL query string to add AND author = $author AND
timestamp = $timestamp and bind those values (update parameter variables like
predicate_str, source_owned, target_owned to include author_owned and
timestamp_owned), and return the same SurrealLink conversion; then update all
call sites that invoke get_link (the four places in perspective_instance.rs that
pass a link_expression) to supply &link_expression.author and
&link_expression.timestamp.
🧹 Nitpick comments (1)
rust-executor/src/perspectives/mod.rs (1)
303-318: Use async file operations to avoid blocking the executor.
`std::fs::remove_dir_all` is a blocking call within an async function. This can block the tokio executor thread, potentially starving other tasks. Consider using `tokio::fs::remove_dir_all` for non-blocking directory removal.

♻️ Proposed fix to use async file operations

```diff
 // Clean up RocksDB directory for this perspective
 if let Some(data_path) = get_app_data_path() {
     let db_path = std::path::Path::new(&data_path)
         .join(format!("surrealdb_perspectives/{}", uuid));
     if db_path.exists() {
-        if let Err(e) = std::fs::remove_dir_all(&db_path) {
+        if let Err(e) = tokio::fs::remove_dir_all(&db_path).await {
             log::warn!(
                 "Failed to remove SurrealDB directory for perspective {}: {}",
                 uuid,
                 e
             );
         } else {
             log::debug!("Cleaned up SurrealDB directory for perspective {}", uuid);
         }
     }
 }
```
Added atomic check-and-insert before storing perspective:
- Initial read-lock check remains for quick filtering
- After async initialization completes, do a final write-lock check
- Only insert if another task hasn't already initialized this perspective
- Discard duplicate work and don't start background tasks if the race was lost

This prevents multiple tasks from creating duplicate SurrealDB services for the same perspective UUID.

Addresses CodeRabbit actionable comment #5 on PR #652
Compilation error: link.status.unwrap_or() moves the value, preventing use of 'link' afterwards. Use clone() to avoid the partial move.
Changed 'for (link_expr, status) in links' to '&links' and cloned status to avoid moving values out of the vector.
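The borrow-vs-move fix described in these two commit messages can be demonstrated in isolation. `collect_statuses` is a hypothetical helper; the point is that iterating `&links` and cloning the status avoids a partial move, so the vector remains usable afterwards.

```rust
#[derive(Clone, Debug, PartialEq)]
enum LinkStatus {
    Shared,
    Local,
}

/// Iterating by reference borrows each tuple; cloning the status avoids
/// moving it out of the vector, so `links` is still intact afterwards.
/// (Writing `for (_, status) in links` would consume the Vec instead.)
fn collect_statuses(links: &[(String, LinkStatus)]) -> Vec<LinkStatus> {
    let mut statuses = Vec::new();
    for (_link_expr, status) in links {
        statuses.push(status.clone());
    }
    statuses
}

fn main() {
    let links = vec![
        ("a->b".to_string(), LinkStatus::Shared),
        ("b->c".to_string(), LinkStatus::Local),
    ];
    let statuses = collect_statuses(&links);
    assert_eq!(links.len(), 2); // `links` was not consumed
    assert_eq!(statuses, vec![LinkStatus::Shared, LinkStatus::Local]);
}
```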
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust-executor/src/surreal_service/mod.rs (1)
1655-1656: Test queries reference a non-existent `perspective` field. Several tests query `WHERE perspective = $perspective`, but the `link` table schema (lines 379-387) doesn't include a `perspective` field. Per the retrieved learnings, each perspective has its own isolated database, so no perspective filtering is needed. These tests will likely fail or return no results:
- Line 1655: `SELECT * FROM link WHERE perspective = $perspective ORDER BY timestamp`
- Line 1674: graph query with perspective filter
- Lines 1800, 1816, 1832, etc.
🐛 Fix the test queries
```diff
- let query = "SELECT * FROM link WHERE perspective = $perspective ORDER BY timestamp";
+ let query = "SELECT * FROM link ORDER BY timestamp";
```

And similarly for other occurrences in `test_reload_perspective_data_integrity` and `test_reload_perspective_with_large_dataset`.
🤖 Fix all issues with AI agents
In `@rust-executor/src/perspectives/perspective_instance.rs`:
- Around line 539-548: The code calls
self.surreal_service.get_all_links(&uuid).await.unwrap() inside the
link_language_clone branch which will panic if SurrealDB is unreachable; change
this to handle the Result by matching or using if let Err(e) to log the error
(use the existing logger or processLogger-like facility) and return false so the
fallback sync loop can retry; on success, proceed to map decorated_links into
local_links as before (refer to get_all_links, decorated_links, local_links,
LinkExpression, LinkStatus, and link_language_clone).
🧹 Nitpick comments (8)
rust-executor/src/perspectives/mod.rs (3)
154-157: Consider propagating SurrealDB creation errors instead of panicking. Using `.expect()` on line 157 will crash the entire application if SurrealDB service creation fails (e.g., due to disk space, permissions, or RocksDB lock issues). Since `add_perspective` already returns `Result<(), String>`, consider propagating this error:

♻️ Proposed fix to propagate error

```diff
 let surreal_service =
     crate::surreal_service::SurrealDBService::new("ad4m", &handle.uuid, data_path.as_deref())
         .await
-        .expect("Failed to create SurrealDB service for perspective");
+        .map_err(|e| format!("Failed to create SurrealDB service: {}", e))?;
```
315-330: Blocking filesystem operation in async context.
`std::fs::remove_dir_all` is a synchronous operation that could block the async executor, especially if the directory contains many files or is on a slow filesystem. Consider using `tokio::fs::remove_dir_all` for non-blocking cleanup:

♻️ Proposed fix using async filesystem operations

```diff
 // Clean up RocksDB directory for this perspective
 if let Some(data_path) = get_app_data_path() {
     let db_path = std::path::Path::new(&data_path)
         .join(format!("surrealdb_perspectives/{}", uuid));
     if db_path.exists() {
-        if let Err(e) = std::fs::remove_dir_all(&db_path) {
+        if let Err(e) = tokio::fs::remove_dir_all(&db_path).await {
             log::warn!(
                 "Failed to remove SurrealDB directory for perspective {}: {}",
                 uuid,
                 e
             );
         } else {
             log::debug!("Cleaned up SurrealDB directory for perspective {}", uuid);
         }
     }
 }
```
84-90: Inconsistent error handling: SurrealDB creation panics, while migration failures only log. At line 90, SurrealDB creation failure causes a panic (aborting the spawned task), while migration failures at lines 100-105 are logged and execution continues. This inconsistency means:
- A RocksDB lock failure → panic, perspective silently doesn't initialize
- A migration failure → logged, perspective initializes anyway
Consider using consistent error handling:
♻️ Proposed fix for consistent error handling
- let surreal_service = crate::surreal_service::SurrealDBService::new( + let surreal_service = match crate::surreal_service::SurrealDBService::new( "ad4m", &handle_clone.uuid, data_path.as_deref(), ) .await - .expect("Failed to create SurrealDB service for perspective"); + { + Ok(service) => service, + Err(e) => { + log::error!( + "Failed to create SurrealDB service for perspective {}: {}", + handle_clone.uuid, + e + ); + return; + } + };rust-executor/src/surreal_service/mod.rs (5)
578-609: Consider avoiding node creation on remove. The `remove_link` method calls `ensure_node` for both source and target, which will create nodes if they don't exist. While harmless, this is slightly wasteful when deleting a non-existent link. You could compute the node IDs without creating them:
♻️ Optional optimization
```diff
 pub async fn remove_link(
     &self,
     _perspective_uuid: &str,
     link: &DecoratedLinkExpression,
 ) -> Result<(), Error> {
-    // Get node IDs for source and target
-    let source_id = self.ensure_node(&link.data.source).await?;
-    let target_id = self.ensure_node(&link.data.target).await?;
+    // Compute node IDs without creating nodes
+    use sha2::{Digest, Sha256};
+
+    let mut hasher = Sha256::new();
+    hasher.update(link.data.source.as_bytes());
+    let source_hash = format!("{:x}", hasher.finalize());
+    let source_id = format!("node:{}", &source_hash[..16]);
+
+    let mut hasher = Sha256::new();
+    hasher.update(link.data.target.as_bytes());
+    let target_hash = format!("{:x}", hasher.finalize());
+    let target_id = format!("node:{}", &target_hash[..16]);
```
807-836: Confusing variable naming and redundant assignments. The variable `query` is assigned a `Response` (not a query string), then reassigned through `results` and `response`. This obscures the code flow.
♻️ Suggested simplification

```diff
 let query = if let (Some(author_str), Some(timestamp_str)) = (author, timestamp) {
     // Full unique constraint lookup (all 5 fields)
     let author_owned = author_str.to_string();
     let timestamp_owned = timestamp_str.to_string();
-    let results = self
+    self
         .db
         .query("SELECT * FROM link WHERE source = $source AND target = $target AND predicate = $predicate AND author = $author AND timestamp = $timestamp LIMIT 1")
         .bind(("source", source_owned))
         .bind(("target", target_owned))
         .bind(("predicate", predicate_str))
         .bind(("author", author_owned))
         .bind(("timestamp", timestamp_owned))
-        .await?;
-    results
+        .await?
 } else {
     // Backward compatible lookup (only source/predicate/target)
-    let results = self
+    self
         .db
         .query("SELECT * FROM link WHERE source = $source AND target = $target AND predicate = $predicate LIMIT 1")
         .bind(("source", source_owned))
         .bind(("target", target_owned))
         .bind(("predicate", predicate_str))
-        .await?;
-    results
+        .await?
 };

-let results = query;
-
-let mut response = results;
+let mut response = query;
 let result: SurrealValue = response.take(0)?;
```
863-967: Consider extracting common query result processing. The three methods `get_links_by_source`, `get_links_by_target`, and `get_links_by_predicate` share identical result processing logic (lines 872-888, 909-925, 950-966). This could be extracted into a helper method.

♻️ Optional refactor to reduce duplication

```rust
/// Helper to process query results into a DecoratedLinkExpression vec
async fn process_link_query_results(
    &self,
    mut response: surrealdb::Response,
) -> Result<Vec<DecoratedLinkExpression>, Error> {
    let result: SurrealValue = response.take(0)?;
    let json_string = serde_json::to_string(&result)?;
    let json_value: Value = serde_json::from_str(&json_string)?;
    let unwrapped = unwrap_surreal_json(json_value);
    if let Value::Array(arr) = unwrapped {
        let links: Vec<DecoratedLinkExpression> = arr
            .into_iter()
            .filter_map(|value| serde_json::from_value::<SurrealLink>(value).ok())
            .map(|surreal_link| surreal_link.into())
            .collect();
        return Ok(links);
    }
    Ok(vec![])
}
```
770-775: Silent deserialization failures may hide data issues. The `filter_map` on line 773 silently drops any rows that fail to deserialize as `SurrealLink`. While this provides resilience, it could mask data corruption or schema mismatches. Consider logging a warning when deserialization fails:
🛡️ Optional defensive logging
```diff
 let links: Vec<DecoratedLinkExpression> = results
     .into_iter()
-    .filter_map(|value| serde_json::from_value::<SurrealLink>(value).ok())
+    .filter_map(|value| {
+        match serde_json::from_value::<SurrealLink>(value.clone()) {
+            Ok(link) => Some(link),
+            Err(e) => {
+                log::warn!(
+                    "Failed to deserialize link from SurrealDB: {:?}, value: {:?}",
+                    e,
+                    value
+                );
+                None
+            }
+        }
+    })
     .map(|surreal_link| surreal_link.into())
     .collect();
```
2138-2174: Add an integration test for RocksDB transaction conflict handling. The `test_concurrent_operations` test uses in-memory storage (`data_path: None`), so it doesn't exercise the RocksDB transaction conflict retry logic in `ensure_node` (which implements exponential backoff for transaction conflicts). Consider adding an integration test with a temporary directory to verify that RocksDB concurrent write behavior is handled correctly.
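The retry-with-exponential-backoff behavior this test would cover can be sketched synchronously and dependency-free. `retry_on_conflict` is a hypothetical simplification of `ensure_node`'s retry loop; the real code awaits a tokio sleep between attempts, while here the backoff is only computed.

```rust
/// Minimal synchronous sketch of retry-with-exponential-backoff for
/// transaction conflicts. Returns the number of retries that were needed.
fn retry_on_conflict<F>(mut op: F, max_attempts: u32) -> Result<u32, String>
where
    F: FnMut() -> Result<(), String>,
{
    let mut attempts = 0;
    loop {
        match op() {
            Ok(()) => return Ok(attempts),
            Err(e) if e.contains("conflict") && attempts < max_attempts => {
                attempts += 1;
                // 100ms, 200ms, 400ms, ... (a real impl would sleep here)
                let _backoff_ms = 100u64 * 2u64.pow(attempts - 1);
            }
            // Non-conflict errors (or exhausted attempts) propagate.
            Err(e) => return Err(e),
        }
    }
}

fn main() {
    let mut failures_left = 2;
    let result = retry_on_conflict(
        || {
            if failures_left > 0 {
                failures_left -= 1;
                Err("write conflict".to_string())
            } else {
                Ok(())
            }
        },
        5,
    );
    assert_eq!(result, Ok(2)); // succeeded after two retries
}
```

An integration test against a temporary RocksDB directory would drive this path with genuinely concurrent writers rather than a simulated failure counter.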
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust-executor/src/surreal_service/mod.rs (1)
1655-1657: Test queries reference a non-existent `perspective` field. The query at line 1655 uses `WHERE perspective = $perspective`, but the schema (lines 379-387) does not define a `perspective` field. Each perspective has its own isolated database, so no perspective field is needed or exists. These queries will either error or return empty results. This affects multiple tests:
- Lines 1655, 1674: `test_reload_perspective_data_integrity`
- Lines 1799-1870: graph traversal queries in the same test
- Lines 1902, 1984-2027: `test_reload_perspective_with_large_dataset`

🔧 Suggested fix for test queries

Remove the `perspective = $perspective` filter since each perspective has its own isolated database:

```diff
- let query = "SELECT * FROM link WHERE perspective = $perspective ORDER BY timestamp";
+ let query = "SELECT * FROM link ORDER BY timestamp";
```

Apply similar fixes to all affected queries in these tests.
🤖 Fix all issues with AI agents
In `@rust-executor/src/perspectives/migration.rs`:
- Around line 153-188: The tests mutate the global Ad4mDb singleton via
Ad4mDb::init_global_instance() causing race conditions; change setup() to
acquire and return a serialization guard (e.g., lock a static Mutex or use
once_cell::sync::Lazy<Mutex<()>> and return the MutexGuard) and then in
test_migration_tracking() call let _guard = setup(); at the start (and apply the
same pattern to other tests that touch Ad4mDb) so the global singleton is
accessed serially across tests while holding the guard for the test duration.
In `@rust-executor/src/perspectives/mod.rs`:
- Around line 92-105: The migration error is only logged but the perspective
initialization continues; change the flow in the block calling
migration::migrate_links_from_rusqlite_to_surrealdb so that if it returns Err
you stop initialization and propagate the error (e.g., return Err) instead of
just logging—use the same context (handle_clone.uuid, surreal_service) and
ensure the caller receives the failure so the perspective init is skipped when
migration fails.
🧹 Nitpick comments (1)
rust-executor/src/surreal_service/mod.rs (1)
834-836: Minor: Redundant variable reassignments. These lines perform unnecessary reassignments: the `query` variable is immediately assigned to `results`, then `results` to `response`.

♻️ Suggested simplification

```diff
-let query = if let (Some(author_str), Some(timestamp_str)) = (author, timestamp) {
+let mut response = if let (Some(author_str), Some(timestamp_str)) = (author, timestamp) {
     // Full unique constraint lookup (all 5 fields)
     let author_owned = author_str.to_string();
     let timestamp_owned = timestamp_str.to_string();
     let results = self
         .db
         .query("SELECT * FROM link WHERE source = $source AND target = $target AND predicate = $predicate AND author = $author AND timestamp = $timestamp LIMIT 1")
         .bind(("source", source_owned))
         .bind(("target", target_owned))
         .bind(("predicate", predicate_str))
         .bind(("author", author_owned))
         .bind(("timestamp", timestamp_owned))
         .await?;
     results
 } else {
     // Backward compatible lookup (only source/predicate/target)
     let results = self
         .db
         .query("SELECT * FROM link WHERE source = $source AND target = $target AND predicate = $predicate LIMIT 1")
         .bind(("source", source_owned))
         .bind(("target", target_owned))
         .bind(("predicate", predicate_str))
         .await?;
     results
 };
-let results = query;
-
-let mut response = results;
 let result: SurrealValue = response.take(0)?;
```
```rust
fn setup() {
    Ad4mDb::init_global_instance(":memory:").unwrap();
}

#[tokio::test]
async fn test_migration_tracking() {
    setup();

    let perspective_uuid = "test-migration-uuid";

    // Test that perspective is not migrated initially
    let is_migrated = Ad4mDb::with_global_instance(|db| {
        db.is_perspective_migrated(perspective_uuid)
            .expect("Failed to check migration status")
    });
    assert!(!is_migrated, "Perspective should not be migrated initially");

    // Mark perspective as migrated
    Ad4mDb::with_global_instance(|db| {
        db.mark_perspective_as_migrated(perspective_uuid)
            .expect("Failed to mark perspective as migrated")
    });

    // Test that perspective is now migrated
    let is_migrated = Ad4mDb::with_global_instance(|db| {
        db.is_perspective_migrated(perspective_uuid)
            .expect("Failed to check migration status")
    });
    assert!(is_migrated, "Perspective should be migrated");

    // Test idempotency - marking again should not fail
    Ad4mDb::with_global_instance(|db| {
        db.mark_perspective_as_migrated(perspective_uuid)
            .expect("Failed to mark perspective as migrated (idempotent)")
    });
}
```
🧩 Analysis chain
🌐 Web query:
cargo test parallel execution default behavior
💡 Result:
By default, cargo test runs your Rust tests in parallel (concurrently) using the Rust test harness (libtest). The default number of test threads is based on the machine’s available hardware concurrency (Rust uses available_parallelism). [1]
You can control this parallelism by passing an argument to the test binary:
`cargo test -- --test-threads=1`

or via the environment variable (noted as deprecated in some docs, but still widely used):

`RUST_TEST_THREADS=1 cargo test`

Also note: Cargo's `-j`/`--jobs` only affects build parallelism, not how many threads run tests; test threading is controlled by `--test-threads`. [2]
Sources: Rust test execution options (--test-threads) [1], Cargo cargo test docs on --jobs vs --test-threads [2].
🌐 Web query:
Rust global singleton tests thread safety race conditions
💡 Result:
Rust tests run in parallel by default, so any global singleton / shared global state can cause data races or flaky tests unless it’s explicitly thread-safe (or you force tests to run serially). The Rust Book warns that parallel tests must not depend on shared state, and shows running with one thread via cargo test -- --test-threads=1. [1]
What usually goes wrong
- Using `static mut` + ad-hoc "init flags" is race-prone (two threads can observe "not initialized" and both initialize). A classic fix is to use `std::sync::Once` for one-time init. [5]
- Trying to put `std::cell::OnceCell<T>` in a `static` fails because a `static` must be `Sync`, and `OnceCell` is not thread-safe. [2][4]
Recommended singleton patterns (thread-safe)
1) std::sync::OnceLock (manual/lazy one-time init)
OnceLock<T> is the thread-safe equivalent intended for statics. [3]
Typical pattern:
- `static GLOBAL: OnceLock<T> = OnceLock::new();`
- In code/tests: `GLOBAL.get_or_init(|| make_t())`
This prevents double-initialization races across parallel tests. [3]
2) std::sync::Once (run an init function exactly once)
If you need “call init() once” semantics (often in test setups), Once is the standard tool for that. [5]
If tests still interfere
- Either make the shared global state fully thread-safe and isolated/resettable, or
- Run tests serially: `cargo test -- --test-threads=1` (slower, but avoids parallel interference). [1]
Sources: Rust Book on parallel tests and --test-threads. [1] Rust docs for OnceLock (thread-safe, usable in statics) and OnceCell (non-thread-safe; points to OnceLock). [3][4] Rust forum explanation that static requires Sync and OnceCell isn’t. [2] Stack Overflow example recommending Once to avoid duplicate init races in tests. [5]
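The `OnceLock` pattern from the query result above can be shown in a runnable sketch. `GLOBAL_CONFIG` and `config` are illustrative names; `get_or_init` guarantees the initializer runs exactly once even when called from concurrent threads.

```rust
use std::sync::OnceLock;
use std::thread;

/// Thread-safe lazy singleton: OnceLock is Sync, so it can live in a
/// static, and get_or_init performs one-time initialization even under
/// concurrent access.
static GLOBAL_CONFIG: OnceLock<String> = OnceLock::new();

fn config() -> &'static str {
    GLOBAL_CONFIG.get_or_init(|| "initialized".to_string())
}

fn main() {
    // Four threads race to initialize; all observe the same value.
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| config().to_string()))
        .collect();
    for h in handles {
        assert_eq!(h.join().unwrap(), "initialized");
    }
}
```

This is the standard-library alternative to the `lazy_static!` + `Mutex` guard suggested below for serializing the Ad4mDb tests; the mutex guard is still needed there because `init_global_instance` must be re-run per test, not initialized once.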
Serialize tests that touch the global Ad4mDb singleton.
cargo test runs tests in parallel by default, and Ad4mDb::init_global_instance() mutates a global singleton, causing inter-test races and flaky tests. Serialize access using a mutex guard.
🧪 Suggested guard for test serialization

```diff
 mod tests {
     use super::*;
     use crate::graphql::graphql_types::LinkStatus;
     use crate::types::{ExpressionProof, Link, LinkExpression};
     use chrono::Utc;
+    use lazy_static::lazy_static;
+    use std::sync::{Mutex, MutexGuard};

-    fn setup() {
-        Ad4mDb::init_global_instance(":memory:").unwrap();
-    }
+    lazy_static! {
+        static ref TEST_MUTEX: Mutex<()> = Mutex::new(());
+    }
+
+    fn setup() -> MutexGuard<'static, ()> {
+        let guard = TEST_MUTEX.lock().unwrap();
+        Ad4mDb::init_global_instance(":memory:").unwrap();
+        guard
+    }

     #[tokio::test]
     async fn test_migration_tracking() {
-        setup();
+        let _guard = setup();
```

Apply `let _guard = setup();` to all other tests as well.
🤖 Prompt for AI Agents
In `@rust-executor/src/perspectives/migration.rs` around lines 153 - 188, The
tests mutate the global Ad4mDb singleton via Ad4mDb::init_global_instance()
causing race conditions; change setup() to acquire and return a serialization
guard (e.g., lock a static Mutex or use once_cell::sync::Lazy<Mutex<()>> and
return the MutexGuard) and then in test_migration_tracking() call let _guard =
setup(); at the start (and apply the same pattern to other tests that touch
Ad4mDb) so the global singleton is accessed serially across tests while holding
the guard for the test duration.
Addresses CodeRabbit feedback: simplified variable flow by directly assigning query result to response instead of going through intermediate query and results variables. Co-authored-by: CodeRabbit AI <coderabbit@example.com>
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust-executor/src/surreal_service/mod.rs (1)
619-624: Remove embedded `perspective = $perspective` filters from test queries. Tests at lines 1651, 1670, 1796, 1812, 1828, 1840, 1853, 1898, 1981, 1998, and 2015 embed
`WHERE perspective = $perspective`, but `query_links()` does not bind this variable (the `_perspective_uuid` parameter is unused). These queries will fail with unbound-variable errors at runtime. Update the queries to remove perspective-field filtering, relying instead on the isolated per-perspective database context, as demonstrated in `test_query_without_perspective_binding()` at line 2065.
🤖 Fix all issues with AI agents
In `@rust-executor/src/surreal_service/mod.rs`:
- Around line 27-33: The status parsing in the From<SurrealLink> for
DecoratedLinkExpression implementation drops values when SurrealLink.status is
lowercase; update the match in the from(surreal_link: SurrealLink) function to
do a case-insensitive comparison (e.g., convert s to lowercase or use
eq_ignore_ascii_case) so "Shared"/"shared" and "Local"/"local" both map to
LinkStatus::Shared and LinkStatus::Local respectively, preserving migrated data.
- Around line 794-848: get_link currently accepts Option<&str> for author and
timestamp and has a fallback branch that omits them; change get_link to require
author: &str and timestamp: &str (remove the Option types), remove the
conditional/fallback branch and always run the single SELECT that binds source,
target, predicate, author and timestamp (use the existing full-constraint query
string), and eliminate the author/timestamp cloning branch (remove the tuple
match and related else branch). Update all call sites of get_link to pass the
author and timestamp strings and adjust any tests/mocks accordingly so the
UNIQUE index (in, out, predicate, author, timestamp) is always respected.
🧹 Nitpick comments (2)
rust-executor/src/surreal_service/mod.rs (2)
583-606: Avoid creating nodes on delete to prevent orphan growth.
`remove_link()` calls `ensure_node()`, which will create nodes even when deleting a non-existent link. With persistent RocksDB, this can leave orphan nodes and bloat storage. Consider deleting by `source`/`target` string fields instead of `in`/`out` when no link exists.

♻️ Possible refactor
```diff
-        let source_id = self.ensure_node(&link.data.source).await?;
-        let target_id = self.ensure_node(&link.data.target).await?;
+        let source = link.data.source.clone();
+        let target = link.data.target.clone();
         self.db
             .query(
                 "DELETE FROM link WHERE
-                    in = type::thing($source_id) AND
-                    out = type::thing($target_id) AND
+                    source = $source AND
+                    target = $target AND
                     predicate = $predicate AND
                     author = $author AND
                     timestamp = $timestamp",
             )
-            .bind(("source_id", source_id))
-            .bind(("target_id", target_id))
+            .bind(("source", source))
+            .bind(("target", target))
```
755-963: Deduplicate SurrealDB → DecoratedLinkExpression conversion.
`get_all_links`, `get_links_by_source`, `get_links_by_target`, and `get_links_by_predicate` repeat the same unwrap/deserialize logic. Extract a helper (or reuse `query_links`) to reduce duplication and keep parsing consistent.
1. Remove 'WHERE perspective = $perspective' from test queries
   - Each perspective has an isolated database, no filtering needed
   - Fixed 11 test queries in surreal_service/mod.rs
2. Make status parsing case-insensitive in SurrealLink conversion
   - Now handles 'Shared'/'shared' and 'Local'/'local' correctly
   - Preserves migrated data regardless of case
3. Require author/timestamp in get_link() signature
   - Changed from Option<&str> to &str for both params
   - Removed fallback branch (always use full unique constraint)
   - Updated 4 callsites in perspective_instance.rs
   - Enforces UNIQUE index (in, out, predicate, author, timestamp)

Co-authored-by: CodeRabbit AI <coderabbit@example.com>
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust-executor/src/perspectives/perspective_instance.rs (1)
4174-4216: Persist to SurrealDB before updating Prolog to keep source-of-truth consistent. Right now line 4212 updates Prolog before line 4214 persists. If SurrealDB persistence fails, Prolog/pubsub can diverge from the source of truth. Align this with the other write paths by persisting first.
✅ Suggested fix
```diff
-        self.update_prolog_engines(combined_diff.clone()).await;
-
-        self.persist_link_diff(&combined_diff)
-            .await
-            .expect("SurrealDB link persistence failed - critical synchronization error");
+        self.persist_link_diff(&combined_diff)
+            .await
+            .expect("SurrealDB link persistence failed - critical synchronization error");
+
+        self.update_prolog_engines(combined_diff.clone()).await;
```
🤖 Fix all issues with AI agents
In `@rust-executor/src/surreal_service/mod.rs`:
- Around line 782-804: The docs for get_link are stale: they claim author and
timestamp are optional but the signature accepts &str; update the comment for
the get_link method to remove/replace wording about optional author/timestamp
and the behavior that they are matched only when present, and instead clearly
state that author: &str and timestamp: &str are required string parameters that
must be provided and are used in the query; reference the get_link function and
its parameters (source, predicate, target, author, timestamp) when editing the
docblock.
🧹 Nitpick comments (1)
rust-executor/src/surreal_service/mod.rs (1)
847-951: Consider extracting common result processing into a helper. The `get_links_by_source`, `get_links_by_target`, and `get_links_by_predicate` methods share identical result processing logic (JSON conversion, unwrapping, deserialization to `SurrealLink`, and mapping to `DecoratedLinkExpression`).

♻️ Example helper extraction

```rust
/// Helper to convert SurrealDB query results to DecoratedLinkExpression
async fn process_link_results(
    &self,
    results: surrealdb::Response,
) -> Result<Vec<DecoratedLinkExpression>, Error> {
    let mut response = results;
    let result: SurrealValue = response.take(0)?;
    let json_string = serde_json::to_string(&result)?;
    let json_value: Value = serde_json::from_str(&json_string)?;
    let unwrapped = unwrap_surreal_json(json_value);
    if let Value::Array(arr) = unwrapped {
        let links: Vec<DecoratedLinkExpression> = arr
            .into_iter()
            .filter_map(|value| serde_json::from_value::<SurrealLink>(value).ok())
            .map(|surreal_link| surreal_link.into())
            .collect();
        return Ok(links);
    }
    Ok(vec![])
}
```

Then each `get_links_by_*` method becomes:

```rust
pub async fn get_links_by_source(
    &self,
    _perspective_uuid: &str,
    source: &str,
) -> Result<Vec<DecoratedLinkExpression>, Error> {
    let results = self
        .db
        .query("SELECT * FROM link WHERE source = $source")
        .bind(("source", source.to_string()))
        .await?;
    self.process_link_results(results).await
}
```
```rust
/// Get a specific link by its unique constraint fields
///
/// # Arguments
/// * `_perspective_uuid` - UUID of the perspective (unused, for API consistency)
/// * `source` - Source address of the link
/// * `predicate` - Optional predicate of the link (defaults to empty string if None)
/// * `target` - Target address of the link
/// * `author` - Optional author DID to match (if None, only matches source/predicate/target)
/// * `timestamp` - Optional timestamp to match (if None, only matches source/predicate/target)
///
/// # Returns
/// * `Ok(Some(DecoratedLinkExpression))` - The matching link if found
/// * `Ok(None)` - No matching link found
/// * `Err(Error)` - Database query error
pub async fn get_link(
    &self,
    _perspective_uuid: &str,
    source: &str,
    predicate: Option<&str>,
    target: &str,
    author: &str,
    timestamp: &str,
) -> Result<Option<DecoratedLinkExpression>, Error> {
```
There was a problem hiding this comment.
Fix stale documentation for get_link parameters.
The doc comments describe `author` and `timestamp` as optional, but the actual function signature requires them as `&str`. This documentation is outdated from when these parameters were `Option<&str>`.
📝 Proposed fix
- /// * `author` - Optional author DID to match (if None, only matches source/predicate/target)
- /// * `timestamp` - Optional timestamp to match (if None, only matches source/predicate/target)
+ /// * `author` - Author DID of the link
+ /// * `timestamp` - Timestamp of the link

🤖 Prompt for AI Agents
In `@rust-executor/src/surreal_service/mod.rs` around lines 782 - 804, The docs
for get_link are stale: they claim author and timestamp are optional but the
signature accepts &str; update the comment for the get_link method to
remove/replace wording about optional author/timestamp and the behavior that
they are matched only when present, and instead clearly state that author: &str
and timestamp: &str are required string parameters that must be provided and are
used in the query; reference the get_link function and its parameters (source,
predicate, target, author, timestamp) when editing the docblock.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust-executor/src/perspectives/perspective_instance.rs (1)
4207-4227: Persist to SurrealDB before updating Prolog/pubsub to keep source-of-truth consistent. If persistence fails, Prolog and subscribers may observe changes that never reach storage. Persist first, then update derived systems.
🔧 Suggested ordering
```diff
-        // Update both Prolog engines: subscription (immediate) + query (lazy)
-        self.update_prolog_engines(combined_diff.clone()).await;
-
-        self.persist_link_diff(&combined_diff)
-            .await
-            .expect("SurrealDB link persistence failed - critical synchronization error");
+        // Persist first (source of truth), then update derived systems
+        self.persist_link_diff(&combined_diff).await?;
+        self.update_prolog_engines(combined_diff.clone()).await;
```
🤖 Fix all issues with AI agents
In `@rust-executor/src/perspectives/mod.rs`:
- Around line 153-158: Replace the panic-causing .expect() on
SurrealDBService::new with propagated error handling to match the function's
Result return style: call crate::surreal_service::SurrealDBService::new("ad4m",
&handle.uuid, data_path.as_deref()).await and map any error into a String (e.g.
.map_err(|e| format!("Failed to create SurrealDB service for perspective: {}",
e))?) so the error is returned instead of panicking; this mirrors how the
subsequent migration call propagates errors.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@rust-executor/src/perspectives/mod.rs`:
- Around line 84-106: Replace the panicking .expect() on SurrealDBService::new
with graceful error handling consistent with the migration block: call
SurrealDBService::new(...).await and if it returns Err, log an error including
the perspective id (handle_clone.uuid) and the error, then return early from the
spawned task (same behavior as
migration::migrate_links_from_rusqlite_to_surrealdb handling); mirror the
logging style used in the migration block so failures are surfaced instead of
panicking (see add_perspective() for propagation style).
Complements commit e57b8f8 which fixed the same issue in add_perspective(). This fix addresses the spawned task in initialize_from_db() which also had a panicking .expect() call.

Changes:
- Replace .expect() with match expression
- Log error and return early on failure
- Prevents panic if SurrealDB creation fails (RocksDB lock, disk, permissions)

Addresses CodeRabbit feedback on PR #652 (line 90 issue)
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust-executor/src/perspectives/perspective_instance.rs (1)
4190-4233: ⚠️ Potential issue | 🟠 Major: Persist before Prolog/pubsub to avoid partial updates.
Right now Prolog updates fire before the DB write; if persistence fails, subscribers can observe state that isn’t durable. Persist first, then update Prolog/pubsub.
🛠️ Proposed fix
```diff
-        self.update_prolog_engines(combined_diff.clone()).await;
-
-        self.persist_link_diff(&combined_diff)
-            .await
-            .expect("SurrealDB link persistence failed - critical synchronization error");
+        self.persist_link_diff(&combined_diff).await?;
+        self.update_prolog_engines(combined_diff.clone()).await;
```
🤖 Fix all issues with AI agents
In `@rust-executor/Cargo.toml`:
- Line 137: The Cargo.toml dependency enables SurrealDB with the kv-rocksdb
feature which requires a C/C++11-capable compiler and LLVM/clang for bindgen;
update CI and developer docs to ensure those toolchains are installed on runners
(or switch CI to use the kv-mem feature), and add guidance to set LIBCLANG_PATH
or LLVM_CONFIG_PATH when autodetection fails; reference the surrealdb dependency
line and the kv-rocksdb feature in Cargo.toml and ensure CI job images/install
steps include a C/C++ compiler and LLVM toolchain so cargo build/bindgen
succeed.
In `@rust-executor/src/perspectives/perspective_instance.rs`:
- Around line 912-949: The code currently calls
self.persist_link_diff(...).await.expect(...) which panics on SurrealDB write
failures; replace the expect() with proper error propagation (use the ? operator
or map_err to convert to the function's Result error type) so the caller can
handle/retry, updating the enclosing async function's signature/return type if
needed; apply the same change to all other calls to persist_link_diff in
PerspectiveInstance (and related methods) so failures are propagated instead of
panicking (see methods using persist_link_diff, update_prolog_engines,
pubsub_publish_diff, and spawn_commit_and_handle_error in
perspective_instance.rs).
In `@rust-executor/src/surreal_service/mod.rs`:
- Around line 773-777: In get_all_links (and similarly in get_link and all
get_links_by_* methods) change the silent filter_map(|value|
serde_json::from_value::<SurrealLink>(value).ok()) behavior so deserialization
failures are not dropped silently: attempt serde_json::from_value on each
result, if Ok(converted) push converted.into() into links, and if Err(e) emit a
warning-level log that includes the error (e) and the offending value (or an
identifying id) so schema drift/corruption is visible; keep collecting
successful conversions into the final Vec<DecoratedLinkExpression>. Ensure you
reference the SurrealLink type, the results iterator, and the
DecoratedLinkExpression conversion logic when making this change so it is
applied consistently across get_all_links, get_link, and the get_links_by_*
methods.
🧹 Nitpick comments (3)
rust-executor/src/surreal_service/mod.rs (3)
584-588: `remove_link` creates orphan nodes as a side effect via `ensure_node`.
`ensure_node` creates a node record when one doesn't exist. Calling it during link removal means that removing a link whose source/target nodes were already cleaned up (or never existed) will create new orphan node records. Since the node ID is a deterministic SHA-256 hash of the URI, you can compute the ID directly without the create-if-absent behaviour.

Proposed helper to compute node ID without side effects
```diff
+    /// Compute the deterministic node ID for a URI without creating it in the DB.
+    fn node_id_for_uri(uri: &str) -> String {
+        use sha2::{Digest, Sha256};
+        let mut hasher = Sha256::new();
+        hasher.update(uri.as_bytes());
+        let hash = format!("{:x}", hasher.finalize());
+        format!("node:{}", &hash[..16])
+    }
+
     pub async fn remove_link(
         &self,
         _perspective_uuid: &str,
         link: &DecoratedLinkExpression,
     ) -> Result<(), Error> {
-        let source_id = self.ensure_node(&link.data.source).await?;
-        let target_id = self.ensure_node(&link.data.target).await?;
+        let source_id = Self::node_id_for_uri(&link.data.source);
+        let target_id = Self::node_id_for_uri(&link.data.target);
```

This also avoids unnecessary write retries and DB round-trips on a removal path.
765-780: `get_all_links` routes through `query_links` (with timeout/logging overhead) while the other `get_links_by_*` methods query `db` directly.
This is inconsistent: `get_all_links` pays the cost of a 60-second timeout wrapper, logging-task spawn, and readonly validation for a hardcoded `SELECT *`, while the sibling methods (`get_links_by_source`, etc.) skip all of that. Consider aligning the approach: either all helpers go direct, or all go through `query_links`.
847-951: Significant duplication across `get_links_by_source`, `get_links_by_target`, and `get_links_by_predicate`.
These three methods (and `get_link`) share identical boilerplate: `db.query` → `response.take(0)` → JSON round-trip → `unwrap_surreal_json` → `filter_map` deserialize. Consider extracting a private helper like:

```rust
async fn query_as_links(
    &self,
    query: &str,
    bindings: Vec<(&str, String)>,
) -> Result<Vec<DecoratedLinkExpression>, Error> { ... }
```

This would reduce ~100 lines of duplicated conversion logic to single call sites.
```diff
 deno_error = "0.5.6"
 thiserror = "2.0.12"
-surrealdb = { version = "2.4", default-features = false, features = ["kv-mem", "scripting"] }
+surrealdb = { version = "2.4", default-features = false, features = ["kv-rocksdb", "kv-mem", "scripting"] }
```
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
SurrealDB 2.4 kv-rocksdb feature RocksDB build requirements
💡 Result:
For SurrealDB 2.4.x, enabling the RocksDB storage engine is done via the Cargo feature kv-rocksdb (e.g. cargo build --features surrealdb/kv-rocksdb). [1]
What you need installed to build with kv-rocksdb
Because this feature compiles native RocksDB bindings, you generally need:
- A C/C++ build toolchain (compiler + linker; RocksDB is C++). RocksDB requires a C++11-capable compiler (e.g., GCC/Clang). [5]
- Clang/LLVM (libclang) available for bindgen (the RocksDB Rust bindings require this). [3][4]
- On macOS/Linux you may need to set `LIBCLANG_PATH`/`LLVM_CONFIG_PATH` if autodetection fails (common when compiling `librocksdb-sys`). [6]
- (Platform packaging hint) Some distributions/ports express this as needing `llvm-config` at build time and linking against `librocksdb` (and often `zstd`). [2]
Optional / common native deps
- RocksDB can be built with optional compression libraries such as snappy, lz4, zstd, zlib, bzip2. Whether they’re needed depends on how the Rust RocksDB wrapper is configured/features enabled, but they’re common native deps in RocksDB builds. [5]
Notes
- If you enable RocksDB LTO in the RocksDB wrapper, it specifically requires building with clang and matching LLVM versions/flags. [3]
Sources: SurrealDB blog/docs and crate docs for enabling kv-rocksdb and bindgen/libclang requirements, plus RocksDB install notes and packaging metadata. [1][2][3][4][5][6]
🌐 Web query:
SurrealDB cargo features kv-rocksdb kv-mem scripting compatibility 2.4
💡 Result:
For SurrealDB Rust crates in the 2.4.x line, the Cargo feature flags you mentioned are present and compatible:
- `kv-mem`: in-memory storage engine (and the default for `surrealdb-core` 2.4.1). (docs.rs)
- `kv-rocksdb`: RocksDB-backed storage engine. (Enabling embedded engines requires turning on the corresponding `kv-*` feature; RocksDB also brings native build deps.) (docs.rs)
- `scripting`: enables SurrealDB's scripting support (wired to a `js` dependency in `surrealdb-core`). (docs.rs)
In the top-level surrealdb crate, these features map through to surrealdb-core (e.g., scripting = ["surrealdb-core/scripting"], kv-rocksdb = ["surrealdb-core/kv-rocksdb", ...]). (docs.rs)
Example Cargo.toml (2.4.x) enabling all three:
```toml
[dependencies]
surrealdb = { version = "2.4", default-features = false, features = ["kv-mem", "kv-rocksdb", "scripting"] }
```

Citations:
- 1: https://docs.rs/crate/surrealdb-core/2.4.1/source/Cargo.toml.orig?utm_source=openai
- 2: https://docs.rs/crate/surrealdb-core/2.4.1/source/Cargo.toml.orig?utm_source=openai
- 3: https://docs.rs/crate/surrealdb-core/2.4.1/source/Cargo.toml.orig?utm_source=openai
- 4: https://docs.rs/crate/surrealdb/latest/source/Cargo.toml.orig?utm_source=openai
Ensure CI has C/C++ compiler and LLVM/clang for building with kv-rocksdb.
The feature combination is supported in SurrealDB 2.4, but enabling kv-rocksdb requires native build dependencies: a C++11-capable C/C++ compiler and LLVM/clang (for bindgen to compile RocksDB bindings). Verify that CI runners and local build environments have these toolchains available; you may also need to set LIBCLANG_PATH or LLVM_CONFIG_PATH if autodetection fails.
🤖 Prompt for AI Agents
In `@rust-executor/Cargo.toml` at line 137, The Cargo.toml dependency enables
SurrealDB with the kv-rocksdb feature which requires a C/C++11-capable compiler
and LLVM/clang for bindgen; update CI and developer docs to ensure those
toolchains are installed on runners (or switch CI to use the kv-mem feature),
and add guidance to set LIBCLANG_PATH or LLVM_CONFIG_PATH when autodetection
fails; reference the surrealdb dependency line and the kv-rocksdb feature in
Cargo.toml and ensure CI job images/install steps include a C/C++ compiler and
LLVM toolchain so cargo build/bindgen succeed.
```diff
         // Query SurrealDB instead of Rusqlite
         if let Some(decorated_link) = self
             .surreal_service
             .get_link(
                 &handle.uuid,
                 &link_expression.data.source,
                 link_expression.data.predicate.as_deref(),
                 &link_expression.data.target,
                 &link_expression.author,
                 &link_expression.timestamp,
             )
             .await?
         {
             Ad4mDb::with_global_instance(|db| db.remove_link(&handle.uuid, &link_expression))?;
             let link_from_db = LinkExpression::from(decorated_link.clone());
             let status = decorated_link.status.clone().unwrap_or(LinkStatus::Local);

             let diff = PerspectiveDiff::from_removals(vec![link_expression.clone()]);
-            let decorated_link = DecoratedLinkExpression::from((link_from_db, status.clone()));
+            let decorated_link_result =
+                DecoratedLinkExpression::from((link_from_db, status.clone()));
             let decorated_diff =
-                DecoratedPerspectiveDiff::from_removals(vec![decorated_link.clone()]);
+                DecoratedPerspectiveDiff::from_removals(vec![decorated_link_result.clone()]);

             // Remove from SurrealDB (primary storage)
             self.persist_link_diff(&decorated_diff)
                 .await
                 .expect("SurrealDB link persistence failed - critical synchronization error");

             // Update both Prolog engines: subscription (immediate) + query (lazy)
             self.update_prolog_engines(decorated_diff.clone()).await;

             self.update_surreal_cache(&decorated_diff).await;
             self.pubsub_publish_diff(decorated_diff.clone()).await;

             if status == LinkStatus::Shared {
                 self.spawn_commit_and_handle_error(&diff);
             }

-            Ok(decorated_link)
+            Ok(decorated_link_result)
         } else {
```
There was a problem hiding this comment.
Avoid panicking on SurrealDB write failure.
This path is user-facing; expect() will crash the executor on transient IO/lock errors. Prefer propagating the error so callers can handle/retry, and apply the same change in other persist_link_diff call sites.
🛠️ Proposed fix
- self.persist_link_diff(&decorated_diff)
- .await
- .expect("SurrealDB link persistence failed - critical synchronization error");
+ self.persist_link_diff(&decorated_diff).await?;🤖 Prompt for AI Agents
In `@rust-executor/src/perspectives/perspective_instance.rs` around lines 912 -
949, The code currently calls self.persist_link_diff(...).await.expect(...)
which panics on SurrealDB write failures; replace the expect() with proper error
propagation (use the ? operator or map_err to convert to the function's Result
error type) so the caller can handle/retry, updating the enclosing async
function's signature/return type if needed; apply the same change to all other
calls to persist_link_diff in PerspectiveInstance (and related methods) so
failures are propagated instead of panicking (see methods using
persist_link_diff, update_prolog_engines, pubsub_publish_diff, and
spawn_commit_and_handle_error in perspective_instance.rs).
```rust
let links: Vec<DecoratedLinkExpression> = results
    .into_iter()
    .filter_map(|value| serde_json::from_value::<SurrealLink>(value).ok())
    .map(|surreal_link| surreal_link.into())
    .collect();
```
There was a problem hiding this comment.
Silent deserialization failures are dropped via filter_map(|v| ... .ok()).
This pattern appears in get_all_links (line 775), get_link (line 829), and all get_links_by_* methods. If a stored row can't be deserialized into SurrealLink (e.g., schema drift, corrupted data from migration), it's silently excluded from results with no log or error. Consider logging a warning on deserialization failure so data issues don't go unnoticed.
Example fix for get_all_links (apply similarly elsewhere)
let links: Vec<DecoratedLinkExpression> = results
.into_iter()
- .filter_map(|value| serde_json::from_value::<SurrealLink>(value).ok())
+ .filter_map(|value| {
+ match serde_json::from_value::<SurrealLink>(value.clone()) {
+ Ok(link) => Some(link),
+ Err(e) => {
+ log::warn!("Failed to deserialize SurrealLink: {:?} from value: {:?}", e, value);
+ None
+ }
+ }
+ })
.map(|surreal_link| surreal_link.into())
.collect();📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```rust
let links: Vec<DecoratedLinkExpression> = results
    .into_iter()
    .filter_map(|value| {
        match serde_json::from_value::<SurrealLink>(value.clone()) {
            Ok(link) => Some(link),
            Err(e) => {
                log::warn!("Failed to deserialize SurrealLink: {:?} from value: {:?}", e, value);
                None
            }
        }
    })
    .map(|surreal_link| surreal_link.into())
    .collect();
```
🤖 Prompt for AI Agents
In `@rust-executor/src/surreal_service/mod.rs` around lines 773 - 777, In
get_all_links (and similarly in get_link and all get_links_by_* methods) change
the silent filter_map(|value| serde_json::from_value::<SurrealLink>(value).ok())
behavior so deserialization failures are not dropped silently: attempt
serde_json::from_value on each result, if Ok(converted) push converted.into()
into links, and if Err(e) emit a warning-level log that includes the error (e)
and the offending value (or an identifying id) so schema drift/corruption is
visible; keep collecting successful conversions into the final
Vec<DecoratedLinkExpression>. Ensure you reference the SurrealLink type, the
results iterator, and the DecoratedLinkExpression conversion logic when making
this change so it is applied consistently across get_all_links, get_link, and
the get_links_by_* methods.
Summary
This PR migrates perspective link storage from in-memory SurrealDB to persistent file-based SurrealDB databases using RocksDB. Each perspective now has its own dedicated database file, eliminating data duplication and providing persistence across restarts.
Motivation
Previous Architecture:
New Architecture:
`{app_data_path}/surrealdb_perspectives/{perspective_uuid}/`

Key Changes
Storage Layer (`src/surreal_service/mod.rs`)
- `Mem` engine to `RocksDb` engine with file-based storage
- `add_link()` idempotent (handles concurrent writes gracefully)
- `remove_link()` precision (includes author + timestamp)

Perspective Management (`src/perspectives/mod.rs`, `perspective_instance.rs`)
- `get_links_local()`, `add_link()`, `remove_link()` to use SurrealDB

Migration System (
`src/perspectives/migration.rs` ⭐ NEW)
Self-contained one-time migration module:
Database Support (`src/db.rs`)
Added migration tracking:
- `perspective_link_migration`
- `is_perspective_migrated()`, `mark_perspective_as_migrated()`, `delete_all_links_for_perspective()`

On first startup after upgrade:
DB File Structure
{app_data_path}/
└── surrealdb_perspectives/
├── {perspective-uuid-1}/ # RocksDB database files
├── {perspective-uuid-2}/
└── {perspective-uuid-3}/
Future Cleanup
The migration code (src/perspectives/migration.rs) can be removed after all users have upgraded (recommended: 2-3 releases from now). See MIGRATION_REMOVAL_GUIDE.md for removal instructions.
Summary by CodeRabbit
New Features
Bug Fixes
Documentation