Switch SurrealDB from In-Memory to File-Based RocksDB Storage#652
data-coasys merged 26 commits into dev from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Adds per-perspective SurrealDB persistence with an optional RocksDB backend, a one-time idempotent migration from Rusqlite to SurrealDB, and migration tracking and cleanup APIs; rewrites the perspective read/write paths to use SurrealDB.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Perspectives as Perspectives Module
    participant Rusqlite as Ad4mDb (Rusqlite)
    participant Migration as Migration Module
    participant Surreal as SurrealDB Service
    participant Rocks as RocksDB
    Perspectives->>Surreal: new(namespace, db, data_path)
    Surreal->>Rocks: Initialize RocksDB (if data_path)
    Perspectives->>Migration: migrate_links_from_rusqlite_to_surrealdb(perspective_uuid)
    Migration->>Rusqlite: is_perspective_migrated(perspective_uuid)?
    alt already migrated
        Migration-->>Perspectives: return (idempotent)
    else
        Migration->>Rusqlite: load all links(perspective_uuid)
        loop per link
            Migration->>Migration: convert to DecoratedLinkExpression
            Migration->>Surreal: add link -> persist
            Surreal->>Rocks: persist edge (if RocksDB)
        end
        Migration->>Rusqlite: mark_perspective_as_migrated(perspective_uuid)
        Migration->>Rusqlite: delete_all_links_for_perspective(perspective_uuid)
        Migration-->>Perspectives: return success
    end
```
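The migration flow in the diagram above can be sketched as a minimal, dependency-free Rust outline. Note this is a simplified stand-in, not the PR's implementation: `MigrationStore` and `migrate_perspective` are hypothetical names standing in for `Ad4mDb` and the SurrealDB service, and links are plain strings instead of `DecoratedLinkExpression`.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the Rusqlite-backed migration tracker.
#[derive(Default)]
struct MigrationStore {
    migrated: HashSet<String>,           // perspectives already migrated
    links: HashMap<String, Vec<String>>, // perspective uuid -> legacy links
}

/// One-time, idempotent migration of a perspective's links.
/// Returns the number of links copied into the new store.
fn migrate_perspective(
    store: &mut MigrationStore,
    new_store: &mut Vec<String>,
    uuid: &str,
) -> usize {
    // Idempotency: skip perspectives that were already migrated.
    if store.migrated.contains(uuid) {
        return 0;
    }
    let links = store.links.remove(uuid).unwrap_or_default();
    let count = links.len();
    // Persist every link in the new backend...
    new_store.extend(links);
    // ...and only then mark the perspective as migrated.
    store.migrated.insert(uuid.to_string());
    count
}

fn main() {
    let mut store = MigrationStore::default();
    store
        .links
        .insert("p1".into(), vec!["a->b".into(), "b->c".into()]);
    let mut new_store = Vec::new();
    assert_eq!(migrate_perspective(&mut store, &mut new_store, "p1"), 2);
    // A second run is a no-op thanks to the migrated marker.
    assert_eq!(migrate_perspective(&mut store, &mut new_store, "p1"), 0);
}
```

Marking the perspective only after all links were copied is what makes the "already migrated" fast path in the diagram safe to take.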
```mermaid
sequenceDiagram
    participant Client
    participant PI as PerspectiveInstance
    participant Surreal as SurrealDB Service
    participant Rocks as RocksDB
    Client->>PI: add_link_expression(link)
    PI->>PI: build DecoratedPerspectiveDiff
    PI->>PI: persist_link_diff(diff)
    PI->>Surreal: remove_links(matching old)
    Surreal->>Rocks: delete edges
    PI->>Surreal: add DecoratedLinkExpressions
    Surreal->>Rocks: insert edges
    PI->>PI: update Prolog/pubsub
    PI-->>Client: return result
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
🚥 Pre-merge checks: 3 passed
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust-executor/src/perspectives/perspective_instance.rs (1)
2425-2457: Propagate SurrealDB write failures to callers.
`persist_link_diff`/`retry_surreal_op` currently log warnings and silently continue on non-retriable errors. With SurrealDB as the primary store, this risks desynchronization between persistence and Prolog/pubsub state. Change both to return `Result` and propagate failures. Most callsites (`remove_link`, `add_link_expression`, `add_links`, `link_mutations`, `update_link`, `commit_pending_diffs`) already return `Result<_, AnyError>` and can propagate with `?`. For `diff_from_link_language`, which returns `()`, use `.await.expect()` to fail fast on critical synchronization operations.

🛠️ Suggested fix
```diff
-async fn retry_surreal_op<F, Fut>(op: F, uuid: &str, op_name: &str)
+async fn retry_surreal_op<F, Fut>(
+    op: F,
+    uuid: &str,
+    op_name: &str,
+) -> Result<(), anyhow::Error>
 where
     F: Fn() -> Fut,
     Fut: std::future::Future<Output = Result<(), anyhow::Error>>,
 {
     let mut attempts = 0;
     let max_attempts = 5;
     loop {
         match op().await {
-            Ok(_) => break,
+            Ok(_) => return Ok(()),
             Err(e) => {
                 let msg = format!("{}", e);
                 if msg.contains("Failed to commit transaction due to a read or write conflict")
                     && attempts < max_attempts
                 {
                     attempts += 1;
                     tokio::time::sleep(std::time::Duration::from_millis(100 * attempts)).await;
                     continue;
                 } else {
                     log::warn!(
                         "Failed to {} link in SurrealDB for perspective {}: {:?}",
                         op_name,
                         uuid,
                         e
                     );
-                    break;
+                    return Err(e);
                 }
             }
         }
     }
 }

-pub(crate) async fn persist_link_diff(&self, diff: &DecoratedPerspectiveDiff) {
+pub(crate) async fn persist_link_diff(
+    &self,
+    diff: &DecoratedPerspectiveDiff,
+) -> Result<(), AnyError> {
     // Get UUID
     let uuid = {
         let persisted_guard = self.persisted.lock().await;
         persisted_guard.uuid.clone()
     };
@@
     for removal in &diff.removals {
         Self::retry_surreal_op(
             || self.surreal_service.remove_link(&uuid, removal),
             &uuid,
             "remove",
         )
-        .await;
+        .await?;
     }

     // Additions after
     for addition in &diff.additions {
         Self::retry_surreal_op(
             || self.surreal_service.add_link(&uuid, addition),
             &uuid,
             "add",
         )
-        .await;
+        .await?;
     }
+    Ok(())
 }
```

Then update callsites:
- Functions returning `Result`: change `.await;` to `.await?;`
- `diff_from_link_language`: change `.await;` to `.await.expect("SurrealDB link persistence failed");`
🤖 Fix all issues with AI agents
In `@rust-executor/MIGRATION_REMOVAL_GUIDE.md`:
- Around line 112-113: Update the heading text "Can optionally be cleaned up:"
in MIGRATION_REMOVAL_GUIDE.md to read "The following can optionally be cleaned
up:" so it becomes a complete sentence; locate the existing heading string (the
line containing "Can optionally be cleaned up:") and replace it with the
proposed wording, leaving the subsequent list (e.g., "- `src/db.rs` - Migration
tracking methods (optional to remove)") unchanged.
In `@rust-executor/src/perspectives/migration.rs`:
- Around line 97-119: The migration currently marks the perspective as migrated
and proceeds to delete links even when some Link migrations failed (tracked by
error_count), risking data loss; update the logic in the function that runs the
loop (the block using migrated_count and error_count and calling
Ad4mDb::with_global_instance(...).mark_perspective_as_migrated) to check
error_count after the migration loop and if error_count > 0 return an Err (e.g.,
Err(format!(...)) converted to the function's error type) instead of calling
Ad4mDb::with_global_instance(...).mark_perspective_as_migrated; alternatively,
if you prefer partial success, only delete links you recorded as successfully
migrated (use migrated_count and whatever collection tracks successful link IDs)
and only mark the perspective as migrated when error_count == 0, referencing
error_count, migrated_count, and
Ad4mDb::with_global_instance/mark_perspective_as_migrated to locate where to
change the control flow.
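The control-flow change this prompt asks for can be sketched as a small, self-contained Rust function. The name `finalize_migration` is hypothetical; in the real code the same gate would sit after the migration loop in `migration.rs`, before `mark_perspective_as_migrated` and the link deletion.

```rust
/// Hypothetical migration gate: only report success (and thereby allow
/// marking the perspective as migrated and deleting the legacy rows)
/// when every link migrated cleanly.
fn finalize_migration(migrated_count: usize, error_count: usize) -> Result<usize, String> {
    if error_count > 0 {
        // Leave the legacy Rusqlite data untouched so the migration
        // can be retried on the next startup.
        return Err(format!(
            "{} of {} links failed to migrate; keeping Rusqlite data",
            error_count,
            migrated_count + error_count
        ));
    }
    // Safe to mark migrated and delete the legacy links now.
    Ok(migrated_count)
}

fn main() {
    assert_eq!(finalize_migration(10, 0), Ok(10));
    assert!(finalize_migration(8, 2).is_err());
}
```

The key point is that the error path returns before any destructive step runs, which is what prevents the data-loss scenario described above.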
In `@rust-executor/src/perspectives/mod.rs`:
- Around line 441-457: The code currently maps instance.links into
DecoratedLinkExpression always using LinkStatus::Local; instead preserve each
link's original status by reading the link's status field
(LinkExpression.status) and passing that through to DecoratedLinkExpression (or
convert Option<LinkStatus> into the expected status enum), falling back to
LinkStatus::Local only when the original status is None; update the mapping in
the instance.links.into_iter().map(...) call (where
DecoratedLinkExpression::from((link, LinkStatus::Local)) is used) so it
constructs the decorated link with the actual link.status instead of hardcoding
LinkStatus::Local before calling perspective.persist_link_diff(&diff).await.
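The status-preserving fallback the prompt describes can be shown with simplified stand-ins for the real types (`LinkStatus` and `LinkExpression` here are stripped down to the one field that matters):

```rust
#[derive(Clone, Debug, PartialEq)]
enum LinkStatus {
    Shared,
    Local,
}

struct LinkExpression {
    status: Option<LinkStatus>,
}

/// Preserve the link's original status, falling back to Local
/// only when no status was recorded.
fn effective_status(link: &LinkExpression) -> LinkStatus {
    link.status.clone().unwrap_or(LinkStatus::Local)
}

fn main() {
    let shared = LinkExpression { status: Some(LinkStatus::Shared) };
    let unknown = LinkExpression { status: None };
    assert_eq!(effective_status(&shared), LinkStatus::Shared);
    assert_eq!(effective_status(&unknown), LinkStatus::Local);
}
```

Hardcoding `LinkStatus::Local` in the map closure loses the `Shared` marker on import; `unwrap_or` keeps it while still covering the `None` case.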
- Around line 56-66: The current read-then-release check on PERSPECTIVES allows
a TOCTOU race where multiple tasks calling initialize_from_db can each pass the
contains_key check and create duplicate SurrealDB services; change this to an
atomic operation by taking a write lock and inserting a placeholder entry (e.g.,
a "pending" marker) for handle_clone.uuid before spawning the async
initialization, or perform an entry-or-init pattern that checks and inserts
under the same write lock so only one initializer proceeds; ensure the eventual
successful initialization replaces the placeholder and failures remove it so
initialize_from_db (and the GraphQL import mutation) cannot start duplicate
initializations.
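The entry-or-init pattern suggested here can be reduced to a small sketch: the check and the placeholder insert happen under one lock, so only one caller ever "wins" a given uuid. `try_claim` and the `bool` placeholder are hypothetical simplifications of the PERSPECTIVES map.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Check-and-insert under a single lock: because the contains_key check
/// and the placeholder insert are atomic with respect to other callers,
/// the TOCTOU window between "check" and "insert" disappears.
fn try_claim(registry: &Arc<Mutex<HashMap<String, bool>>>, uuid: &str) -> bool {
    let mut map = registry.lock().unwrap();
    if map.contains_key(uuid) {
        return false; // someone else already initialized, or is initializing
    }
    map.insert(uuid.to_string(), false); // false = pending placeholder
    true
}

fn main() {
    let registry = Arc::new(Mutex::new(HashMap::new()));
    assert!(try_claim(&registry, "p1"));  // first caller wins
    assert!(!try_claim(&registry, "p1")); // second caller backs off
}
```

In the real code the winner would replace the placeholder with the initialized service on success and remove it on failure, so a failed initialization can be retried.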
In `@rust-executor/src/perspectives/perspective_instance.rs`:
- Around line 878-891: The SurrealDB lookup is too broad: update the lookup to
honor the full unique constraint (source, target, predicate, author, timestamp)
instead of only (source, target, predicate). Modify surreal_service::get_link
(or the callsites remove_link, update_link, add_shared_links) so it accepts/uses
author and timestamp from the existing LinkExpression/old_link/link parameters
and queries/matches all five fields (link_unique_idx) or filters fetched results
by author+timestamp before returning; ensure get_link no longer returns an
arbitrary LIMIT 1 row when multiple authors/timestamps exist and that callers
(e.g., the code creating decorated_link/LinkExpression) rely on the corrected
lookup.
In `@rust-executor/src/surreal_service/mod.rs`:
- Around line 532-576: The error handling in the add_link flow currently treats
any error string containing "index" as a benign unique-constraint issue; update
the match branch that inspects error_string (the Err(e) arm where result is
matched) to remove the check for error_string.contains("index") so only checks
for "unique", "duplicate", and "already exists" remain; this ensures genuine
index/schema errors are propagated instead of being swallowed while keeping
idempotent handling for duplicate-link errors.
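The narrowed error classification can be sketched as a pure function. The error message strings in the assertions are illustrative, not actual SurrealDB output.

```rust
/// Classify an error string: only genuine duplicate-key wording is
/// treated as benign. Notably there is no `contains("index")` check,
/// so real index/schema errors propagate instead of being swallowed.
fn is_duplicate_link_error(error_string: &str) -> bool {
    let msg = error_string.to_lowercase();
    msg.contains("unique") || msg.contains("duplicate") || msg.contains("already exists")
}

fn main() {
    // Illustrative messages only.
    assert!(is_duplicate_link_error("Duplicate entry for unique constraint"));
    assert!(is_duplicate_link_error("record already exists"));
    // A genuine index build failure must NOT be swallowed:
    assert!(!is_duplicate_link_error("failed to build index on field source"));
}
```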
🧹 Nitpick comments (2)
rust-executor/src/perspectives/mod.rs (1)
80-93: Migration failure logging is appropriate, but consider user visibility. Migration failures are logged at error level but the perspective still initializes. This is defensive, but users won't see these errors in a UI. Consider whether a more visible notification mechanism is needed for production.
rust-executor/src/surreal_service/mod.rs (1)
756-904: Consider DRYing the SurrealLink deserialization path. The JSON unwrap + SurrealLink mapping is repeated across the getters; a small helper would reduce drift.
- MIGRATION_REMOVAL_GUIDE.md: Complete sentence in heading
- migration.rs: Only mark as migrated when error_count == 0 (prevents data loss)
- surreal_service/mod.rs: Remove overly broad 'index' error check (more precise error handling)

Addresses CodeRabbit actionable comments on PR #652
Instead of hardcoding LinkStatus::Local, the code now reads link.status and uses it (falling back to Local if None). This preserves the original link status during import operations. Addresses CodeRabbit actionable comment on PR #652
- retry_surreal_op now returns Result and propagates errors
- persist_link_diff now returns Result instead of silently swallowing errors
- Updated all callsites:
  - Functions returning Result: use .await? to propagate
  - Functions returning (): use .await.expect() to fail fast
- Critical synchronization operations now fail loudly instead of silently

Addresses CodeRabbit actionable comment #1 on PR #652
Updated get_link to accept and use author and timestamp parameters:
- get_link now takes optional author and timestamp
- When provided, queries using all 5 unique fields (source, target, predicate, author, timestamp)
- When not provided, falls back to the 3-field lookup for backward compatibility
- Updated all callsites to pass author and timestamp from LinkExpression

This prevents returning arbitrary links when multiple authors/timestamps exist for the same source/predicate/target combination.

Addresses CodeRabbit actionable comment #6 on PR #652
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@rust-executor/src/surreal_service/mod.rs`:
- Around line 792-829: The get_link implementation only filters by
source/target/predicate but the unique key includes author and timestamp, so
modify the get_link signature to accept author: &str and timestamp: &str (or
appropriate types), update the SQL query string to add AND author = $author AND
timestamp = $timestamp and bind those values (update parameter variables like
predicate_str, source_owned, target_owned to include author_owned and
timestamp_owned), and return the same SurrealLink conversion; then update all
call sites that invoke get_link (the four places in perspective_instance.rs that
pass a link_expression) to supply &link_expression.author and
&link_expression.timestamp.
🧹 Nitpick comments (1)
rust-executor/src/perspectives/mod.rs (1)
303-318: Use async file operations to avoid blocking the executor.
`std::fs::remove_dir_all` is a blocking call within an async function. This can block the tokio executor thread, potentially starving other tasks. Consider using `tokio::fs::remove_dir_all` for non-blocking directory removal.

♻️ Proposed fix to use async file operations

```diff
 // Clean up RocksDB directory for this perspective
 if let Some(data_path) = get_app_data_path() {
     let db_path = std::path::Path::new(&data_path)
         .join(format!("surrealdb_perspectives/{}", uuid));
     if db_path.exists() {
-        if let Err(e) = std::fs::remove_dir_all(&db_path) {
+        if let Err(e) = tokio::fs::remove_dir_all(&db_path).await {
             log::warn!(
                 "Failed to remove SurrealDB directory for perspective {}: {}",
                 uuid,
                 e
             );
         } else {
             log::debug!("Cleaned up SurrealDB directory for perspective {}", uuid);
         }
     }
 }
```
Added atomic check-and-insert before storing perspective:
- Initial read-lock check remains for quick filtering
- After async initialization completes, do a final write-lock check
- Only insert if another task hasn't already initialized this perspective
- Discard duplicate work and don't start background tasks if the race was lost

This prevents multiple tasks from creating duplicate SurrealDB services for the same perspective UUID.

Addresses CodeRabbit actionable comment #5 on PR #652
Compilation error: link.status.unwrap_or() moves the value, preventing use of 'link' afterwards. Use clone() to avoid the partial move.
Changed 'for (link_expr, status) in links' to '&links' and cloned status to avoid moving values out of the vector.
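The borrow-vs-move fix described in these two commit messages can be demonstrated in isolation. `collect_statuses` is a hypothetical helper; the point is that iterating `&links` and cloning the status avoids a partial move, so the vector remains usable afterwards.

```rust
#[derive(Clone, Debug, PartialEq)]
enum LinkStatus {
    Shared,
    Local,
}

/// Iterating by reference borrows each tuple; cloning the status avoids
/// moving it out of the vector, so `links` is still intact afterwards.
/// (Writing `for (_, status) in links` would consume the Vec instead.)
fn collect_statuses(links: &[(String, LinkStatus)]) -> Vec<LinkStatus> {
    let mut statuses = Vec::new();
    for (_link_expr, status) in links {
        statuses.push(status.clone());
    }
    statuses
}

fn main() {
    let links = vec![
        ("a->b".to_string(), LinkStatus::Shared),
        ("b->c".to_string(), LinkStatus::Local),
    ];
    let statuses = collect_statuses(&links);
    assert_eq!(links.len(), 2); // `links` was not consumed
    assert_eq!(statuses, vec![LinkStatus::Shared, LinkStatus::Local]);
}
```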
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust-executor/src/surreal_service/mod.rs (1)
1655-1656: Test queries reference a non-existent `perspective` field. Several tests query `WHERE perspective = $perspective`, but the `link` table schema (lines 379-387) doesn't include a `perspective` field. Per the retrieved learnings, each perspective has its own isolated database, so no perspective filtering is needed. These tests will likely fail or return no results:
- Line 1655: `SELECT * FROM link WHERE perspective = $perspective ORDER BY timestamp`
- Line 1674: graph query with perspective filter
- Lines 1800, 1816, 1832, etc.
🐛 Fix the test queries
```diff
- let query = "SELECT * FROM link WHERE perspective = $perspective ORDER BY timestamp";
+ let query = "SELECT * FROM link ORDER BY timestamp";
```

And similarly for other occurrences in `test_reload_perspective_data_integrity` and `test_reload_perspective_with_large_dataset`.
🤖 Fix all issues with AI agents
In `@rust-executor/src/perspectives/perspective_instance.rs`:
- Around line 539-548: The code calls
self.surreal_service.get_all_links(&uuid).await.unwrap() inside the
link_language_clone branch which will panic if SurrealDB is unreachable; change
this to handle the Result by matching or using if let Err(e) to log the error
(use the existing logger or processLogger-like facility) and return false so the
fallback sync loop can retry; on success, proceed to map decorated_links into
local_links as before (refer to get_all_links, decorated_links, local_links,
LinkExpression, LinkStatus, and link_language_clone).
🧹 Nitpick comments (8)
rust-executor/src/perspectives/mod.rs (3)
154-157: Consider propagating SurrealDB creation errors instead of panicking. Using `.expect()` on line 157 will crash the entire application if SurrealDB service creation fails (e.g., due to disk space, permissions, or RocksDB lock issues). Since `add_perspective` already returns `Result<(), String>`, consider propagating this error:

♻️ Proposed fix to propagate error

```diff
 let surreal_service =
     crate::surreal_service::SurrealDBService::new("ad4m", &handle.uuid, data_path.as_deref())
         .await
-        .expect("Failed to create SurrealDB service for perspective");
+        .map_err(|e| format!("Failed to create SurrealDB service: {}", e))?;
```
315-330: Blocking filesystem operation in async context.
`std::fs::remove_dir_all` is a synchronous operation that could block the async executor, especially if the directory contains many files or is on a slow filesystem. Consider using `tokio::fs::remove_dir_all` for non-blocking cleanup:

♻️ Proposed fix using async filesystem operations

```diff
 // Clean up RocksDB directory for this perspective
 if let Some(data_path) = get_app_data_path() {
     let db_path = std::path::Path::new(&data_path)
         .join(format!("surrealdb_perspectives/{}", uuid));
     if db_path.exists() {
-        if let Err(e) = std::fs::remove_dir_all(&db_path) {
+        if let Err(e) = tokio::fs::remove_dir_all(&db_path).await {
             log::warn!(
                 "Failed to remove SurrealDB directory for perspective {}: {}",
                 uuid,
                 e
             );
         } else {
             log::debug!("Cleaned up SurrealDB directory for perspective {}", uuid);
         }
     }
 }
```
84-90: Inconsistent error handling: SurrealDB creation panics, while migration failures only log. At line 90, SurrealDB creation failure causes a panic (aborting the spawned task), while migration failures at lines 100-105 are logged and execution continues. This inconsistency means:
- A RocksDB lock failure → panic, perspective silently doesn't initialize
- A migration failure → logged, perspective initializes anyway
Consider using consistent error handling:
♻️ Proposed fix for consistent error handling
- let surreal_service = crate::surreal_service::SurrealDBService::new( + let surreal_service = match crate::surreal_service::SurrealDBService::new( "ad4m", &handle_clone.uuid, data_path.as_deref(), ) .await - .expect("Failed to create SurrealDB service for perspective"); + { + Ok(service) => service, + Err(e) => { + log::error!( + "Failed to create SurrealDB service for perspective {}: {}", + handle_clone.uuid, + e + ); + return; + } + };rust-executor/src/surreal_service/mod.rs (5)
578-609: Consider avoiding node creation on remove. The `remove_link` method calls `ensure_node` for both source and target, which will create nodes if they don't exist. While harmless, this is slightly wasteful when deleting a non-existent link. You could compute the node IDs without creating them:
♻️ Optional optimization
```diff
 pub async fn remove_link(
     &self,
     _perspective_uuid: &str,
     link: &DecoratedLinkExpression,
 ) -> Result<(), Error> {
-    // Get node IDs for source and target
-    let source_id = self.ensure_node(&link.data.source).await?;
-    let target_id = self.ensure_node(&link.data.target).await?;
+    // Compute node IDs without creating nodes
+    use sha2::{Digest, Sha256};
+
+    let mut hasher = Sha256::new();
+    hasher.update(link.data.source.as_bytes());
+    let source_hash = format!("{:x}", hasher.finalize());
+    let source_id = format!("node:{}", &source_hash[..16]);
+
+    let mut hasher = Sha256::new();
+    hasher.update(link.data.target.as_bytes());
+    let target_hash = format!("{:x}", hasher.finalize());
+    let target_id = format!("node:{}", &target_hash[..16]);
```
807-836: Confusing variable naming and redundant assignments. The variable `query` is assigned a `Response` (not a query string), then reassigned through `results` and `response`. This obscures the code flow.
♻️ Suggested simplification

```diff
 let query = if let (Some(author_str), Some(timestamp_str)) = (author, timestamp) {
     // Full unique constraint lookup (all 5 fields)
     let author_owned = author_str.to_string();
     let timestamp_owned = timestamp_str.to_string();
-    let results = self
+    self
         .db
         .query("SELECT * FROM link WHERE source = $source AND target = $target AND predicate = $predicate AND author = $author AND timestamp = $timestamp LIMIT 1")
         .bind(("source", source_owned))
         .bind(("target", target_owned))
         .bind(("predicate", predicate_str))
         .bind(("author", author_owned))
         .bind(("timestamp", timestamp_owned))
-        .await?;
-    results
+        .await?
 } else {
     // Backward compatible lookup (only source/predicate/target)
-    let results = self
+    self
         .db
         .query("SELECT * FROM link WHERE source = $source AND target = $target AND predicate = $predicate LIMIT 1")
         .bind(("source", source_owned))
         .bind(("target", target_owned))
         .bind(("predicate", predicate_str))
-        .await?;
-    results
+        .await?
 };

-let results = query;
-
-let mut response = results;
+let mut response = query;
 let result: SurrealValue = response.take(0)?;
```
863-967: Consider extracting common query result processing. The three methods `get_links_by_source`, `get_links_by_target`, and `get_links_by_predicate` share identical result processing logic (lines 872-888, 909-925, 950-966). This could be extracted into a helper method.

♻️ Optional refactor to reduce duplication

```rust
/// Helper to process query results into a DecoratedLinkExpression vec
async fn process_link_query_results(
    &self,
    mut response: surrealdb::Response,
) -> Result<Vec<DecoratedLinkExpression>, Error> {
    let result: SurrealValue = response.take(0)?;
    let json_string = serde_json::to_string(&result)?;
    let json_value: Value = serde_json::from_str(&json_string)?;
    let unwrapped = unwrap_surreal_json(json_value);
    if let Value::Array(arr) = unwrapped {
        let links: Vec<DecoratedLinkExpression> = arr
            .into_iter()
            .filter_map(|value| serde_json::from_value::<SurrealLink>(value).ok())
            .map(|surreal_link| surreal_link.into())
            .collect();
        return Ok(links);
    }
    Ok(vec![])
}
```
770-775: Silent deserialization failures may hide data issues. The `filter_map` on line 773 silently drops any rows that fail to deserialize as `SurrealLink`. While this provides resilience, it could mask data corruption or schema mismatches. Consider logging a warning when deserialization fails:
🛡️ Optional defensive logging
```diff
 let links: Vec<DecoratedLinkExpression> = results
     .into_iter()
-    .filter_map(|value| serde_json::from_value::<SurrealLink>(value).ok())
+    .filter_map(|value| {
+        match serde_json::from_value::<SurrealLink>(value.clone()) {
+            Ok(link) => Some(link),
+            Err(e) => {
+                log::warn!(
+                    "Failed to deserialize link from SurrealDB: {:?}, value: {:?}",
+                    e,
+                    value
+                );
+                None
+            }
+        }
+    })
     .map(|surreal_link| surreal_link.into())
     .collect();
```
2138-2174: Add an integration test for RocksDB transaction conflict handling. The `test_concurrent_operations` test uses in-memory storage (`data_path: None`), so it doesn't exercise the RocksDB transaction conflict retry logic in `ensure_node` (which implements exponential backoff for transaction conflicts). Consider adding an integration test with a temporary directory to verify that RocksDB concurrent write behavior is handled correctly.
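The retry-with-exponential-backoff behavior this test would cover can be sketched synchronously and dependency-free. `retry_on_conflict` is a hypothetical simplification of `ensure_node`'s retry loop; the real code awaits a tokio sleep between attempts, while here the backoff is only computed.

```rust
/// Minimal synchronous sketch of retry-with-exponential-backoff for
/// transaction conflicts. Returns the number of retries that were needed.
fn retry_on_conflict<F>(mut op: F, max_attempts: u32) -> Result<u32, String>
where
    F: FnMut() -> Result<(), String>,
{
    let mut attempts = 0;
    loop {
        match op() {
            Ok(()) => return Ok(attempts),
            Err(e) if e.contains("conflict") && attempts < max_attempts => {
                attempts += 1;
                // 100ms, 200ms, 400ms, ... (a real impl would sleep here)
                let _backoff_ms = 100u64 * 2u64.pow(attempts - 1);
            }
            // Non-conflict errors (or exhausted attempts) propagate.
            Err(e) => return Err(e),
        }
    }
}

fn main() {
    let mut failures_left = 2;
    let result = retry_on_conflict(
        || {
            if failures_left > 0 {
                failures_left -= 1;
                Err("write conflict".to_string())
            } else {
                Ok(())
            }
        },
        5,
    );
    assert_eq!(result, Ok(2)); // succeeded after two retries
}
```

An integration test against a temporary RocksDB directory would drive this path with genuinely concurrent writers rather than a simulated failure counter.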
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust-executor/src/surreal_service/mod.rs (1)
1655-1657: Test queries reference a non-existent `perspective` field. The query at line 1655 uses `WHERE perspective = $perspective`, but the schema (lines 379-387) does not define a `perspective` field. Each perspective has its own isolated database, so no perspective field is needed or exists. These queries will either error or return empty results. This affects multiple tests:
- Lines 1655, 1674: `test_reload_perspective_data_integrity`
- Lines 1799-1870: graph traversal queries in the same test
- Lines 1902, 1984-2027: `test_reload_perspective_with_large_dataset`

🔧 Suggested fix for test queries

Remove the `perspective = $perspective` filter since each perspective has its own isolated database:

```diff
- let query = "SELECT * FROM link WHERE perspective = $perspective ORDER BY timestamp";
+ let query = "SELECT * FROM link ORDER BY timestamp";
```

Apply similar fixes to all affected queries in these tests.
🤖 Fix all issues with AI agents
In `@rust-executor/src/perspectives/migration.rs`:
- Around line 153-188: The tests mutate the global Ad4mDb singleton via
Ad4mDb::init_global_instance() causing race conditions; change setup() to
acquire and return a serialization guard (e.g., lock a static Mutex or use
once_cell::sync::Lazy<Mutex<()>> and return the MutexGuard) and then in
test_migration_tracking() call let _guard = setup(); at the start (and apply the
same pattern to other tests that touch Ad4mDb) so the global singleton is
accessed serially across tests while holding the guard for the test duration.
In `@rust-executor/src/perspectives/mod.rs`:
- Around line 92-105: The migration error is only logged but the perspective
initialization continues; change the flow in the block calling
migration::migrate_links_from_rusqlite_to_surrealdb so that if it returns Err
you stop initialization and propagate the error (e.g., return Err) instead of
just logging—use the same context (handle_clone.uuid, surreal_service) and
ensure the caller receives the failure so the perspective init is skipped when
migration fails.
🧹 Nitpick comments (1)
rust-executor/src/surreal_service/mod.rs (1)
834-836: Minor: Redundant variable reassignments. These lines perform unnecessary reassignments: the `query` variable is immediately assigned to `results`, then `results` to `response`.

♻️ Suggested simplification

```diff
-let query = if let (Some(author_str), Some(timestamp_str)) = (author, timestamp) {
+let mut response = if let (Some(author_str), Some(timestamp_str)) = (author, timestamp) {
     // Full unique constraint lookup (all 5 fields)
     let author_owned = author_str.to_string();
     let timestamp_owned = timestamp_str.to_string();
     let results = self
         .db
         .query("SELECT * FROM link WHERE source = $source AND target = $target AND predicate = $predicate AND author = $author AND timestamp = $timestamp LIMIT 1")
         .bind(("source", source_owned))
         .bind(("target", target_owned))
         .bind(("predicate", predicate_str))
         .bind(("author", author_owned))
         .bind(("timestamp", timestamp_owned))
         .await?;
     results
 } else {
     // Backward compatible lookup (only source/predicate/target)
     let results = self
         .db
         .query("SELECT * FROM link WHERE source = $source AND target = $target AND predicate = $predicate LIMIT 1")
         .bind(("source", source_owned))
         .bind(("target", target_owned))
         .bind(("predicate", predicate_str))
         .await?;
     results
 };
-let results = query;
-
-let mut response = results;
 let result: SurrealValue = response.take(0)?;
```
```rust
fn setup() {
    Ad4mDb::init_global_instance(":memory:").unwrap();
}

#[tokio::test]
async fn test_migration_tracking() {
    setup();

    let perspective_uuid = "test-migration-uuid";

    // Test that perspective is not migrated initially
    let is_migrated = Ad4mDb::with_global_instance(|db| {
        db.is_perspective_migrated(perspective_uuid)
            .expect("Failed to check migration status")
    });
    assert!(!is_migrated, "Perspective should not be migrated initially");

    // Mark perspective as migrated
    Ad4mDb::with_global_instance(|db| {
        db.mark_perspective_as_migrated(perspective_uuid)
            .expect("Failed to mark perspective as migrated")
    });

    // Test that perspective is now migrated
    let is_migrated = Ad4mDb::with_global_instance(|db| {
        db.is_perspective_migrated(perspective_uuid)
            .expect("Failed to check migration status")
    });
    assert!(is_migrated, "Perspective should be migrated");

    // Test idempotency - marking again should not fail
    Ad4mDb::with_global_instance(|db| {
        db.mark_perspective_as_migrated(perspective_uuid)
            .expect("Failed to mark perspective as migrated (idempotent)")
    });
}
```
🧩 Analysis chain
🌐 Web query:
cargo test parallel execution default behavior
💡 Result:
By default, cargo test runs your Rust tests in parallel (concurrently) using the Rust test harness (libtest). The default number of test threads is based on the machine’s available hardware concurrency (Rust uses available_parallelism). [1]
You can control this parallelism by passing an argument to the test binary:
`cargo test -- --test-threads=1`

or via the environment variable (noted as deprecated in some docs, but still widely used):

`RUST_TEST_THREADS=1 cargo test`

Also note: Cargo's `-j`/`--jobs` only affects build parallelism, not how many threads run tests; test threading is controlled by `--test-threads`. [2]
Sources: Rust test execution options (--test-threads) [1], Cargo cargo test docs on --jobs vs --test-threads [2].
🌐 Web query:
Rust global singleton tests thread safety race conditions
💡 Result:
Rust tests run in parallel by default, so any global singleton / shared global state can cause data races or flaky tests unless it’s explicitly thread-safe (or you force tests to run serially). The Rust Book warns that parallel tests must not depend on shared state, and shows running with one thread via cargo test -- --test-threads=1. [1]
What usually goes wrong
- Using `static mut` + ad-hoc "init flags" is race-prone (two threads can observe "not initialized" and both initialize). A classic fix is to use `std::sync::Once` for one-time init. [5]
- Trying to put `std::cell::OnceCell<T>` in a `static` fails because a `static` must be `Sync`, and `OnceCell` is not thread-safe. [2][4]
Recommended singleton patterns (thread-safe)
1) std::sync::OnceLock (manual/lazy one-time init)
OnceLock<T> is the thread-safe equivalent intended for statics. [3]
Typical pattern:
- `static GLOBAL: OnceLock<T> = OnceLock::new();`
- In code/tests: `GLOBAL.get_or_init(|| make_t())`
This prevents double-initialization races across parallel tests. [3]
2) std::sync::Once (run an init function exactly once)
If you need “call init() once” semantics (often in test setups), Once is the standard tool for that. [5]
If tests still interfere
- Either make the shared global state fully thread-safe and isolated/resettable, or
- Run tests serially: `cargo test -- --test-threads=1` (slower, but avoids parallel interference). [1]
Sources: Rust Book on parallel tests and --test-threads. [1] Rust docs for OnceLock (thread-safe, usable in statics) and OnceCell (non-thread-safe; points to OnceLock). [3][4] Rust forum explanation that static requires Sync and OnceCell isn’t. [2] Stack Overflow example recommending Once to avoid duplicate init races in tests. [5]
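The `OnceLock` pattern from the query result above can be shown in a runnable sketch. `GLOBAL_CONFIG` and `config` are illustrative names; `get_or_init` guarantees the initializer runs exactly once even when called from concurrent threads.

```rust
use std::sync::OnceLock;
use std::thread;

/// Thread-safe lazy singleton: OnceLock is Sync, so it can live in a
/// static, and get_or_init performs one-time initialization even under
/// concurrent access.
static GLOBAL_CONFIG: OnceLock<String> = OnceLock::new();

fn config() -> &'static str {
    GLOBAL_CONFIG.get_or_init(|| "initialized".to_string())
}

fn main() {
    // Four threads race to initialize; all observe the same value.
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| config().to_string()))
        .collect();
    for h in handles {
        assert_eq!(h.join().unwrap(), "initialized");
    }
}
```

This is the standard-library alternative to the `lazy_static!` + `Mutex` guard suggested below for serializing the Ad4mDb tests; the mutex guard is still needed there because `init_global_instance` must be re-run per test, not initialized once.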
Serialize tests that touch the global Ad4mDb singleton.
cargo test runs tests in parallel by default, and Ad4mDb::init_global_instance() mutates a global singleton, causing inter-test races and flaky tests. Serialize access using a mutex guard.
🧪 Suggested guard for test serialization

```diff
 mod tests {
     use super::*;
     use crate::graphql::graphql_types::LinkStatus;
     use crate::types::{ExpressionProof, Link, LinkExpression};
     use chrono::Utc;
+    use lazy_static::lazy_static;
+    use std::sync::{Mutex, MutexGuard};

-    fn setup() {
-        Ad4mDb::init_global_instance(":memory:").unwrap();
-    }
+    lazy_static! {
+        static ref TEST_MUTEX: Mutex<()> = Mutex::new(());
+    }
+
+    fn setup() -> MutexGuard<'static, ()> {
+        let guard = TEST_MUTEX.lock().unwrap();
+        Ad4mDb::init_global_instance(":memory:").unwrap();
+        guard
+    }

     #[tokio::test]
     async fn test_migration_tracking() {
-        setup();
+        let _guard = setup();
```

Apply `let _guard = setup();` to all other tests as well.
🤖 Prompt for AI Agents
In `@rust-executor/src/perspectives/migration.rs` around lines 153 - 188, The
tests mutate the global Ad4mDb singleton via Ad4mDb::init_global_instance()
causing race conditions; change setup() to acquire and return a serialization
guard (e.g., lock a static Mutex or use once_cell::sync::Lazy<Mutex<()>> and
return the MutexGuard) and then in test_migration_tracking() call let _guard =
setup(); at the start (and apply the same pattern to other tests that touch
Ad4mDb) so the global singleton is accessed serially across tests while holding
the guard for the test duration.
Addresses CodeRabbit feedback: simplified variable flow by directly assigning query result to response instead of going through intermediate query and results variables. Co-authored-by: CodeRabbit AI <coderabbit@example.com>
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust-executor/src/surreal_service/mod.rs (1)
619-624: Remove embedded `perspective = $perspective` filters from test queries. Tests at lines 1651, 1670, 1796, 1812, 1828, 1840, 1853, 1898, 1981, 1998, and 2015 embed
`WHERE perspective = $perspective`, but `query_links()` does not bind this variable (the `_perspective_uuid` parameter is unused). These queries will fail with unbound-variable errors at runtime. Update the queries to remove perspective-field filtering, relying instead on the isolated per-perspective database context, as demonstrated in `test_query_without_perspective_binding()` at line 2065.
🤖 Fix all issues with AI agents
In `@rust-executor/src/surreal_service/mod.rs`:
- Around line 27-33: The status parsing in the From<SurrealLink> for
DecoratedLinkExpression implementation drops values when SurrealLink.status is
lowercase; update the match in the from(surreal_link: SurrealLink) function to
do a case-insensitive comparison (e.g., convert s to lowercase or use
eq_ignore_ascii_case) so "Shared"/"shared" and "Local"/"local" both map to
LinkStatus::Shared and LinkStatus::Local respectively, preserving migrated data.
- Around line 794-848: get_link currently accepts Option<&str> for author and
timestamp and has a fallback branch that omits them; change get_link to require
author: &str and timestamp: &str (remove the Option types), remove the
conditional/fallback branch and always run the single SELECT that binds source,
target, predicate, author and timestamp (use the existing full-constraint query
string), and eliminate the author/timestamp cloning branch (remove the tuple
match and related else branch). Update all call sites of get_link to pass the
author and timestamp strings and adjust any tests/mocks accordingly so the
UNIQUE index (in, out, predicate, author, timestamp) is always respected.
🧹 Nitpick comments (2)
rust-executor/src/surreal_service/mod.rs (2)
583-606: Avoid creating nodes on delete to prevent orphan growth.
`remove_link()` calls `ensure_node()`, which will create nodes even when deleting a non-existent link. With persistent RocksDB, this can leave orphan nodes and bloat storage. Consider deleting by `source`/`target` string fields instead of `in`/`out` when no link exists.

♻️ Possible refactor
```diff
-        let source_id = self.ensure_node(&link.data.source).await?;
-        let target_id = self.ensure_node(&link.data.target).await?;
+        let source = link.data.source.clone();
+        let target = link.data.target.clone();
         self.db
             .query(
                 "DELETE FROM link WHERE
-                    in = type::thing($source_id) AND
-                    out = type::thing($target_id) AND
+                    source = $source AND
+                    target = $target AND
                     predicate = $predicate AND
                     author = $author AND
                     timestamp = $timestamp",
             )
-            .bind(("source_id", source_id))
-            .bind(("target_id", target_id))
+            .bind(("source", source))
+            .bind(("target", target))
```
755-963: Deduplicate SurrealDB → DecoratedLinkExpression conversion.
`get_all_links`, `get_links_by_source`, `get_links_by_target`, and `get_links_by_predicate` repeat the same unwrap/deserialize logic. Extract a helper (or reuse `query_links`) to reduce duplication and keep parsing consistent.
1. Remove 'WHERE perspective = $perspective' from test queries
   - Each perspective has an isolated database, no filtering needed
   - Fixed 11 test queries in surreal_service/mod.rs
2. Make status parsing case-insensitive in SurrealLink conversion
   - Now handles 'Shared'/'shared' and 'Local'/'local' correctly
   - Preserves migrated data regardless of case
3. Require author/timestamp in get_link() signature
   - Changed from Option<&str> to &str for both params
   - Removed fallback branch (always use full unique constraint)
   - Updated 4 callsites in perspective_instance.rs
   - Enforces UNIQUE index (in, out, predicate, author, timestamp)

Co-authored-by: CodeRabbit AI <coderabbit@example.com>
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust-executor/src/perspectives/perspective_instance.rs (1)
4174-4216: Persist to SurrealDB before updating Prolog to keep source-of-truth consistent. Right now line 4212 updates Prolog before line 4214 persists. If SurrealDB persistence fails, Prolog/pubsub can diverge from the source of truth. Align this with the other write paths by persisting first.
✅ Suggested fix
```diff
-        self.update_prolog_engines(combined_diff.clone()).await;
-
-        self.persist_link_diff(&combined_diff)
-            .await
-            .expect("SurrealDB link persistence failed - critical synchronization error");
+        self.persist_link_diff(&combined_diff)
+            .await
+            .expect("SurrealDB link persistence failed - critical synchronization error");
+
+        self.update_prolog_engines(combined_diff.clone()).await;
```
🤖 Fix all issues with AI agents
In `@rust-executor/src/surreal_service/mod.rs`:
- Around line 782-804: The docs for get_link are stale: they claim author and
timestamp are optional but the signature accepts &str; update the comment for
the get_link method to remove/replace wording about optional author/timestamp
and the behavior that they are matched only when present, and instead clearly
state that author: &str and timestamp: &str are required string parameters that
must be provided and are used in the query; reference the get_link function and
its parameters (source, predicate, target, author, timestamp) when editing the
docblock.
🧹 Nitpick comments (1)
rust-executor/src/surreal_service/mod.rs (1)
847-951: Consider extracting common result processing into a helper. The `get_links_by_source`, `get_links_by_target`, and `get_links_by_predicate` methods share identical result processing logic (JSON conversion, unwrapping, deserialization to `SurrealLink`, and mapping to `DecoratedLinkExpression`).

♻️ Example helper extraction

```rust
/// Helper to convert SurrealDB query results to DecoratedLinkExpression
async fn process_link_results(
    &self,
    results: surrealdb::Response,
) -> Result<Vec<DecoratedLinkExpression>, Error> {
    let mut response = results;
    let result: SurrealValue = response.take(0)?;
    let json_string = serde_json::to_string(&result)?;
    let json_value: Value = serde_json::from_str(&json_string)?;
    let unwrapped = unwrap_surreal_json(json_value);
    if let Value::Array(arr) = unwrapped {
        let links: Vec<DecoratedLinkExpression> = arr
            .into_iter()
            .filter_map(|value| serde_json::from_value::<SurrealLink>(value).ok())
            .map(|surreal_link| surreal_link.into())
            .collect();
        return Ok(links);
    }
    Ok(vec![])
}
```

Then each `get_links_by_*` method becomes:

```rust
pub async fn get_links_by_source(
    &self,
    _perspective_uuid: &str,
    source: &str,
) -> Result<Vec<DecoratedLinkExpression>, Error> {
    let results = self
        .db
        .query("SELECT * FROM link WHERE source = $source")
        .bind(("source", source.to_string()))
        .await?;
    self.process_link_results(results).await
}
```
```rust
/// Get a specific link by its unique constraint fields
///
/// # Arguments
/// * `_perspective_uuid` - UUID of the perspective (unused, for API consistency)
/// * `source` - Source address of the link
/// * `predicate` - Optional predicate of the link (defaults to empty string if None)
/// * `target` - Target address of the link
/// * `author` - Optional author DID to match (if None, only matches source/predicate/target)
/// * `timestamp` - Optional timestamp to match (if None, only matches source/predicate/target)
///
/// # Returns
/// * `Ok(Some(DecoratedLinkExpression))` - The matching link if found
/// * `Ok(None)` - No matching link found
/// * `Err(Error)` - Database query error
pub async fn get_link(
    &self,
    _perspective_uuid: &str,
    source: &str,
    predicate: Option<&str>,
    target: &str,
    author: &str,
    timestamp: &str,
) -> Result<Option<DecoratedLinkExpression>, Error> {
```
There was a problem hiding this comment.
Fix stale documentation for get_link parameters.
The doc comments describe `author` and `timestamp` as optional, but the actual function signature requires them as `&str`. This documentation is outdated from when these parameters were `Option<&str>`.
📝 Proposed fix
- /// * `author` - Optional author DID to match (if None, only matches source/predicate/target)
- /// * `timestamp` - Optional timestamp to match (if None, only matches source/predicate/target)
+ /// * `author` - Author DID of the link
+ /// * `timestamp` - Timestamp of the link

🤖 Prompt for AI Agents
In `@rust-executor/src/surreal_service/mod.rs` around lines 782 - 804, The docs
for get_link are stale: they claim author and timestamp are optional but the
signature accepts &str; update the comment for the get_link method to
remove/replace wording about optional author/timestamp and the behavior that
they are matched only when present, and instead clearly state that author: &str
and timestamp: &str are required string parameters that must be provided and are
used in the query; reference the get_link function and its parameters (source,
predicate, target, author, timestamp) when editing the docblock.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust-executor/src/perspectives/perspective_instance.rs (1)
4207-4227: Persist to SurrealDB before updating Prolog/pubsub to keep source-of-truth consistent. If persistence fails, Prolog and subscribers may observe changes that never reach storage. Persist first, then update derived systems.
🔧 Suggested ordering
```diff
-        // Update both Prolog engines: subscription (immediate) + query (lazy)
-        self.update_prolog_engines(combined_diff.clone()).await;
-
-        self.persist_link_diff(&combined_diff)
-            .await
-            .expect("SurrealDB link persistence failed - critical synchronization error");
+        // Persist first (source of truth), then update derived systems
+        self.persist_link_diff(&combined_diff).await?;
+        self.update_prolog_engines(combined_diff.clone()).await;
```
🤖 Fix all issues with AI agents
In `@rust-executor/src/perspectives/mod.rs`:
- Around line 153-158: Replace the panic-causing .expect() on
SurrealDBService::new with propagated error handling to match the function's
Result return style: call crate::surreal_service::SurrealDBService::new("ad4m",
&handle.uuid, data_path.as_deref()).await and map any error into a String (e.g.
.map_err(|e| format!("Failed to create SurrealDB service for perspective: {}",
e))?) so the error is returned instead of panicking; this mirrors how the
subsequent migration call propagates errors.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@rust-executor/src/perspectives/mod.rs`:
- Around line 84-106: Replace the panicking .expect() on SurrealDBService::new
with graceful error handling consistent with the migration block: call
SurrealDBService::new(...).await and if it returns Err, log an error including
the perspective id (handle_clone.uuid) and the error, then return early from the
spawned task (same behavior as
migration::migrate_links_from_rusqlite_to_surrealdb handling); mirror the
logging style used in the migration block so failures are surfaced instead of
panicking (see add_perspective() for propagation style).
Complements commit e57b8f8 which fixed the same issue in add_perspective(). This fix addresses the spawned task in initialize_from_db() which also had a panicking .expect() call.

Changes:
- Replace .expect() with match expression
- Log error and return early on failure
- Prevents panic if SurrealDB creation fails (RocksDB lock, disk, permissions)

Addresses CodeRabbit feedback on PR #652 (line 90 issue)
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust-executor/src/perspectives/perspective_instance.rs (1)
4190-4233: ⚠️ Potential issue | 🟠 Major: Persist before Prolog/pubsub to avoid partial updates.
Right now Prolog updates fire before the DB write; if persistence fails, subscribers can observe state that isn’t durable. Persist first, then update Prolog/pubsub.
🛠️ Proposed fix
```diff
-        self.update_prolog_engines(combined_diff.clone()).await;
-
-        self.persist_link_diff(&combined_diff)
-            .await
-            .expect("SurrealDB link persistence failed - critical synchronization error");
+        self.persist_link_diff(&combined_diff).await?;
+        self.update_prolog_engines(combined_diff.clone()).await;
```
🤖 Fix all issues with AI agents
In `@rust-executor/Cargo.toml`:
- Line 137: The Cargo.toml dependency enables SurrealDB with the kv-rocksdb
feature which requires a C/C++11-capable compiler and LLVM/clang for bindgen;
update CI and developer docs to ensure those toolchains are installed on runners
(or switch CI to use the kv-mem feature), and add guidance to set LIBCLANG_PATH
or LLVM_CONFIG_PATH when autodetection fails; reference the surrealdb dependency
line and the kv-rocksdb feature in Cargo.toml and ensure CI job images/install
steps include a C/C++ compiler and LLVM toolchain so cargo build/bindgen
succeed.
In `@rust-executor/src/perspectives/perspective_instance.rs`:
- Around line 912-949: The code currently calls
self.persist_link_diff(...).await.expect(...) which panics on SurrealDB write
failures; replace the expect() with proper error propagation (use the ? operator
or map_err to convert to the function's Result error type) so the caller can
handle/retry, updating the enclosing async function's signature/return type if
needed; apply the same change to all other calls to persist_link_diff in
PerspectiveInstance (and related methods) so failures are propagated instead of
panicking (see methods using persist_link_diff, update_prolog_engines,
pubsub_publish_diff, and spawn_commit_and_handle_error in
perspective_instance.rs).
In `@rust-executor/src/surreal_service/mod.rs`:
- Around line 773-777: In get_all_links (and similarly in get_link and all
get_links_by_* methods) change the silent filter_map(|value|
serde_json::from_value::<SurrealLink>(value).ok()) behavior so deserialization
failures are not dropped silently: attempt serde_json::from_value on each
result, if Ok(converted) push converted.into() into links, and if Err(e) emit a
warning-level log that includes the error (e) and the offending value (or an
identifying id) so schema drift/corruption is visible; keep collecting
successful conversions into the final Vec<DecoratedLinkExpression>. Ensure you
reference the SurrealLink type, the results iterator, and the
DecoratedLinkExpression conversion logic when making this change so it is
applied consistently across get_all_links, get_link, and the get_links_by_*
methods.
🧹 Nitpick comments (3)
rust-executor/src/surreal_service/mod.rs (3)
584-588: `remove_link` creates orphan nodes as a side effect via `ensure_node`.
`ensure_node` creates a node record when one doesn't exist. Calling it during link removal means that removing a link whose source/target nodes were already cleaned up (or never existed) will create new orphan node records. Since the node ID is a deterministic SHA-256 hash of the URI, you can compute the ID directly without the create-if-absent behaviour.

Proposed helper to compute node ID without side effects
```diff
+    /// Compute the deterministic node ID for a URI without creating it in the DB.
+    fn node_id_for_uri(uri: &str) -> String {
+        use sha2::{Digest, Sha256};
+        let mut hasher = Sha256::new();
+        hasher.update(uri.as_bytes());
+        let hash = format!("{:x}", hasher.finalize());
+        format!("node:{}", &hash[..16])
+    }
+
     pub async fn remove_link(
         &self,
         _perspective_uuid: &str,
         link: &DecoratedLinkExpression,
     ) -> Result<(), Error> {
-        let source_id = self.ensure_node(&link.data.source).await?;
-        let target_id = self.ensure_node(&link.data.target).await?;
+        let source_id = Self::node_id_for_uri(&link.data.source);
+        let target_id = Self::node_id_for_uri(&link.data.target);
```

This also avoids unnecessary write retries and DB round-trips on a removal path.
765-780: `get_all_links` routes through `query_links` (with timeout/logging overhead) while the other `get_links_by_*` methods query `db` directly.
This is inconsistent: `get_all_links` pays the cost of a 60-second timeout wrapper, logging-task spawn, and readonly validation for a hardcoded `SELECT *`, while the sibling methods (`get_links_by_source`, etc.) skip all of that. Consider aligning the approach: either all helpers go direct, or all go through `query_links`.
847-951: Significant duplication across `get_links_by_source`, `get_links_by_target`, and `get_links_by_predicate`.
These three methods (and `get_link`) share identical boilerplate: `db.query` → `response.take(0)` → JSON round-trip → `unwrap_surreal_json` → `filter_map` deserialize. Consider extracting a private helper like:

```rust
async fn query_as_links(
    &self,
    query: &str,
    bindings: Vec<(&str, String)>,
) -> Result<Vec<DecoratedLinkExpression>, Error> { ... }
```

This would reduce ~100 lines of duplicated conversion logic to single call sites.
```diff
 deno_error = "0.5.6"
 thiserror = "2.0.12"
-surrealdb = { version = "2.4", default-features = false, features = ["kv-mem", "scripting"] }
+surrealdb = { version = "2.4", default-features = false, features = ["kv-rocksdb", "kv-mem", "scripting"] }
```
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
SurrealDB 2.4 kv-rocksdb feature RocksDB build requirements
💡 Result:
For SurrealDB 2.4.x, enabling the RocksDB storage engine is done via the Cargo feature kv-rocksdb (e.g. cargo build --features surrealdb/kv-rocksdb). [1]
What you need installed to build with kv-rocksdb
Because this feature compiles native RocksDB bindings, you generally need:
- A C/C++ build toolchain (compiler + linker; RocksDB is C++). RocksDB requires a C++11-capable compiler (e.g., GCC/Clang). [5]
- Clang/LLVM (libclang) available for bindgen (the RocksDB Rust bindings require this). [3][4]
- On macOS/Linux you may need to set `LIBCLANG_PATH`/`LLVM_CONFIG_PATH` if autodetection fails (common when compiling `librocksdb-sys`). [6]
- (Platform packaging hint) Some distributions/ports express this as needing `llvm-config` at build time and linking against `librocksdb` (and often `zstd`). [2]
Optional / common native deps
- RocksDB can be built with optional compression libraries such as snappy, lz4, zstd, zlib, bzip2. Whether they’re needed depends on how the Rust RocksDB wrapper is configured/features enabled, but they’re common native deps in RocksDB builds. [5]
Notes
- If you enable RocksDB LTO in the RocksDB wrapper, it specifically requires building with clang and matching LLVM versions/flags. [3]
Sources: SurrealDB blog/docs and crate docs for enabling kv-rocksdb and bindgen/libclang requirements, plus RocksDB install notes and packaging metadata. [1][2][3][4][5][6]
🌐 Web query:
SurrealDB cargo features kv-rocksdb kv-mem scripting compatibility 2.4
💡 Result:
For SurrealDB Rust crates in the 2.4.x line, the Cargo feature flags you mentioned are present and compatible:
- `kv-mem`: in-memory storage engine (and the default for `surrealdb-core` 2.4.1). (docs.rs)
- `kv-rocksdb`: RocksDB-backed storage engine. (Enabling embedded engines requires turning on the corresponding `kv-*` feature; RocksDB also brings native build deps.) (docs.rs)
- `scripting`: enables SurrealDB's scripting support (wired to a `js` dependency in `surrealdb-core`). (docs.rs)
In the top-level surrealdb crate, these features map through to surrealdb-core (e.g., scripting = ["surrealdb-core/scripting"], kv-rocksdb = ["surrealdb-core/kv-rocksdb", ...]). (docs.rs)
Example Cargo.toml (2.4.x) enabling all three:
```toml
[dependencies]
surrealdb = { version = "2.4", default-features = false, features = ["kv-mem", "kv-rocksdb", "scripting"] }
```

Citations:
- 1: https://docs.rs/crate/surrealdb-core/2.4.1/source/Cargo.toml.orig?utm_source=openai
- 2: https://docs.rs/crate/surrealdb-core/2.4.1/source/Cargo.toml.orig?utm_source=openai
- 3: https://docs.rs/crate/surrealdb-core/2.4.1/source/Cargo.toml.orig?utm_source=openai
- 4: https://docs.rs/crate/surrealdb/latest/source/Cargo.toml.orig?utm_source=openai
Ensure CI has C/C++ compiler and LLVM/clang for building with kv-rocksdb.
The feature combination is supported in SurrealDB 2.4, but enabling kv-rocksdb requires native build dependencies: a C++11-capable C/C++ compiler and LLVM/clang (for bindgen to compile RocksDB bindings). Verify that CI runners and local build environments have these toolchains available; you may also need to set LIBCLANG_PATH or LLVM_CONFIG_PATH if autodetection fails.
🤖 Prompt for AI Agents
In `@rust-executor/Cargo.toml` at line 137, The Cargo.toml dependency enables
SurrealDB with the kv-rocksdb feature which requires a C/C++11-capable compiler
and LLVM/clang for bindgen; update CI and developer docs to ensure those
toolchains are installed on runners (or switch CI to use the kv-mem feature),
and add guidance to set LIBCLANG_PATH or LLVM_CONFIG_PATH when autodetection
fails; reference the surrealdb dependency line and the kv-rocksdb feature in
Cargo.toml and ensure CI job images/install steps include a C/C++ compiler and
LLVM toolchain so cargo build/bindgen succeed.
```diff
         // Query SurrealDB instead of Rusqlite
         if let Some(decorated_link) = self
             .surreal_service
             .get_link(
                 &handle.uuid,
                 &link_expression.data.source,
                 link_expression.data.predicate.as_deref(),
                 &link_expression.data.target,
                 &link_expression.author,
                 &link_expression.timestamp,
             )
             .await?
         {
             Ad4mDb::with_global_instance(|db| db.remove_link(&handle.uuid, &link_expression))?;
             let link_from_db = LinkExpression::from(decorated_link.clone());
             let status = decorated_link.status.clone().unwrap_or(LinkStatus::Local);

             let diff = PerspectiveDiff::from_removals(vec![link_expression.clone()]);
-            let decorated_link = DecoratedLinkExpression::from((link_from_db, status.clone()));
+            let decorated_link_result =
+                DecoratedLinkExpression::from((link_from_db, status.clone()));
             let decorated_diff =
-                DecoratedPerspectiveDiff::from_removals(vec![decorated_link.clone()]);
+                DecoratedPerspectiveDiff::from_removals(vec![decorated_link_result.clone()]);

             // Remove from SurrealDB (primary storage)
             self.persist_link_diff(&decorated_diff)
                 .await
                 .expect("SurrealDB link persistence failed - critical synchronization error");

             // Update both Prolog engines: subscription (immediate) + query (lazy)
             self.update_prolog_engines(decorated_diff.clone()).await;

             self.update_surreal_cache(&decorated_diff).await;
             self.pubsub_publish_diff(decorated_diff.clone()).await;

             if status == LinkStatus::Shared {
                 self.spawn_commit_and_handle_error(&diff);
             }

-            Ok(decorated_link)
+            Ok(decorated_link_result)
         } else {
```
There was a problem hiding this comment.
Avoid panicking on SurrealDB write failure.
This path is user-facing; expect() will crash the executor on transient IO/lock errors. Prefer propagating the error so callers can handle/retry, and apply the same change in other persist_link_diff call sites.
🛠️ Proposed fix
- self.persist_link_diff(&decorated_diff)
- .await
- .expect("SurrealDB link persistence failed - critical synchronization error");
+ self.persist_link_diff(&decorated_diff).await?;🤖 Prompt for AI Agents
In `@rust-executor/src/perspectives/perspective_instance.rs` around lines 912 -
949, The code currently calls self.persist_link_diff(...).await.expect(...)
which panics on SurrealDB write failures; replace the expect() with proper error
propagation (use the ? operator or map_err to convert to the function's Result
error type) so the caller can handle/retry, updating the enclosing async
function's signature/return type if needed; apply the same change to all other
calls to persist_link_diff in PerspectiveInstance (and related methods) so
failures are propagated instead of panicking (see methods using
persist_link_diff, update_prolog_engines, pubsub_publish_diff, and
spawn_commit_and_handle_error in perspective_instance.rs).
```rust
let links: Vec<DecoratedLinkExpression> = results
    .into_iter()
    .filter_map(|value| serde_json::from_value::<SurrealLink>(value).ok())
    .map(|surreal_link| surreal_link.into())
    .collect();
```
There was a problem hiding this comment.
Silent deserialization failures are dropped via filter_map(|v| ... .ok()).
This pattern appears in get_all_links (line 775), get_link (line 829), and all get_links_by_* methods. If a stored row can't be deserialized into SurrealLink (e.g., schema drift, corrupted data from migration), it's silently excluded from results with no log or error. Consider logging a warning on deserialization failure so data issues don't go unnoticed.
Example fix for get_all_links (apply similarly elsewhere)
let links: Vec<DecoratedLinkExpression> = results
.into_iter()
- .filter_map(|value| serde_json::from_value::<SurrealLink>(value).ok())
+ .filter_map(|value| {
+ match serde_json::from_value::<SurrealLink>(value.clone()) {
+ Ok(link) => Some(link),
+ Err(e) => {
+ log::warn!("Failed to deserialize SurrealLink: {:?} from value: {:?}", e, value);
+ None
+ }
+ }
+ })
.map(|surreal_link| surreal_link.into())
.collect();📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```rust
let links: Vec<DecoratedLinkExpression> = results
    .into_iter()
    .filter_map(|value| {
        match serde_json::from_value::<SurrealLink>(value.clone()) {
            Ok(link) => Some(link),
            Err(e) => {
                log::warn!("Failed to deserialize SurrealLink: {:?} from value: {:?}", e, value);
                None
            }
        }
    })
    .map(|surreal_link| surreal_link.into())
    .collect();
```
🤖 Prompt for AI Agents
In `@rust-executor/src/surreal_service/mod.rs` around lines 773 - 777, In
get_all_links (and similarly in get_link and all get_links_by_* methods) change
the silent filter_map(|value| serde_json::from_value::<SurrealLink>(value).ok())
behavior so deserialization failures are not dropped silently: attempt
serde_json::from_value on each result, if Ok(converted) push converted.into()
into links, and if Err(e) emit a warning-level log that includes the error (e)
and the offending value (or an identifying id) so schema drift/corruption is
visible; keep collecting successful conversions into the final
Vec<DecoratedLinkExpression>. Ensure you reference the SurrealLink type, the
results iterator, and the DecoratedLinkExpression conversion logic when making
this change so it is applied consistently across get_all_links, get_link, and
the get_links_by_* methods.
Summary
This PR migrates perspective link storage from in-memory SurrealDB to persistent file-based SurrealDB databases using RocksDB. Each perspective now has its own dedicated database file, eliminating data duplication and providing persistence across restarts.
Motivation
Previous Architecture:
New Architecture:
`{app_data_path}/surrealdb_perspectives/{perspective_uuid}/`

Key Changes
Storage Layer (`src/surreal_service/mod.rs`)
- `Mem` engine to `RocksDb` engine with file-based storage
- `add_link()` idempotent (handles concurrent writes gracefully)
- `remove_link()` precision (includes author + timestamp)

Perspective Management (`src/perspectives/mod.rs`, `perspective_instance.rs`)
- `get_links_local()`, `add_link()`, `remove_link()` to use SurrealDB

Migration System (
`src/perspectives/migration.rs` ⭐ NEW)
Self-contained one-time migration module:
Database Support (`src/db.rs`)
Added migration tracking:
- `perspective_link_migration`
- `is_perspective_migrated()`, `mark_perspective_as_migrated()`, `delete_all_links_for_perspective()`

On first startup after upgrade:
DB File Structure
{app_data_path}/
└── surrealdb_perspectives/
├── {perspective-uuid-1}/ # RocksDB database files
├── {perspective-uuid-2}/
└── {perspective-uuid-3}/
Future Cleanup
The migration code (src/perspectives/migration.rs) can be removed after all users have upgraded (recommended: 2-3 releases from now). See MIGRATION_REMOVAL_GUIDE.md for removal instructions.
Summary by CodeRabbit
New Features
Bug Fixes
Documentation