Skip to content

Migrate Notification System from Prolog to SurrealDB#648

Merged
lucksus merged 10 commits intodevfrom
surreal-notifications
Jan 23, 2026
Merged

Migrate Notification System from Prolog to SurrealDB#648
lucksus merged 10 commits intodevfrom
surreal-notifications

Conversation

@lucksus
Copy link
Member

@lucksus lucksus commented Jan 22, 2026

This PR migrates the notification trigger system from Prolog queries to SurrealQL queries, enabling notifications to work in SdnaOnly mode (which saves RAM by not loading Prolog facts).

Why This Change?

Previously, notifications relied on Prolog queries that analyzed facts loaded into memory. When ad4m switched to SdnaOnly mode to reduce memory usage, the notification system broke because Prolog facts weren't available. This PR replaces Prolog with SurrealQL queries that run directly against the SurrealDB instances where perspective data is stored.

Key Changes

1. Added Custom SurrealDB Functions

Added four custom functions to enable complex notification queries:

  • fn::parse_literal($url) - Parses literal URLs (e.g., literal://string:Hello%20World"Hello World")
  • fn::json_path($obj, $path) - Navigate JSON objects by dot-separated path (e.g., fn::json_path(data, 'user.name'))
  • fn::contains($str, $substring) - String contains check
  • fn::strip_html($html) - Remove HTML tags using regex
  • fn::regex_match($str, $pattern) - Regex pattern matching

Location: /rust-executor/src/surreal_service/mod.rs:282-402

2. Context Variable Injection

Notification queries can now use two context variables that are automatically injected:

  • $agentDid - The agent's DID (e.g., did:key:z6Mks...)
  • $perspectiveId - The perspective UUID being queried

These are injected via string replacement before query execution.

Location: /rust-executor/src/perspectives/perspective_instance.rs:2301-2338

3. Query Validation

Added validation at notification creation/update time to ensure queries are safe:

  • Must start with SELECT/RETURN/LET/WITH
  • Cannot contain mutating operations (INSERT, UPDATE, DELETE, etc.)
  • Must have balanced parentheses
  • Length limit of 10,000 characters

Location: /rust-executor/src/db.rs:456-542

4. Changed Internal Types

  • Notification matches now return Vec<serde_json::Value> instead of Vec<QueryMatch>
  • Removed Prolog-specific types from notification code
  • Updated all notification matching methods to work with JSON

Location: /rust-executor/src/perspectives/perspective_instance.rs:2524-2633

Migration Guide: Flux Notification Example

Before (Prolog):

agent_did(Did),
subject_class("Message", C),
instance(C, Base),
property_getter(C, Base, "body", Body),
literal_from_url(Body, JsonString, _),
json_property(JsonString, "data", MessageContent),
string_includes(MessageContent, MentionString),
remove_html_tags(MessageContent, Description)

After (SurrealQL):

SELECT
    in.uri as message_id,
    fn::parse_literal(out.uri) as body_literal,
    fn::json_path(fn::parse_literal(out.uri), 'data') as message_content,
    fn::strip_html(
        fn::json_path(fn::parse_literal(out.uri), 'data')
    ) as description,
    $agentDid as mentioned_agent
FROM link
WHERE predicate = 'msg://body'
    AND fn::contains(
        fn::json_path(fn::parse_literal(out.uri), 'data'),
        'data-type="mention" href="' + $agentDid + '"'
    )

Key Differences:

  1. Standard SQL syntax - Use SELECT ... FROM link WHERE ... instead of Prolog predicates
  2. Explicit field selection - Name each field you want in the result using as alias
  3. Function calls - Use fn::function_name() for custom functions
  4. Context variables - Use $agentDid and $perspectiveId directly in queries
  5. Link graph traversal - Use in.uri for link source, out.uri for link target

Extracting Multiple Data Points

You can select multiple computed fields in one query:

SELECT
    source as message_id,
    fn::parse_literal(target) as message_content,
    fn::strip_html(fn::parse_literal(target)) as plain_text,
    $agentDid as mentioned_agent,
    $perspectiveId as perspective_id
FROM link
WHERE predicate = 'rdf://content'
    AND fn::contains(fn::parse_literal(target), $agentDid)

Each match object will contain all these fields:

{
  "message_id": "message://2",
  "message_content": "<p>Hey <strong>did:key:z6Mks...</strong>, how are you?</p>",
  "plain_text": "Hey did:key:z6Mks..., how are you?",
  "mentioned_agent": "did:key:z6Mks...",
  "perspective_id": "3b7104cb-cec0-4d05-a968-c566936dd289"
}

Breaking Changes

  • Notification trigger format: Apps using notifications must update their trigger queries from Prolog to SurrealQL
  • Match result structure: The triggerMatch field now contains JSON objects with explicitly selected fields instead of Prolog variable bindings

Known Limitations

Multi-user support will be addressed in a follow-up PR to ensure notifications work correctly with managed users and per-user agent DIDs.

Summary by CodeRabbit

  • New Features

    • Notifications can use SELECT-style query triggers and now return JSON-formatted trigger results.
    • SurrealDB helper functions added: HTML stripping, JSON path extraction, substring and regex checks.
  • Bug Fixes / Refactor

    • Validation added for notification triggers on create/update to reject unsafe or malformed queries.
    • Internal notification matches normalized to JSON with consistent lower-case fields.
  • Tests

    • Notification tests updated, re-enabled, and extended to cover new trigger formats, string-literal handling, and richer match extraction.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 22, 2026

📝 Walkthrough

Walkthrough

Adds validation for notification SurrealQL in the DB, integrates a context-aware SurrealQL executor returning JSON in perspective instance, introduces four SurrealDB scripting helpers, converts notification match flows to JSON values, and updates JS tests to use SELECT-based triggers and new JSON shapes.

Changes

Cohort / File(s) Summary
Query Validation
rust-executor/src/db.rs
Adds private validate_notification_query enforcing non-empty, max length 10000, allowed starters (SELECT, RETURN, LET, WITH), rejects standalone mutating tokens (INSERT, UPDATE, DELETE, CREATE, DROP, REMOVE, DEFINE, ALTER, RELATE, BEGIN, COMMIT, CANCEL), checks balanced parentheses with basic string-literal awareness; called from add_notification and update_notification. Note: helper duplicated in two locations.
Perspective SurrealQL & JSON Notification Flow
rust-executor/src/perspectives/perspective_instance.rs
Adds pub async fn surreal_query_notification(query: String) -> Result<Vec<serde_json::Value>, AnyError> which injects $agentDid and $perspectiveId and executes via SurrealDB service; refactors notification match pipelines, caches, snapshots, subtraction, and publish paths to use Vec<serde_json::Value> (JSON) instead of typed QueryMatch, and emits JSON-string trigger_match in published notifications.
SurrealDB Utility Functions
rust-executor/src/surreal_service/mod.rs
Adds four DEFINE FUNCTION IF NOT EXISTS SurrealDB functions in init: strip_html(html), json_path(obj, path), contains(str, substring), and regex_match(str, pattern).
Tests (JS runtime)
tests/js/tests/runtime.ts
Replaces literal triggers with SELECT queries, updates assertions for conditional checks and lower-cased JSON fields (source/target), re-enables notification trigger test, and adds Flux-style tests validating multiple extracted fields (mentions, HTML handling, agent/perspective IDs).

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant DB as Database Layer
    participant Validator as Query Validator
    participant SurrealDB

    Client->>DB: add_notification(notification)
    DB->>Validator: validate_notification_query(trigger)
    alt Query Valid
        Validator-->>DB: Ok(())
        DB->>SurrealDB: INSERT notification
        SurrealDB-->>DB: Success
        DB-->>Client: Notification Added
    else Query Invalid
        Validator-->>DB: Err(error_message)
        DB-->>Client: InvalidParameterName Error
    end
Loading
sequenceDiagram
    participant Client
    participant Perspective as PerspectiveInstance
    participant Surreal as SurrealDB Service

    Client->>Perspective: surreal_query_notification(query)
    Perspective->>Perspective: Inject $agentDid (rgba(0,128,0,0.5))
    Perspective->>Perspective: Inject $perspectiveId (rgba(0,128,255,0.5))
    Perspective->>Surreal: Execute(enriched_query)
    Surreal-->>Perspective: JSON results (Vec<serde_json::Value>)
    Perspective->>Perspective: Log and normalize
    Perspective-->>Client: Vec<serde_json::Value>
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐰 I nibble queries, tidy and neat,
I tuck in context, make JSON complete,
I strip the tags and follow the path,
I hop through tests and dodge any wrath,
A little rabbit cheering code with a beat ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 76.47% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Migrate Notification System from Prolog to SurrealDB' accurately reflects the main objective of this pull request, which involves a systematic migration of the notification trigger system from Prolog-based queries to SurrealQL against SurrealDB instances.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
rust-executor/src/db.rs (1)

2136-2151: Imported notifications bypass query validation.

The import_from_json method inserts notification triggers directly without calling validate_notification_query. A malicious or corrupted import file could inject mutating queries that bypass the security checks added in add_notification and update_notification.

🔒 Proposed fix
                     for notification in notifications {
                         let id = notification
                             .get("id")
                             .and_then(|id| id.as_str())
                             .unwrap_or("<unknown>");
+                        
+                        // Validate trigger query before importing
+                        let trigger = notification["trigger"].as_str().unwrap_or("");
+                        if let Err(e) = Self::validate_notification_query(trigger) {
+                            result.notifications.failed += 1;
+                            result.notifications.errors.push(format!(
+                                "Invalid trigger query in notification {}: {}", id, e
+                            ));
+                            log::warn!("Skipping notification {} with invalid trigger: {}", id, e);
+                            continue;
+                        }
+
                         match self.conn.execute(
tests/js/tests/runtime.ts (1)

261-261: Missing await before sleep(1000).

The sleep(1000) call on line 261 is missing await, meaning the code won't actually wait before calling grantNotification. This could potentially cause timing issues if the notification hasn't been fully registered.

Proposed fix
             // Request to install a new notification
             const notificationId = await ad4mClient.runtime.requestInstallNotification(notification);
-            sleep(1000)
+            await sleep(1000)
             // Grant the notification
             const granted = await ad4mClient.runtime.grantNotification(notificationId)
🤖 Fix all issues with AI agents
In `@rust-executor/src/db.rs`:
- Around line 525-539: The current parentheses balance check (using paren_count
and iterating over query_trimmed.chars()) treats parentheses inside string
literals as real parentheses and returns false positives; update the loop to
skip counting parentheses when inside string literals by tracking an in_string
flag (handle both single and double quotes), respect escaped quotes (ignore
quote toggles when preceded by backslash), and only increment/decrement
paren_count when not in_string; adjust logic around quote handling in the same
block so queries like "SELECT ... = '(test)'" do not fail.

In `@rust-executor/src/surreal_service/mod.rs`:
- Around line 378-393: The fn::regex_match function constructs RegExp from
user-supplied patterns which can enable ReDoS; add explicit validation before
new RegExp to guard against malicious patterns: in the fn::regex_match
implementation, check pattern length (e.g., max 100–200 chars), reject or
sanitize patterns containing known catastrophic constructs (e.g.,
nested/adjacent quantifiers like (a+)+, backtracking groups, excessive nested
parentheses or lookaround combos) and optionally maintain a whitelist of allowed
simple patterns; if validation fails, return false (or a controlled error)
instead of attempting new RegExp, retaining the existing try/catch as a
secondary safeguard.
🧹 Nitpick comments (3)
rust-executor/src/perspectives/perspective_instance.rs (2)

2314-2319: Harden placeholder substitution to avoid accidental replacements.

Blind .replace() can also hit $agentDid inside quoted strings or longer identifiers, producing hard-to-debug queries. Consider normalizing quoted placeholders (or validating unquoted usage) before substitution so queries like '$agentDid' don’t end up double-quoted.

🔧 Possible refinement (handle quoted placeholders first)
-        let query_with_context = query
-            .replace("$agentDid", &format!("'{}'", agent_did))
-            .replace("$perspectiveId", &format!("'{}'", perspective_id));
+        let query_with_context = query
+            .replace("'$agentDid'", &format!("'{}'", agent_did))
+            .replace("$agentDid", &format!("'{}'", agent_did))
+            .replace("'$perspectiveId'", &format!("'{}'", perspective_id))
+            .replace("$perspectiveId", &format!("'{}'", perspective_id));

2611-2642: Consider an O(n) diff for large match sets.

The nested scan is O(n²) per notification. If matches can grow, precomputing a lookup set (e.g., via canonical JSON) will scale better.

♻️ Sketch of an O(n) approach
-                    if let Some(before_matches) = before.get(notification) {
-                        // Find matches that exist in "after" but not in "before"
-                        after_matches
-                            .iter()
-                            .filter(|after_match| {
-                                !before_matches.iter().any(|before_match| {
-                                    before_match == *after_match
-                                })
-                            })
-                            .cloned()
-                            .collect()
-                    } else {
+                    if let Some(before_matches) = before.get(notification) {
+                        let before_set: std::collections::HashSet<String> = before_matches
+                            .iter()
+                            .filter_map(|m| serde_json::to_string(m).ok())
+                            .collect();
+                        after_matches
+                            .iter()
+                            .filter(|after_match| {
+                                serde_json::to_string(after_match)
+                                    .map(|s| !before_set.contains(&s))
+                                    .unwrap_or(true)
+                            })
+                            .cloned()
+                            .collect()
+                    } else {
rust-executor/src/surreal_service/mod.rs (1)

338-360: Consider handling array index access in fn::json_path.

The current implementation only handles dot-separated object property access. If users need to access array elements (e.g., items.0.name or items[0].name), this would silently return null.

This may be intentional for the current use case, but worth documenting or extending if array traversal is needed.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@rust-executor/src/db.rs`:
- Around line 532-540: The escape handling in the loop over
query_trimmed.chars() incorrectly treats doubled backslashes as escaping the
next character; update the logic around the escaped boolean in that loop (the
for c in query_trimmed.chars() block) so that encountering a '\\' toggles
escaped (e.g., if '\\' and not currently escaped set escaped=true, but if '\\'
and currently escaped set escaped=false and treat it as a literal backslash)
instead of always marking the next char escaped; alternatively, move/limit
escape-state tracking to only apply when inside a string/quote context (track
inside_string and only flip escaped while inside), ensuring escaped is cleared
appropriately after consuming an escaped character.

In `@rust-executor/src/perspectives/perspective_instance.rs`:
- Around line 2299-2348: In surreal_query_notification, the code injects raw
agent_did and perspective_id into query_with_context which breaks SurrealQL if
those strings contain ' or \ and can cause unintended token substitution; fix by
escaping backslashes and single quotes in agent_did and perspective_id (e.g.,
replace '\' -> '\\' and '\'' -> '\\' ), then wrap each escaped value in single
quotes and perform targeted replacement of the exact tokens "$agentDid" and
"$perspectiveId" into query_with_context (use a small helper like
escape_and_quote(value) and call it for agent_did and perspective_id before
replacing to avoid malformed or accidental replacements).
🧹 Nitpick comments (1)
rust-executor/src/db.rs (1)

480-523: Mutating operation check doesn't exclude string literals.

Queries containing mutating keywords inside string literals (e.g., SELECT * FROM links WHERE data = 'INSERT example') will be incorrectly rejected. This is a conservative approach that prevents injection but may cause false positives.

Consider documenting this limitation or implementing string-literal-aware scanning if this becomes problematic for users.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@rust-executor/src/db.rs`:
- Around line 480-523: The mutating operation scan (mutating_operations loop
over query_upper) falsely flags keywords that occur inside string literals;
modify the scan to reuse the string-literal tracking used later (in_string /
escaped) so matches inside quotes are ignored. Concretely, replace the current
find-based loop with a single-pass over the query characters that maintains
in_string and escaped state (same logic as the parentheses validator), and only
consider an operation match when not in_string and both before_ok/after_ok
word-boundary checks hold; update search_pos/absolute_pos logic to use the same
indexed iteration over query_upper so operations inside quoted sections are
skipped.
- Around line 486-512: The boundary checks use query_upper.chars().nth(...)
while find(...) returns byte offsets, causing misaligned checks for multi-byte
UTF‑8; update the checks in the mutating operations loop (the block using
variables search_pos, absolute_pos, after_pos and query_upper inside the for
operation in &mutating_operations) to operate on bytes: call
query_upper.as_bytes() and index into that byte slice using absolute_pos and
after_pos to read the surrounding byte values, then compare against ASCII
separator bytes (space, tab, newline, carriage return, semicolon, open paren)
and handle bounds as before so the byte-based positions returned by find() match
the checked characters.

@lucksus lucksus merged commit aa27045 into dev Jan 23, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant