fix: reject links with invalid or empty source/target/predicate URIs#676
fix: reject links with invalid or empty source/target/predicate URIs#676
Conversation
Add Link::validate() that enforces RFC 3986 scheme presence ([a-zA-Z][a-zA-Z0-9+\-.]*:) on source, target, and predicate before any link is stored. Called at all four entry points: add_link, add_links, add_link_expression, and link_mutations. Any invalid link now returns an error immediately instead of silently corrupting the perspective graph.
📝 WalkthroughWalkthroughAdds pre-validation and URI normalization for Link data in perspective link creation and mutation flows; validation runs before signing or persistence and tests updated to use ad4m://-prefixed predicates and normalized URIs. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant PerspectiveInstance
participant Validator
participant DB
Client->>PerspectiveInstance: submit add_link / add_links / mutations (Link / LinkExpression)
PerspectiveInstance->>Validator: normalize(s) → validate(s)
alt validation fails
Validator-->>PerspectiveInstance: error
PerspectiveInstance-->>Client: return validation error (abort)
else validation succeeds
Validator-->>PerspectiveInstance: validated Link(s)
PerspectiveInstance->>DB: persist links / execute SurrealDB queries
DB-->>PerspectiveInstance: persistence confirmation
PerspectiveInstance-->>Client: success (signed expressions / result)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@rust-executor/src/types.rs`:
- Around line 111-140: In validate(), the current predicate branch allows an
empty string to bypass URI validation; update the predicate handling in the
validate() method so that when self.predicate is Some(predicate) you first
return an error if predicate.is_empty() (matching the behavior used for
source/target) and otherwise call the existing check closure; reference the
validate function, the local check closure, and the predicate variable to locate
where to add the explicit empty-string rejection.
| /// Validates that source and target are non-empty URIs (RFC 3986 scheme present), | ||
| /// and that predicate, if present, is also a valid URI. | ||
| /// | ||
| /// A valid URI must begin with a scheme: `[a-zA-Z][a-zA-Z0-9+\-.]*:` | ||
| /// Examples: `did:key:alice`, `expression://Qm...`, `neighbourhood://Qm...` | ||
| pub fn validate(&self) -> Result<(), AnyError> { | ||
| use std::sync::OnceLock; | ||
| static URI_SCHEME_RE: OnceLock<Regex> = OnceLock::new(); | ||
| let re = URI_SCHEME_RE.get_or_init(|| Regex::new(r"^[a-zA-Z][a-zA-Z0-9+\-.]*:").unwrap()); | ||
|
|
||
| let check = |field: &str, value: &str| -> Result<(), AnyError> { | ||
| if value.is_empty() { | ||
| return Err(anyhow!("Link {} must not be empty", field)); | ||
| } | ||
| if !re.is_match(value) { | ||
| return Err(anyhow!( | ||
| "Link {} is not a valid URI (must start with a scheme like 'did:', 'expression://', etc.): '{}'", | ||
| field, value | ||
| )); | ||
| } | ||
| Ok(()) | ||
| }; | ||
|
|
||
| check("source", &self.source)?; | ||
| check("target", &self.target)?; | ||
|
|
||
| if let Some(predicate) = &self.predicate { | ||
| if !predicate.is_empty() { | ||
| check("predicate", predicate)?; | ||
| } |
There was a problem hiding this comment.
Reject empty predicate explicitly if it should be invalid.
predicate == "" currently bypasses validation, so empty predicates slip through despite the “reject empty URIs” objective. If empty predicates should be rejected, add an explicit error before calling check().
🛠️ Suggested fix
- if let Some(predicate) = &self.predicate {
- if !predicate.is_empty() {
- check("predicate", predicate)?;
- }
- }
+ if let Some(predicate) = &self.predicate {
+ if predicate.is_empty() {
+ return Err(anyhow!("Link predicate must not be empty"));
+ }
+ check("predicate", predicate)?;
+ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rust-executor/src/types.rs` around lines 111 - 140, In validate(), the
current predicate branch allows an empty string to bypass URI validation; update
the predicate handling in the validate() method so that when self.predicate is
Some(predicate) you first return an error if predicate.is_empty() (matching the
behavior used for source/target) and otherwise call the existing check closure;
reference the validate function, the local check closure, and the predicate
variable to locate where to add the explicit empty-string rejection.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@rust-executor/src/perspectives/perspective_instance.rs`:
- Around line 3374-3399: The code currently calls
Literal::from_string(...).to_url() and Literal::from_number(...).to_url() with
.expect(...) which can panic; replace those .expect calls with proper error
propagation so the function returns an Err instead of panicking. In the match
arm where serde_json::Value::String(s) and Number(n) are handled, change the
logic to call to_url() and handle the Result via either the ? operator (after
converting the to_url error into the function's error type with map_err/From) or
explicitly match/map_err to return Err with a clear context; keep the existing
fallback for numbers when n.as_f64() is None. Update use sites of
Literal::from_string, Literal::from_number, and to_url to propagate errors
instead of unwrapping.
data-bot-coasys
left a comment
There was a problem hiding this comment.
Good catch from CodeRabbit on the empty predicate case. Looking at the current code:
if let Some(predicate) = &self.predicate {
if !predicate.is_empty() {
check("predicate", predicate)?;
}
}When predicate = Some(""), the outer guard !predicate.is_empty() skips validation entirely — but the check closure already handles empty strings with its own guard. The fix is to remove the redundant outer check so check always runs when predicate is Some:
if let Some(predicate) = &self.predicate {
check("predicate", predicate)?;
}This makes Some("") properly rejected (empty predicate URI is invalid), while None still means "no predicate" which is fine. Simple one-line fix — happy to push it to the branch if you want. 🤖
|
Good catch from CodeRabbit on the empty predicate case. Looking at the current code: if let Some(predicate) = &self.predicate {
if !predicate.is_empty() {
check("predicate", predicate)?;
}
}When if let Some(predicate) = &self.predicate {
check("predicate", predicate)?;
}This makes |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/js/tests/neighbourhood.ts (1)
161-180:⚠️ Potential issue | 🟠 MajorStress test needs scheme-qualified sources, targets, and predicates throughout. The early updates are correct (using
ad4m://root), but the test later adds links with scheme-less values like'alice','bob','broadcast','broadcast-loopback'as sources/targets/predicates (lines 198–200, 222–224, 431, 448, 481, 499, 514). With the new validation, those will be rejected. Update all scheme-less values to scheme URIs (e.g.,'test://alice','test://bob','test://broadcast','test://signal') or confirm they bypass validation.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/js/tests/neighbourhood.ts` around lines 161 - 180, Test uses scheme-less link components which will be rejected by the new validation; update all addLink calls and LinkQuery instances to use scheme-qualified URIs (e.g., replace 'alice'/'bob'/'broadcast'/'broadcast-loopback'/'signal' with 'test://alice','test://bob','test://broadcast','test://broadcast-loopback','test://signal') wherever alice.perspective.addLink, bob.perspective.addLink, and new LinkQuery({source/target/predicate: ...}) are used (references: alice.perspective.addLink, bob.perspective.queryLinks, LinkQuery, aliceP1.uuid, bobP1!.uuid), and run the test to confirm they now pass validation or explicitly skip validation if a bypass is intended.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@rust-executor/src/perspectives/perspective_instance.rs`:
- Around line 3374-3401: The number branch can return a bare numeric string when
n.as_f64() is None, producing an invalid URI; update the fallback in the
serde_json::Value::Number arm to wrap the fallback string with
Literal::from_string(...).to_url() (mirroring the String branch) so it yields a
valid URI with a scheme (handle/map_err the same way as the other conversions).
Locate the block around the match on value in perspective_instance.rs (the
serde_json::Value::Number case using n.as_f64(), Literal::from_number, and
value.to_string()) and replace the bare value.to_string() fallback with using
Literal::from_string(value.to_string()).to_url() and propagate errors as done
elsewhere.
---
Outside diff comments:
In `@tests/js/tests/neighbourhood.ts`:
- Around line 161-180: Test uses scheme-less link components which will be
rejected by the new validation; update all addLink calls and LinkQuery instances
to use scheme-qualified URIs (e.g., replace
'alice'/'bob'/'broadcast'/'broadcast-loopback'/'signal' with
'test://alice','test://bob','test://broadcast','test://broadcast-loopback','test://signal')
wherever alice.perspective.addLink, bob.perspective.addLink, and new
LinkQuery({source/target/predicate: ...}) are used (references:
alice.perspective.addLink, bob.perspective.queryLinks, LinkQuery, aliceP1.uuid,
bobP1!.uuid), and run the test to confirm they now pass validation or explicitly
skip validation if a bypass is intended.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/js/tests/neighbourhood.ts (1)
431-432: Signal / broadcast link data still uses non-URI strings — consider updating for consistency.These five
Linkobjects feedsendSignalandsendBroadcastU, which are ephemeral telepresence operations that bypass the four validated storage entry points (add_link,add_links,add_link_expression,link_mutations). They are therefore currently safe. However, they are the only link data in the file that haven't been migrated to URI-scheme values, and they would fail immediately if URI validation is ever extended to telepresence paths.♻️ Suggested alignment (five call sites)
- link.data = new Link({source: "alice", target: "bob", predicate: "signal"}); + link.data = new Link({source: "ad4m://alice", target: "ad4m://bob", predicate: "ad4m://signal"});- let link2 = new Link({source: "bob", target: "alice", predicate: "signal"}); + let link2 = new Link({source: "ad4m://bob", target: "ad4m://alice", predicate: "ad4m://signal"});- let link = new Link({source: "alice", target: "broadcast", predicate: "test"}); + let link = new Link({source: "ad4m://alice", target: "ad4m://broadcast", predicate: "ad4m://test"});- let link2 = new Link({source: "alice", target: "broadcast-loopback", predicate: "test"}); + let link2 = new Link({source: "ad4m://alice", target: "ad4m://broadcast-loopback", predicate: "ad4m://test"});- let link3 = new Link({source: "bob", target: "broadcast-loopback", predicate: "test"}); + let link3 = new Link({source: "ad4m://bob", target: "ad4m://broadcast-loopback", predicate: "ad4m://test"});Also applies to: 448-449, 481-481, 499-499, 514-514
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/js/tests/neighbourhood.ts` around lines 431 - 432, Update the five ephemeral Link constructions used with sendSignal/sendBroadcastU to use URI-scheme values instead of plain strings: replace Link({source: "alice", target: "bob", predicate: "signal"}) and similar occurrences with canonical URIs (e.g., "did:example:alice", "did:example:bob", "urn:predicate:signal" or your project's canonical scheme), and update the ExpressionProof constructor arguments ("sig", "key") to corresponding URI-formatted identifiers (e.g., "urn:proof:sig", "urn:key:..."). Locate the Link instantiations and ExpressionProof usages around the sendSignal and sendBroadcastU call sites and apply the same URI migration at all five places noted so they conform to your URI validation rules.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@rust-executor/src/perspectives/perspective_instance.rs`:
- Around line 3374-3403: Update the URI-scheme regex to be RFC 3986 compliant by
removing the underscore from the character class (change
r"^[a-zA-Z][a-zA-Z0-9+\-._]*:" to r"^[a-zA-Z][a-zA-Z0-9+\-\.]*:") wherever the
static URI_SCHEME_RE is defined (the one used in the match for value in
perspective_instance.rs and the one in Link::validate in types.rs), and replace
the default non-string/non-number branch in the match that returns
value.to_string() with calling Literal::from_json(value).to_url() (propagating
the existing map_err pattern) so non-string JSON values are encoded safely; also
apply the same regex fix inside Link::validate to keep validation consistent
with RFC 3986.
---
Nitpick comments:
In `@tests/js/tests/neighbourhood.ts`:
- Around line 431-432: Update the five ephemeral Link constructions used with
sendSignal/sendBroadcastU to use URI-scheme values instead of plain strings:
replace Link({source: "alice", target: "bob", predicate: "signal"}) and similar
occurrences with canonical URIs (e.g., "did:example:alice", "did:example:bob",
"urn:predicate:signal" or your project's canonical scheme), and update the
ExpressionProof constructor arguments ("sig", "key") to corresponding
URI-formatted identifiers (e.g., "urn:proof:sig", "urn:key:..."). Locate the
Link instantiations and ExpressionProof usages around the sendSignal and
sendBroadcastU call sites and apply the same URI migration at all five places
noted so they conform to your URI validation rules.
Add Link::validate() that enforces RFC 3986 scheme presence
([a-zA-Z][a-zA-Z0-9+-.]*:) on source, target, and predicate before
any link is stored. Called at all four entry points: add_link,
add_links, add_link_expression, and link_mutations. Any invalid link
now returns an error immediately instead of silently corrupting the
perspective graph.
Summary by CodeRabbit
Bug Fixes
Tests