fix: scan inbound messages for leaked secrets (#433)
Conversation
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request enhances the application's security posture by implementing inbound secret detection for user messages. By scanning for sensitive information like API keys and tokens before messages are processed by the LLM, it prevents potential data exposure and eliminates error loops that previously occurred when the LLM echoed secrets back. This ensures a more robust and secure interaction flow.
Activity
Code Review
This pull request introduces a security check to scan inbound user messages for secrets like API keys. This is a valuable addition to prevent secrets from being sent to the LLM, which could lead to them being echoed back and triggering outbound leak detection loops. The implementation in thread_ops.rs is correct. In safety/mod.rs, I've suggested a refactoring to simplify the logic, remove code duplication, and provide a more consistent and helpful warning message to the user.
```rust
pub fn scan_inbound_for_secrets(&self, input: &str) -> Option<String> {
    match self.leak_detector.scan_and_clean(input) {
        Ok(cleaned) if cleaned != input => {
            // The leak detector found and redacted something
            Some(
                "Your message appears to contain a secret (API key, token, or credential). \
                 For security, it was not sent to the AI. Please remove the secret and try again. \
                 To store credentials, use the setup form or `ironclaw config set <name> <value>`."
                    .to_string(),
            )
        }
        Err(_) => {
            // The leak detector blocked the content entirely
            Some(
                "Your message was blocked because it contains what appears to be a secret. \
                 Please remove credentials from your message and try again."
                    .to_string(),
            )
        }
        _ => None, // Clean input
    }
}
```
The logic for handling redacted secrets (Ok(cleaned) if cleaned != input) and blocked secrets (Err(_)) is duplicated, and the user-facing warning message for blocked secrets is less helpful than the one for redacted secrets.
To improve code clarity and provide a consistent, helpful message to the user, you can refactor this to determine if a secret was found and then return a single, unified warning message.
```rust
pub fn scan_inbound_for_secrets(&self, input: &str) -> Option<String> {
    let found_secret = match self.leak_detector.scan_and_clean(input) {
        Ok(cleaned) if cleaned != input => true, // Redacted
        Err(_) => true,                          // Blocked
        _ => false,                              // Clean or warn-only
    };
    if found_secret {
        Some(
            "Your message appears to contain a secret (API key, token, or credential). \
             For security, it was not sent to the AI. Please remove the secret and try again. \
             To store credentials, use the setup form or `ironclaw config set <name> <value>`."
                .to_string(),
        )
    } else {
        None
    }
}
```

Both the detected-secret and error branches now show the same actionable message guiding users to remove secrets and use the config system instead. Addresses Gemini review feedback on PR nearai#433.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
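To make the suggested refactor concrete, here is a minimal, self-contained sketch of the unified-message version. `StubLeakDetector` stands in for the real leak detector (its `scan_and_clean` signature is an assumption inferred from the diff: `Ok(redacted_text)` on success, `Err` when content is hard-blocked); only `scan_inbound_for_secrets` mirrors the suggested change.

```rust
// Stub standing in for the real leak detector; its API is assumed, not
// taken from the actual ironclaw codebase.
struct StubLeakDetector;

impl StubLeakDetector {
    fn scan_and_clean(&self, input: &str) -> Result<String, ()> {
        if input.contains("BLOCKED") {
            return Err(()); // content blocked entirely
        }
        // Redact anything that looks like an OpenAI-style project key.
        Ok(input.replace("sk-proj-abc123", "[REDACTED]"))
    }
}

struct SafetyLayer {
    leak_detector: StubLeakDetector,
}

impl SafetyLayer {
    // Unified version: both the redacted and blocked cases collapse into a
    // single boolean, so the user always sees the same actionable warning.
    fn scan_inbound_for_secrets(&self, input: &str) -> Option<String> {
        let found_secret = match self.leak_detector.scan_and_clean(input) {
            Ok(cleaned) if cleaned != input => true, // redacted
            Err(_) => true,                          // blocked
            _ => false,                              // clean or warn-only
        };
        found_secret.then(|| {
            "Your message appears to contain a secret (API key, token, or credential). \
             For security, it was not sent to the AI. Please remove the secret and try again. \
             To store credentials, use the setup form or `ironclaw config set <name> <value>`."
                .to_string()
        })
    }
}

fn main() {
    let safety = SafetyLayer { leak_detector: StubLeakDetector };
    // Redacted, blocked, and clean inputs all route through one code path.
    assert!(safety.scan_inbound_for_secrets("my key is sk-proj-abc123").is_some());
    assert!(safety.scan_inbound_for_secrets("BLOCKED payload").is_some());
    assert!(safety.scan_inbound_for_secrets("hello world").is_none());
    println!("all inbound scan cases behave as expected");
}
```

Using `bool::then` keeps the single warning string in one place, which is the point of the review suggestion: the redacted and blocked branches can no longer drift apart.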
fix: scan inbound messages for leaked secrets before LLM processing (nearai#393)

Add scan_inbound_for_secrets() to SafetyLayer that reuses the existing leak detector on user input. Wire it into thread_ops.rs after the policy check so messages containing API keys or tokens are rejected early, preventing the LLM from echoing them back and triggering outbound leak-detection error loops.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Force-pushed from 1a618db to 3b8b042
* fix: scan inbound messages for leaked secrets before LLM processing (nearai#393)

  Add scan_inbound_for_secrets() to SafetyLayer that reuses the existing leak detector on user input. Wire it into thread_ops.rs after the policy check so messages containing API keys or tokens are rejected early, preventing the LLM from echoing them back and triggering outbound leak-detection error loops.

* fix: unify inbound secret scan warning messages

  Both the detected-secret and error branches now show the same actionable message guiding users to remove secrets and use the config system instead. Addresses Gemini review feedback on PR nearai#433.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Summary
* Add `scan_inbound_for_secrets()` to `SafetyLayer` that reuses the existing `LeakDetector` on user input
* Wire it into `thread_ops.rs` after the policy check to reject messages containing API keys/tokens before they reach the LLM

Test plan

* Send a message containing a fake API key (`sk-proj-abc123...`) and verify it is rejected with a user-friendly warning
* Run the safety tests (`cargo test safety`)

🤖 Generated with Claude Code
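The ordering the summary describes (existing policy check first, then the inbound secret scan, and only then the LLM) can be sketched as follows. This is a hypothetical illustration, not the actual `thread_ops.rs` code: `policy_check`, `handle_inbound`, the `Action` enum, and the trivial `SafetyLayer` stub are all invented for the example; only the rejection ordering reflects the PR description.

```rust
// Illustrative wiring sketch; names other than scan_inbound_for_secrets
// are assumptions, not the real ironclaw API.

#[derive(Debug)]
enum Action {
    RejectWithWarning(String), // returned to the user, never sent onward
    ForwardToLlm(String),      // clean message, proceeds as before
}

struct SafetyLayer;

impl SafetyLayer {
    // Stubbed detector: flags anything resembling an OpenAI-style key.
    fn scan_inbound_for_secrets(&self, input: &str) -> Option<String> {
        input
            .contains("sk-proj-")
            .then(|| "Your message appears to contain a secret.".to_string())
    }
}

// Stand-in for the pre-existing policy check the PR says runs first.
fn policy_check(_msg: &str) -> Result<(), String> {
    Ok(())
}

fn handle_inbound(safety: &SafetyLayer, msg: &str) -> Result<Action, String> {
    policy_check(msg)?; // 1. existing policy check
    if let Some(warning) = safety.scan_inbound_for_secrets(msg) {
        // 2. secret detected: reject early so the LLM never sees it and
        //    cannot echo it back into the outbound leak detector.
        return Ok(Action::RejectWithWarning(warning));
    }
    // 3. clean message: forward to the LLM
    Ok(Action::ForwardToLlm(msg.to_string()))
}

fn main() {
    let safety = SafetyLayer;
    assert!(matches!(
        handle_inbound(&safety, "here is sk-proj-abc123").unwrap(),
        Action::RejectWithWarning(_)
    ));
    assert!(matches!(
        handle_inbound(&safety, "hello").unwrap(),
        Action::ForwardToLlm(_)
    ));
    println!("secret messages rejected before the LLM; clean ones forwarded");
}
```

Placing the scan after the policy check but before any LLM call is what breaks the echo loop: a rejected message produces a single warning instead of repeated outbound leak-detection errors.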