feat: adaptive learning system — skill synthesis, user profiles, session search#1187
feat: adaptive learning system — skill synthesis, user profiles, session search#1187smkrv wants to merge 15 commits intonearai:mainfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a significant 'level-up' to the project by integrating an adaptive learning system. This system empowers the agent to autonomously learn from its interactions, personalize user experiences through encrypted profiles, and efficiently retrieve past conversational context. The changes are designed to make the agent smarter and more adaptable, while rigorously maintaining security and privacy standards through multiple layers of validation and encryption. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This is an impressive and ambitious pull request that introduces a sophisticated adaptive learning system. The architecture is well-designed with clear separation of concerns between skill synthesis, user profiles, and session search. The focus on security is particularly noteworthy, with multiple layers of defense including content wrapping, threat scanning, structural validation, and strong encryption. The code is high quality, well-documented, and includes good test coverage. I've identified a couple of potential race conditions that could lead to data corruption or bypassing resource limits, but overall this is an excellent contribution.
|
Re: TOCTOU in Good catch on the race condition — this is a real concern at scale. Here's the current mitigation and why a per-user mutex isn't the right fit yet: Current state:
Why not per-user mutex:
What we do have:
Happy to add the atomic DB-level constraint as a follow-up if the team thinks the current guardrails aren't sufficient. |
zmanian
left a comment
There was a problem hiding this comment.
Review: REQUEST CHANGES
Strong implementation quality and well-thought-out security architecture across ~4,500 lines. Three independent subsystems (skill synthesis, user profiles, session search), all disabled by default. But functional gaps need addressing.
What's Good
- Architecture: Standalone store traits (
SessionSearchStore,UserProfileStore,LearningStore) rather than extending theDatabasesupertrait -- excellent decision, avoids breaking existing implementations. - Dual-backend: Both PostgreSQL + libSQL correctly implemented with proper dialect handling (FTS5 with sync triggers vs tsvector GENERATED ALWAYS).
- Security defense-in-depth:
wrap_external_content()on user data,scan_content_for_threats()with NFKC normalization + zero-width stripping, per-fact HKDF-derived AES-256-GCM encryption,ApprovalRequirement::Alwaysprevents LLM self-approval,SkillSource::SynthesizedforcesInstalledtrust level. - Code quality: No
.unwrap()in production code, properthiserrortypes, good unit test coverage (46+ tests).
Critical
C1: profile_clear is N+1 queries
ProfileClearTool::execute() loads all facts then calls remove_fact() one-by-one. With 100 facts (the max), that's 101 DB round-trips. Add a delete_profile_facts_by_category() method to UserProfileStore.
C2: store_fact TOCTOU on max_facts limit
Count check and insert aren't atomic. Concurrent distillation tasks (via tokio::spawn) could race past the limit. Fix with a per-user lock or DB-level constraint.
Important
I1: Session summaries are never populated
upsert_session_summary() exists in both backends but nothing calls it anywhere in the diff. The session_search tool will always return empty results. Either add the summary writer (e.g., on session end/compaction) or remove the tool and note it as "coming in a follow-up."
I2: Dead code from path traversal fix
is_safe_skill_name() and old auto_dir.join(&skill.skill_name) remain after switching to UUID-based directory names. Clean up.
I3: FNV-1a content hash isn't collision-resistant
Used for skill dedup unique index (idx_synthesized_skills_dedup). Two different skill contents could collide and silently fail to record. Replace with SHA-256 (already a transitive dep).
I4: No DB integration tests
Good unit tests but no round-trip tests for the 3 new store traits. Project convention (src/db/CLAUDE.md) calls for integration tests using LibSqlBackend::new_memory().
I5: Unbounded tokio::spawn for distillation
No JoinHandle tracking or cancellation token. Orphaned on agent shutdown (bounded by 120s LLM timeout, but still a leak).
Suggestions
Cargo.lockcontains unrelatedwindows-sysversion downgrades -- split out or confirm intentional..gitignoreaddingdocs/plans/would affect all maintainers -- discuss whether this is desired.- Consider splitting: session search (incomplete) could be a separate PR from skill synthesis + profile engine.
- Add
clear_profile(user_id, agent_id)method toUserProfileStorefor GDPR "forget me" support.
…HA-256, tests Addresses zmanian's CHANGES_REQUESTED review on PR nearai#1187: Critical: - C1: Replace N+1 profile_clear with batch DELETE via new `delete_profile_facts_by_category()` and `clear_profile()` methods on UserProfileStore trait (both PostgreSQL and libSQL backends) - C2: Fix TOCTOU race on max_facts limit with per-user tokio::Mutex in EncryptedProfileEngine (safety scan moved outside lock) Important: - I1: Defer session_search tool registration (DB infra stays, tool awaits session summary writer in follow-up PR) - I3: Replace FNV-1a 64-bit hash with SHA-256 for collision-resistant content deduplication in skill synthesis worker - I4: Add 19 integration tests for UserProfileStore, LearningStore, and SessionSearchStore using file-based libSQL - I5: Return JoinHandle from spawn_learning_worker, await on shutdown for graceful drain of in-flight work Suggestions: - S1: Remove broad `docs/plans/` from .gitignore - S2: Add GDPR `clear_profile()` for full user data erasure - profile_clear tool now supports 3 modes: specific fact, category, or full wipe (omit all params). Zero-count returns explicit "nothing to remove" message.
Add V13 PostgreSQL and libSQL migrations with three new tables: - session_summaries (FTS + vector search for session history) - user_profile_facts (AES-256-GCM encrypted user profile storage) - synthesized_skills (audit log with approval workflow) Add standalone store traits (SessionSearchStore, UserProfileStore, LearningStore) injected via Arc<dyn T> through DatabaseHandles — NOT added to Database supertrait to preserve backward compatibility. Implement both PostgreSQL and libSQL backends with full dialect parity including CHECK constraints, FTS5 triggers, composite indexes. Security: IDOR protection (user_id filtering on all get/update methods), FTS5 query sanitization, proper error propagation (no unwrap_or). Add LearningConfig and UserProfileConfig with resolve() pattern, integrated into Config::build() and Config::for_testing().
New src/learning/ module with: - PatternDetector: identifies synthesis candidates (complex tool chains, novel combinations, user-requested, high quality completions) - SkillValidator: structural + safety validation of generated SKILL.md (parser, threat scanning, size limits, description/name checks) - SkillSynthesizer: trait + LLM-powered implementation with system/user message separation and wrap_external_content on all user-origin data - LearningEvent: typed event for background worker integration Security additions to ironclaw_safety crate: - scan_content_for_threats(): regex-based threat pattern scanner with real NFKC unicode normalization (unicode-normalization crate) and zero-width character stripping - Patterns: prompt injection, credential exfiltration, data theft, deception, SSH backdoor, destructive commands, secret references, role manipulation, prompt delimiter injection 19 learning tests + 90 safety tests pass. Clippy 0 warnings.
New src/user_profile/ module with: - UserProfile, ProfileFact, FactCategory, FactSource types with prompt formatting (BTreeMap sorted categories, char-boundary safe truncation, capitalized headings) - EncryptedProfileEngine: trait + AES-256-GCM implementation via SecretsCrypto (HKDF per-fact salt, safety scan on key+value) - ProfileDistiller: LLM-powered fact extraction from conversations with system/user message separation, wrap_external_content on both user messages AND existing profile facts, key format validation (alphanumeric+underscore, max 64), value length limit (512 chars), per-run fact cap (5), byte limit on input (4KB), threat scanning Security: existing profile facts wrapped before LLM injection, first message always included even if exceeding byte limit, fact key/value validation prevents injection through stored data. 11 tests (types roundtrip, truncation, parsing, safety rejection). Clippy 0 warnings.
New tools for the adaptive learning system: - session_search: FTS search over past conversation summaries - skill_list_pending: list synthesized skills awaiting approval - skill_approve: approve/reject synthesized skills with file write Security hardening: - Path traversal prevention: is_safe_skill_name() validation + starts_with() defense-in-depth check on resolved path - PROTECTED_TOOL_NAMES: all 3 tools registered to prevent shadowing - ApprovalRequirement::Always only for "accept", "reject" is safe - tokio::fs for non-blocking file I/O (not std::fs) - Errors propagated to user (no silent failure on disk write) Added register_learning_tools() to ToolRegistry.
SkillSource::Synthesized variant for auto-synthesized skills: - discover_all() scans installed_skills/auto/ with Installed trust - validate_remove() allows removal of synthesized skills - CLI format_source() displays "synthesized" label Profile management tools (profile_view, profile_edit, profile_clear): - View all learned facts with category/key/value/confidence - Manually add facts with Explicit source and max confidence - Remove specific facts by category/key Dispatcher integration: - AgentDeps extended with learning_tx, profile_engine, user_profile_config - Profile injection into system prompt via Reasoning::append_system_context() - Double safety: sanitize_tool_output + scan_content_for_threats on composed text - All AgentDeps construction sites updated (main, testing, dispatcher tests, test rigs) PROTECTED_TOOL_NAMES updated with all 6 new tools. Clippy 0 warnings. 3042 tests pass.
Background worker (spawn_learning_worker): - Bounded mpsc channel (capacity 32) for LearningEvent reception - PatternDetector evaluation → LLM synthesis → SkillValidator check - Skills failing validation are discarded (not written to DB) - Graceful shutdown on sender drop Heuristic quality_score (no LLM call required): - FNV-1a 64-bit hash for content deduplication - Base 50/20 + diversity/turn/volume bonuses, capped at 100 Full wiring in main.rs and app.rs: - AppComponents extended with learning store handles - learning_tx spawned when LEARNING_ENABLED=true - profile_engine created when USER_PROFILE_ENABLED=true + master key Security hardening: - skill_approve: blocks skills with safety_scan_passed=false - skill_approve: re-validates content before writing to disk - format_for_prompt: strips newlines from fact values - profile_edit: redacts value from ToolOutput (SSE/log safe) - FTS query length limit (512 chars) prevents DoS 4 unit tests for quality scoring. Clippy 0 warnings.
…tcher events SkillStatus enum (replaces string literals): - Typed enum with Pending/Accepted/Rejected variants - as_str()/from_str_opt() for DB serialization - All trait signatures, implementations, tools, and worker updated libSQL transaction safety: - upsert_session_summary: INSERT + SELECT-back wrapped in transaction - upsert_profile_fact: INSERT + SELECT-back wrapped in transaction Dispatcher integration: - LearningEvent sent after successful turn via try_send (non-blocking) - Tool names extracted from reasoning context tool_calls - ProfileDistiller auto-called in background tokio::spawn task - Profile facts extracted and stored after each turn PostgreSQL FTS query validation (parity with libSQL): - Empty query rejection and 512-char length limit Clippy 0 warnings. 3044 tests pass.
Gemini Code Assist correctly identified that the 8-character hash prefix used for auto-skill directory names could collide. Using the skill's UUID (guaranteed unique) as the directory name eliminates this risk. The skill's activation name is read from SKILL.md frontmatter, not the directory name. Removed is_safe_skill_name() — no longer needed since UUID paths are inherently safe (no traversal possible).
d42a098 to
8812f11
Compare
…HA-256, tests Addresses zmanian's CHANGES_REQUESTED review on PR nearai#1187: Critical: - C1: Replace N+1 profile_clear with batch DELETE via new `delete_profile_facts_by_category()` and `clear_profile()` methods on UserProfileStore trait (both PostgreSQL and libSQL backends) - C2: Fix TOCTOU race on max_facts limit with per-user tokio::Mutex in EncryptedProfileEngine (safety scan moved outside lock) Important: - I1: Defer session_search tool registration (DB infra stays, tool awaits session summary writer in follow-up PR) - I3: Replace FNV-1a 64-bit hash with SHA-256 for collision-resistant content deduplication in skill synthesis worker - I4: Add 19 integration tests for UserProfileStore, LearningStore, and SessionSearchStore using file-based libSQL - I5: Return JoinHandle from spawn_learning_worker, await on shutdown for graceful drain of in-flight work Suggestions: - S1: Remove broad `docs/plans/` from .gitignore - S2: Add GDPR `clear_profile()` for full user data erasure - profile_clear tool now supports 3 modes: specific fact, category, or full wipe (omit all params). Zero-count returns explicit "nothing to remove" message.
…HA-256, tests Addresses zmanian's CHANGES_REQUESTED review on PR nearai#1187: Critical: - C1: Replace N+1 profile_clear with batch DELETE via new `delete_profile_facts_by_category()` and `clear_profile()` methods on UserProfileStore trait (both PostgreSQL and libSQL backends) - C2: Fix TOCTOU race on max_facts limit with per-user tokio::Mutex in EncryptedProfileEngine (safety scan moved outside lock) Important: - I1: Defer session_search tool registration (DB infra stays, tool awaits session summary writer in follow-up PR) - I3: Replace FNV-1a 64-bit hash with SHA-256 for collision-resistant content deduplication in skill synthesis worker - I4: Add 19 integration tests for UserProfileStore, LearningStore, and SessionSearchStore using file-based libSQL - I5: Return JoinHandle from spawn_learning_worker, await on shutdown for graceful drain of in-flight work Suggestions: - S1: Remove broad `docs/plans/` from .gitignore - S2: Add GDPR `clear_profile()` for full user data erasure - profile_clear tool now supports 3 modes: specific fact, category, or full wipe (omit all params). Zero-count returns explicit "nothing to remove" message.
8812f11 to
dec5119
Compare
…HA-256, tests Addresses zmanian's CHANGES_REQUESTED review on PR nearai#1187: Critical: - C1: Replace N+1 profile_clear with batch DELETE via new `delete_profile_facts_by_category()` and `clear_profile()` methods on UserProfileStore trait (both PostgreSQL and libSQL backends) - C2: Fix TOCTOU race on max_facts limit with per-user tokio::Mutex in EncryptedProfileEngine (safety scan moved outside lock) Important: - I1: Defer session_search tool registration (DB infra stays, tool awaits session summary writer in follow-up PR) - I3: Replace FNV-1a 64-bit hash with SHA-256 for collision-resistant content deduplication in skill synthesis worker - I4: Add 19 integration tests for UserProfileStore, LearningStore, and SessionSearchStore using file-based libSQL - I5: Return JoinHandle from spawn_learning_worker, await on shutdown for graceful drain of in-flight work Suggestions: - S1: Remove broad `docs/plans/` from .gitignore - S2: Add GDPR `clear_profile()` for full user data erasure - profile_clear tool now supports 3 modes: specific fact, category, or full wipe (omit all params). Zero-count returns explicit "nothing to remove" message.
dec5119 to
dc26ad1
Compare
smkrv
left a comment
There was a problem hiding this comment.
Thanks for the thorough review @zmanian — really appreciate the level of detail here. Every point was valid and I've addressed all of them in dc26ad1 (rebased on current main).
Critical
C1: profile_clear N+1 queries — Fixed. Added delete_profile_facts_by_category() and clear_profile() to the UserProfileStore trait with single-query batch DELETE in both PostgreSQL and libSQL backends. ProfileClearTool now calls the batch method directly — no more load-then-loop. Also upgraded the tool to support 3 modes: specific fact (category + key), category wipe (category only), and full erasure (omit both — the GDPR "forget me" you suggested in S2).
C2: store_fact TOCTOU on max_facts — Fixed with per-user tokio::Mutex in EncryptedProfileEngine. The check-and-write is now serialized per user_id:agent_id pair. Safety scan runs outside the lock (no DB access needed), lock is acquired only for the count check + upsert. The outer std::sync::Mutex on the lock map is held only briefly for HashMap get_or_insert — no .await under it. Added a NOTE about unbounded map growth being acceptable for single-user but needing LRU eviction before multi-user.
Important
I1: Session summaries never populated — Removed SessionSearchTool from registration. The DB schema, store traits, and implementations stay in place (they have 19 integration tests pinning their behavior) — ready for a follow-up PR that wires the summary writer on session end/compaction. Comment in register_learning_tools() documents the intent.
I2: Dead code from path traversal fix — Already cleaned up in 088f5ee (now 548ed38 after rebase). is_safe_skill_name() was removed, directory uses skill_id.to_string() (UUID).
I3: FNV-1a → SHA-256 — Replaced content_hash() with sha2::Sha256. Full 64-char hex string stored in skill_content_hash for the dedup unique index. The 8-char prefix in auto-{hash[..8]} is cosmetic (skill display name only).
I4: No DB integration tests — Added 19 tests in tests/learning_store_integration.rs covering all 3 store traits: UserProfileStore (10 tests including upsert, conflict, category filter, batch delete, clear, user isolation, edge cases), LearningStore (5 tests including IDOR protection, content hash dedup, status transitions), SessionSearchStore (4 tests including FTS search, upsert-updates-existing, user isolation). Uses file-based libSQL with tempfile.
I5: Unbounded tokio::spawn — spawn_learning_worker() now returns (Sender, JoinHandle). The handle is awaited in the shutdown sequence after agent.run() returns, ensuring in-flight synthesis completes before exit.
Suggestions
Cargo.lock / windows-sys — Resolved by rebase onto current main (139 commits ahead). The uds_windows bump was auto-dropped as already upstream.
.gitignore docs/plans/ — Removed. Only docs/architecture/adaptive-learning/ remains (local-only design docs).
Split session search — Agreed it's incomplete without the summary writer. Kept the infrastructure (schema + traits + tests) since it shares the V14 migration, but deregistered the tool. Follow-up PR will wire the writer.
GDPR clear_profile — Done as part of C1. clear_profile(user_id, agent_id) added to trait + both backends + exposed as mode 3 in profile_clear tool. Note: currently only covers user_profile_facts — full erasure across session_summaries + synthesized_skills is tracked as follow-up.
CI is fully green (26/26 checks passing). Let me know if anything else needs attention!
…n conflict Upstream merged V13__owner_scope_notify_targets.sql while our branch had V13__learning_system.sql. Refinery crashes on duplicate version numbers. Rename to V14 to match the libSQL incremental migration (already V14 after rebase). Note: existing databases that ran V13 as learning_system need: UPDATE __refinery_schema_history SET version = 14 WHERE name = 'learning_system';
|
Super interesting, been thinking about similar. Need to figure out how to benchmark this. Few comments/questions:
Also high level it feels like there needs to be a retrospective after skills were used - to suggest improving existing skills. |
|
Thanks! Glad this resonates. Profiler merge - yeah, makes total sense to wire onboarding profiler into UserProfileEngine as the initial seed. Can do as a follow-up once this lands. Why encrypt profiles - workspace memory is stuff the agent writes on its own. Profile is structured personal data the LLM extracts (timezone, expertise, habits) - felt like it deserves a separate layer even if DB is compromised. But honestly if the team thinks it's overkill, happy to drop encryption and treat it like regular workspace data, the rest of the pipeline doesn't care either way. Skill retrospective - 100%, that's exactly where I want to take this. I'm prepping a phase 2 that would close the feedback loop - track whether synthesized skills actually get picked up by the scoring pipeline, flag underperformers, suggest improvements. If this PR lands I'll spec it out and open a follow-up. Re benchmarking: happy to help set something up once we figure out what "better" looks like here :D |
zmanian
left a comment
There was a problem hiding this comment.
Ambitious feature with solid security design, but the scope and a few critical issues need addressing before merge.
Strengths:
- Defense-in-depth security: 6 layers (content wrapping, threat scanning, structural validation, AES-256-GCM encryption, trust boundaries, user approval gate)
- Dual-backend support (PostgreSQL + libSQL V13 migration)
- Unicode NFKC normalization + zero-width character stripping prevents homoglyph/bypass attacks
- Synthesized skills forced to "Installed" (read-only tools) -- correct trust model
- 648-line integration test validates both backends
Required changes:
-
CRITICAL: Silent event loss --
try_send()on the learning event channel drops events when full (capacity 32) with no logging or alerting. Usesend().awaitwith backpressure, or at minimum log warnings when events are dropped. -
Profile key derivation -- HKDF info parameter should include
user_idso per-user profiles get distinct derived keys:HKDF-Expand(PRK, info=user_id||fact_key||salt) -
Unbounded task spawning -- Profile distiller spawns
tokio::spawnevery N turns without awaiting or tracking handles. Under load, tasks accumulate. Add a shutdown hook or track JoinHandles. -
Scope: strongly recommend splitting into 3-5 PRs:
- DB schema + store trait definitions
- Skill synthesis + validation
- User profile engine + encryption
- Session search
- Tool wiring + e2e tests
-
Vector embedding column lacks dimension constraint -- document why unbounded is safe or add constraint.
Non-blocking suggestions:
- Skill synthesis heuristic thresholds (3+ tool calls, 75+ quality score) should be configurable
- Add e2e tests for the skill approval workflow and profile injection into prompts
|
Maintainer note: This PR introduces significant architectural additions to the codebase. Before it can be considered for merge, it will need collective discussion among maintainers to align on the design direction, scope, and integration approach. The code review feedback above covers implementation-level concerns, but the broader question of whether this feature belongs in core (vs. a plugin/extension) and how it fits the project's architecture roadmap needs to be discussed first. |
…aration, bounded distillation
Addresses zmanian's second review (review #3986204426):
1. CRITICAL: try_send() silent event loss — now logs warning when learning
events are dropped due to full channel (capacity 32).
2. HKDF key derivation — profile encryption now includes user_id and
agent_id in the HKDF info parameter for proper domain separation:
HKDF-Expand(PRK, info="ironclaw-profile-v1:{user_id}:{agent_id}").
Added encrypt_with_info/decrypt_with_info to SecretsCrypto. Existing
encrypt/decrypt unchanged (backward compatible for secrets store).
3. Unbounded task spawning — profile distiller now bounded by a Semaphore(1)
in AgentDeps. If a previous distillation is still running, the new one
is skipped with a debug log. Prevents task accumulation under load.
4. Vector embedding dimension — expanded migration comment documenting why
the column is intentionally unbounded (matches V9 workspace pattern,
supports 768-3072 dim models, sequential search fast enough).
5. Configurable thresholds — already implemented via LearningConfig env
vars (LEARNING_MIN_TOOL_CALLS, LEARNING_MIN_UNIQUE_TOOLS, etc.).
… for encrypt_with_info
…uity Replace colon-delimited format with null-terminated prefix + length-prefixed user_id and agent_id. Prevents theoretical key collision when user_id or agent_id contain the delimiter character.
…stead of panic Replace `as u32` silent truncation with `u32::try_from().map_err()?` to comply with project rule: no .expect() in production code. Both callers already return Result, so error propagation is natural.
|
Thanks for the note @zmanian. Totally understand the concern about scope, happy to talk through this. Re core vs plugin/extension - the learning system is pretty deeply coupled to internal APIs that aren't accessible from the plugin boundary:
None of these have plugin-facing equivalents today. Moving this to a plugin would require exposing a bunch of new extension points that don't exist yet, and the result would be a "plugin" that's really just core code behind an abstraction barrier that only it uses. Felt like the wrong tradeoff. That said, totally open to splitting the PR to make review easier. I was thinking something like:
Session search is already deregistered (summaries aren't populated yet), so it goes naturally with the schema PR. Would this breakdown work for you? Happy to adjust based on what makes the review most manageable. In the meantime I pushed fixes for all 5 items from your second review (event logging, HKDF domain separation with user_id in length-prefixed info, bounded distillation via semaphore, embedding dimension documented, thresholds were already configurable via env vars). |
…on in our TLS usage)
|
Hey @smkrv, this is my jam! I spent a considerable amount of time thinking about this here https://github.com/serrrfirat/synapse-daemon. I think your design overall is great! Couple of questions:
Hermes agent https://github.com/0xNyk/awesome-hermes-agent also does something similar, would be good to check that out. Would love to contribute to this feature, I think this is a big differentiator! |
|
Hey @serrrfirat! Checked out synapse-daemon, our pipelines have more in common than I expected. Heuristics vs LLM - you're right, Deduplication - biggest gap honestly. We only have SHA-256 exact-match dedup, no semantic grouping. The embedding infra is partially there ( Cognitive load - 100% make-or-break. Right now it's just blunt caps ( Re Hermes - the dojo concept of auto-iterating underperforming skills is exactly the feedback loop ilblackdragon was asking about above, phase 2 territory. Happy to walk through the codebase if you want to pick something up! |
Review: Integration Path with v2 EngineGreat work on this PR — the adaptive learning system has solid ideas and strong implementation, especially the encryption, FTS, and validation layers. Here's some feedback on how this could align with the v2 engine direction. What's excellent here
Integration concern: v1 vs v2 targetingThe main challenge is that this targets v1 integration points (
Suggested integration pathRather than merging the v1 hooks (which will become dead code as v2 takes over), we'd love to see the novel contributions ported into v2 primitives:
Concrete next steps
This way the implementation stays relevant as we migrate to v2, and the strongest parts (crypto, FTS, validation) become shared infrastructure rather than v1-only features. Would be happy to pair on the v2 integration if you're interested — the core logic is solid and we'd love to see it land in a durable form. Keep going! 🚀 |
Hey folks, don't think I've lost my mind here, but I'd like to propose this level-up PR with an adaptive learning mechanism that will make the whole project a bit smarter and the routine a bit easier.
I've tried to test this as thoroughly as I could - ran it through 30+ full review-and-fix iterations across Opus and Codex (not code generation - testing, reviewing, and fixing), with multi-agent review panels (Architecture, Code Style, Security, Product Owner, Lead Dev, UI/UX, CPO, and others - all Blocking/High/Critical findings resolved), and also tested locally on my own agents in my own environment.
But I could have missed things (I definitely have :D), including edge cases that only show up under real-world conditions at scale.
Fresh eyes are very welcome.
Would really appreciate help with testing and reviewing all of this - no matter how good LLM-powered code generation and review gets, pulling something like this off solo is tough :) Would be awesome to see this feature get some life and evolve further as a real level-up for the project. And yes,
I swear I'm not crazy — I just went way too deep down this rabbit hole and at some point there was no turning back, so here we are.
What this does
This PR adds an Adaptive Learning System — three independently testable subsystems that let IronClaw autonomously learn from experience while architecturally guaranteeing that learned knowledge cannot leak secrets.
The Three Pillars
1. Autonomous Skill Synthesis (
src/learning/)The agent watches its own successful interactions. When it completes a complex multi-tool task (3+ tool calls with quality score ≥75, diverse tool combinations with ≥2 unique tools, or high quality completions scoring 90+), a background worker synthesizes a reusable SKILL.md from the interaction — via LLM, not templates. The generated skill passes through 6 layers of security validation before the user even sees it. The user must explicitly approve before it goes live. No auto-deployment, ever.
2. Encrypted User Profile Engine (
src/user_profile/)An evolving user model built from conversations. The
ProfileDistillerextracts durable facts (timezone, expertise, communication style) via LLM, encrypts them with AES-256-GCM (per-fact HKDF-derived keys, random 32-byte salt, master key from OS keychain orSECRETS_MASTER_KEYenv var — same crypto as credential storage), and injects them into the system prompt on next turn. The agent remembers you across sessions without storing anything in plaintext.3. Session Search (
src/db/libsql/session_search.rs,src/db/postgres.rs)FTS5 (libSQL) / tsvector (PostgreSQL) full-text search over conversation summaries with vector embedding support for future hybrid search. The
session_searchtool lets the agent recall prior work, decisions, and context from previous conversations.Architecture Diagrams
Full Architecture Pipeline
Skill Lifecycle State Machine
Runtime Sequence Flow
Security Architecture
Database Schema (V13)
Security Architecture (6 defense-in-depth layers)
All learned data flows through IronClaw's existing security primitives:
wrap_external_content()on every untrusted input before it reaches an LLM promptscan_content_for_threats()with NFKC unicode normalization, prompt injection patterns, role manipulation detectionSkillValidatorenforces YAML frontmatter, 16 KiB size limit, 256-char description limit, name/content/description threat scanningSkillSource::SynthesizedforcesInstalledtrust (read-only tools, treated as suggestions), path traversal preventionApprovalRequirement::Alwaysonskill_approvefor accept action — the LLM cannot self-approve its own generated skills (reject is safe and doesn't require approval)Architecture Decisions
SessionSearchStore,UserProfileStore,LearningStore) — NOT added toDatabasesupertrait. Injected viaArc<dyn T>. This keeps the existing DB implementations untouched and lets the system start without these features.Files Changed
src/learning/(7 files),src/user_profile/(5 files)session_search,skill_list_pending,skill_approve,profile_view,profile_edit,profile_clearsrc/db/mod.rswith implementations for both PostgreSQL and libSQLdispatcher.rs: profile injection viaappend_system_context()after system prompt is set, learning event emission after turn completionSkillSource::Synthesizedvariant insrc/skills/registry.rswith forcedInstalledtrust