Acknowledgements

Loki Mode stands on the shoulders of giants. This project incorporates research, patterns, and insights from the leading AI labs, academic institutions, and practitioners in the field.


Research Labs

Anthropic

Loki Mode is built for Claude and incorporates Anthropic's cutting-edge research on AI safety and agent development.

| Paper/Resource | Contribution to Loki Mode |
| --- | --- |
| Constitutional AI: Harmlessness from AI Feedback | Self-critique against principles, revision workflow |
| Building Effective Agents | Evaluator-optimizer pattern, parallelization, routing |
| Claude Code Best Practices | Explore-Plan-Code workflow, context management |
| Simple Probes Can Catch Sleeper Agents | Defection probes, anomaly detection patterns |
| Alignment Faking in Large Language Models | Monitoring for strategic compliance |
| Visible Extended Thinking | Thinking levels (think, think hard, ultrathink) |
| Computer Use Safety | Safe autonomous operation patterns |
| Sabotage Evaluations | Safety evaluation methodology |
| Effective Harnesses for Long-Running Agents | One-feature-at-a-time pattern, Playwright MCP for E2E |
| Claude Agent SDK Overview | Task tool, subagents, resume parameter, hooks |

Google DeepMind

DeepMind's research on world models, hierarchical reasoning, and scalable oversight informs Loki Mode's architecture.

| Paper/Resource | Contribution to Loki Mode |
| --- | --- |
| SIMA 2: Generalist AI Agent | Self-improvement loop, reward model training |
| Gemini Robotics 1.5 | Hierarchical reasoning (planner + executor) |
| Dreamer 4: World Model Training | Simulation-first testing, safe exploration |
| Genie 3: World Models | World model architecture patterns |
| Scalable AI Safety via Doubly-Efficient Debate | Debate-based verification for critical changes |
| Human-AI Complementarity for Amplified Oversight | AI-assisted human supervision |
| Technical AGI Safety Approach | Safety-first agent design |

OpenAI

OpenAI's Agents SDK and Deep Research work provide foundational patterns for agent orchestration.

| Paper/Resource | Contribution to Loki Mode |
| --- | --- |
| Agents SDK Documentation | Tracing spans, guardrails, tripwires |
| A Practical Guide to Building Agents | Agent architecture best practices |
| Building Agents Track | Development patterns, handoff callbacks |
| AGENTS.md Specification | Standardized agent instructions |
| Introducing Deep Research | Adaptive planning, backtracking |
| Deep Research System Card | Safety considerations for research agents |
| Introducing o3 and o4-mini | Reasoning model guidance |
| Reasoning Best Practices | Extended thinking patterns |
| Chain of Thought Monitoring | Reasoning trace monitoring |
| Agent Builder Safety | Safety patterns for agent builders |
| Computer-Using Agent | Computer use patterns |
| Agentic AI Foundation | Industry standards, interoperability |

Amazon Web Services (AWS)

AWS Bedrock's multi-agent collaboration patterns inform Loki Mode's routing and dispatch strategies.

| Paper/Resource | Contribution to Loki Mode |
| --- | --- |
| Multi-Agent Orchestration Guidance | Three coordination mechanisms, architectural patterns |
| Bedrock Multi-Agent Collaboration | Supervisor mode, routing mode, 10-agent limit |
| Multi-Agent Collaboration Announcement | Intent classification, selective context sharing |
| AgentCore for SRE | Gateway, Memory, Identity, Observability components |

Key Pattern Adopted: Routing Mode Optimization. Simple tasks are dispatched directly to an agent (lower latency); complex tasks go through supervisor orchestration (full coordination).
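
The routing decision described above can be sketched in a few lines. This is an illustration only, not Loki Mode's or Bedrock's actual logic: the `Task` type, the `estimated_steps` complexity signal, and the `SIMPLE_STEP_LIMIT` threshold are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical threshold: tasks at or below this many steps count as "simple".
SIMPLE_STEP_LIMIT = 2


@dataclass
class Task:
    description: str
    estimated_steps: int  # assumed complexity signal, not from the source


def choose_mode(task: Task) -> str:
    """Return "direct" for simple tasks and "supervisor" for complex ones."""
    if task.estimated_steps <= SIMPLE_STEP_LIMIT:
        return "direct"      # skip the supervisor to save latency
    return "supervisor"      # full coordination across agents


if __name__ == "__main__":
    print(choose_mode(Task("fix a typo", 1)))
    print(choose_mode(Task("refactor the auth module", 6)))
```

Any cheap complexity estimate could stand in for the step count; the point is that the check runs before a supervisor is involved.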


Academic Research

Multi-Agent Systems

| Paper | Authors/Source | Contribution |
| --- | --- | --- |
| Multi-Agent Collaboration Mechanisms Survey | arXiv 2501.06322 | Collaboration structures, coopetition |
| CONSENSAGENT: Anti-Sycophancy Framework | ACL 2025 Findings | Blind review, devil's advocate |
| GoalAct: Hierarchical Execution | arXiv 2504.16563 | Global planning, skill decomposition |
| A-Mem: Agentic Memory System | arXiv 2502.12110 | Zettelkasten-style memory linking |
| Multi-Agent Reflexion (MAR) | arXiv 2512.20845 | Structured debate, persona-based critics |
| Iter-VF: Iterative Verification-First | arXiv 2511.21734 | Answer-only verification, Markovian retry |

Evaluation & Safety

| Paper | Authors/Source | Contribution |
| --- | --- | --- |
| Assessment Framework for Agentic AI | arXiv 2512.12791 | Four-pillar evaluation framework |
| Measurement Imbalance in Agentic AI | arXiv 2506.02064 | Multi-dimensional evaluation axes |
| Demo-to-Deployment Gap | Stanford/Harvard | Tool reliability vs tool selection |

Verification & Hallucination Reduction

| Paper | Authors/Source | Contribution |
| --- | --- | --- |
| Chain-of-Verification Reduces Hallucination in LLMs | Dhuliawala et al., Meta AI, 2023 | 4-step verification (Draft -> Plan -> Execute -> Verify), factored execution, significant hallucination reduction (23% F1 improvement, ~77% reduction in hallucinated entities) |

Memory Systems

| Paper | Authors/Source | Contribution |
| --- | --- | --- |
| MemEvolve: Meta-Evolution of Agent Memory Systems | Zhang et al., OPPO AI Agent Team, 2025 | Modular design (Encode/Store/Retrieve/Manage), task-aware strategy selection, 17.06% improvement via meta-evolution |
| A-MEM: Agentic Memory for LLM Agents | Xu et al., NeurIPS 2025 | Zettelkasten-style atomic notes with keywords, tags, and bidirectional links; ChromaDB indexing |
| MemGPT: Towards LLMs as Operating Systems | Packer et al., 2023 | OS-inspired hierarchical memory (Core/Recall/Archival), self-editing memory via tool use, paging policies |
| Zep: Temporal Knowledge Graph Architecture | Zep AI, 2025 | Bi-temporal model (event time + ingestion time), knowledge invalidation, 94.8% DMR accuracy |
| SimpleMem: Efficient Lifelong Memory | aiming-lab, 2026 | Semantic lossless compression, online semantic synthesis, 30x token reduction, 26.4% F1 improvement |
| CAM: Constructivist Agentic Memory | Rui et al., NeurIPS 2025 | Piaget-inspired hierarchical schemata, overlapping clustering, prune-and-grow retrieval |
| SAGE: Self-evolving Agents with Reflective Memory | 2024 | Ebbinghaus forgetting curve, usage-based decay, three-agent collaboration for memory refinement |
| Contextual Retrieval | Anthropic, 2024 | Contextual BM25 + embeddings + reranking, 67% retrieval failure reduction |
| Memory in the Age of AI Agents (Survey) | Liu et al., 2025 | Forms-Functions-Dynamics taxonomy, comprehensive memory architecture survey |

Industry Resources

Tools & Frameworks

| Resource | Contribution |
| --- | --- |
| Cursor - Scaling Agents | Hierarchical planner-worker model, optimistic concurrency, recursive sub-planners, judge agents, scale-tested patterns (1M+ LoC projects) |
| NVIDIA ToolOrchestra | Efficiency metrics, three-reward signal framework, dynamic agent selection |
| LerianStudio/ring | Subagent-driven-development pattern |
| Awesome Agentic Patterns | 105+ production patterns catalog |

Best Practices Guides

| Resource | Contribution |
| --- | --- |
| Maxim AI: Production Multi-Agent Systems | Correlation IDs, failure handling |
| UiPath: Agent Builder Best Practices | Single-responsibility agents |
| GitHub: Speed Without Control | Static analysis + AI review, guardrails |

Hacker News Community

Battle-tested insights from practitioners deploying agents in production.

Discussions

| Thread | Key Insight |
| --- | --- |
| What Actually Works in Production for Autonomous Agents | "Zero companies without human in the loop" |
| Coding with LLMs in Summer 2025 | Context curation beats automatic RAG |
| Superpowers: How I'm Using Coding Agents | Sub-agents for context isolation (Simon Willison) |
| Claude Code Experience After Two Weeks | Fresh contexts yield better results |
| AI Agent Benchmarks Are Broken | LLM-as-judge has shared blind spots |
| How to Orchestrate Multi-Agent Workflows | Event-driven, decoupled coordination |
| Context Engineering vs Prompt Engineering | Manual context selection principles |

Show HN Projects

| Project | Contribution |
| --- | --- |
| Self-Evolving Agents Repository | Self-improvement patterns |
| Package Manager for Agent Skills | Skills architecture |
| Wispbit - AI Code Review Agent | Code review patterns |
| Agtrace - Monitoring for AI Coding Agents | Agent monitoring patterns |

Individual Contributors

Special thanks to thought leaders whose patterns and insights shaped Loki Mode:

| Contributor | Contribution |
| --- | --- |
| Boris Cherny (Creator of Claude Code) | Self-verification loop (2-3x quality improvement), extended thinking mode, "Less prompting, more systems" philosophy |
| Ivan Steshov | Centralized constitution, agent lineage tracking, structured artifacts as contracts |
| Addy Osmani | Git checkpoint system, specification-first approach, visual aids (Mermaid diagrams) |
| Simon Willison | Sub-agents for context isolation, skills system, context curation patterns |

Production Patterns Summary

Key patterns incorporated from practitioner experience:

| Pattern | Source | Implementation |
| --- | --- | --- |
| Human-in-the-Loop (HITL) | HN Production Discussions | Confidence-based escalation thresholds |
| Narrow Scope (3-5 steps) | Multiple Practitioners | Task scope constraints |
| Deterministic Validation | Production Teams | Rule-based outer loops (not LLM-judged) |
| Context Curation | Simon Willison | Manual selection, focused context |
| Blind Review + Devil's Advocate | CONSENSAGENT | Anti-sycophancy protocol |
| Hierarchical Reasoning | DeepMind Gemini | Orchestrator + specialized executors |
| Constitutional Self-Critique | Anthropic | Principles-based revision |
| Debate Verification | DeepMind | Critical change verification |
| One Feature at a Time | Anthropic Harness | Single feature per iteration, full verification |
| E2E Browser Testing | Anthropic Harness | Playwright MCP for visual verification |
| Chain-of-Verification | arXiv 2309.11495 | CoVe protocol in quality-gates.md |
| Factored Verification | arXiv 2309.11495 | Independent verification execution |
| Modular Memory Design | arXiv 2512.18746 | Encode/Store/Retrieve/Manage mapping in memory-system.md |
| Task-Aware Memory Strategy | arXiv 2512.18746 | Retrieval weight adjustment by task type |
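
The confidence-based escalation thresholds listed for Human-in-the-Loop can be pictured roughly as follows. This is a minimal sketch rather than Loki Mode's implementation: the threshold values, the function name `escalation_action`, and the action labels are all assumptions made for illustration.

```python
# Hypothetical thresholds; a real deployment would tune these per task type.
AUTO_PROCEED_THRESHOLD = 0.90
REVIEW_THRESHOLD = 0.60


def escalation_action(confidence: float) -> str:
    """Map an agent's self-reported confidence (0.0 to 1.0) to an action."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence >= AUTO_PROCEED_THRESHOLD:
        return "proceed"            # agent continues on its own
    if confidence >= REVIEW_THRESHOLD:
        return "request_review"     # continue, but flag for human review
    return "halt_for_human"         # stop until a human decides


if __name__ == "__main__":
    for score in (0.95, 0.75, 0.30):
        print(score, escalation_action(score))
```

Keeping the mapping rule-based matches the Deterministic Validation row above: the outer decision is not delegated to an LLM judge.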

v3.2.0 Additions

Parallel Workflows

| Resource | Contribution |
| --- | --- |
| Claude Code Git Worktrees | Parallel Claude sessions, worktree isolation pattern |

Key Patterns Incorporated (v3.2.0)

| Pattern | Source | Implementation |
| --- | --- | --- |
| Git Worktree Isolation | Claude Code Docs | skills/parallel-workflows.md, run.sh --parallel |
| Parallel Testing Stream | Claude Code Docs | Testing worktree tracks main, continuous validation |
| Inter-Stream Signals | Custom | .loki/signals/ for feature/test/docs coordination |
| Auto-Merge Workflow | Custom | Completed features merge back automatically |

v3.0.0 Additions

Agent Interoperability

| Resource | Contribution |
| --- | --- |
| Google A2A Protocol | Agent Cards, capability discovery, JSON-RPC 2.0 |
| A2A Protocol v0.3 | gRPC support, security card signing, enterprise features |
| A2A Project GitHub | Open protocol specification, SDK implementations |

Agentic Patterns

| Resource | Contribution |
| --- | --- |
| Awesome Agentic Patterns | 105+ production patterns catalog, feedback loops, tool patterns |
| Agent Orchestration Critique | "Ralph Wiggum Mode" - simpler orchestration beats complex frameworks |

Key Patterns Incorporated

| Pattern | Source | Implementation |
| --- | --- | --- |
| Agent Cards | A2A Protocol | .loki/state/agents/ capability discovery |
| Structured Handoffs | A2A Protocol | JSON message format for agent-to-agent communication |
| Sub-Agent Spawning | awesome-agentic-patterns | Task tool with focused prompts |
| Dual LLM Pattern | awesome-agentic-patterns | Opus for planning, Haiku for execution |
| CI Feedback Loop | awesome-agentic-patterns | Test results injected into retry prompts |
| Minimal Orchestration | moridinamael | Simple continuation over complex frameworks |

Community Projects (Open Source Claude Code Skills)

The following open-source projects have pioneered patterns that influence or complement Loki Mode. Analyzed January 2026.

High-Impact Projects

| Project | Stars | Key Patterns | Contribution to Loki Mode |
| --- | --- | --- | --- |
| Superpowers (obra) | 35K+ | Two-Stage Review, TDD Iron Law, Rationalization Tables | ADOPTED: Two-stage review (spec compliance THEN code quality) |
| agents (wshobson) | 26K+ | 72 plugins, 108 agents, 129 skills, Four-Tier Model Strategy | Plugin marketplace architecture inspiration |
| claude-flow (ruvnet) | 12K+ | Swarm topologies (hierarchical/mesh/ring/star), Consensus algorithms (Raft, Byzantine, CRDT) | Terminal-based orchestration patterns |
| oh-my-claudecode (Yeachan-Heo) | N/A | 32 agents, 35 skills, Tiered architecture (LOW/MEDIUM/HIGH), Delegation-first | ADOPTED: Tiered agent escalation protocols |

Specialized Skills

| Project | Focus | Key Patterns | Contribution to Loki Mode |
| --- | --- | --- | --- |
| claude-mem (thedotmack) | Memory | Progressive Disclosure (3-layer), SQLite + FTS5, Timeline compression | ADOPTED: 3-layer memory (index -> timeline -> full) |
| planning-with-files (OthmanAdi) | Planning | Manus-style 3-file pattern, PreToolUse attention hooks | ADOPTED: File-based planning persistence |
| claude-scientific-skills (K-Dense-AI) | Scientific | 140 domain-specific skills, modular organization | Domain organization patterns |
| claude-code-guide (zebbern) | Shortcuts | QNEW/QCODE/QCHECK patterns, structured reports | Shortcut command inspiration |

Key Patterns Adopted from Community

| Pattern | Source | Implementation in Loki Mode |
| --- | --- | --- |
| Two-Stage Review | Superpowers | Spec compliance review BEFORE code quality review |
| Rationalization Tables | Superpowers | Explicit counters to common agent excuses/rationalizations |
| Progressive Disclosure Memory | claude-mem | 3-layer context: index -> timeline -> full details |
| Tiered Agent Escalation | oh-my-claudecode | LOW -> MEDIUM -> HIGH with explicit escalation triggers |
| File-Based Planning | planning-with-files | Persistent markdown files (task_plan.md, findings.md, progress.md) |
| PreToolUse Attention | planning-with-files | Re-read goals before actions to combat context drift |
| Fresh Subagent Per Task | Superpowers | Clean context for each major task, prevents cross-contamination |
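
To make the Tiered Agent Escalation row concrete, here is a small sketch of a LOW -> MEDIUM -> HIGH ladder with one explicit trigger. The trigger used (a fixed number of consecutive failures) and the names `Tier`, `MAX_FAILURES_PER_TIER`, and `next_tier` are assumptions for illustration; the source does not specify which triggers Loki Mode or oh-my-claudecode actually use.

```python
from enum import IntEnum


class Tier(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical trigger: escalate after this many consecutive failures.
MAX_FAILURES_PER_TIER = 2


def next_tier(current: Tier, consecutive_failures: int) -> Tier:
    """Move one tier up when the failure trigger fires; HIGH is the ceiling."""
    if consecutive_failures >= MAX_FAILURES_PER_TIER and current < Tier.HIGH:
        return Tier(current + 1)
    return current


if __name__ == "__main__":
    print(next_tier(Tier.LOW, 2).name)
    print(next_tier(Tier.MEDIUM, 1).name)
    print(next_tier(Tier.HIGH, 5).name)
```

Escalating one tier at a time keeps the cheaper tiers in use for as long as they succeed.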

Patterns Under Evaluation

| Pattern | Source | Status | Notes |
| --- | --- | --- | --- |
| Token Economics Tracking | claude-mem | Evaluating | discovery_tokens vs read_tokens for compression analysis |
| Delegation Enforcer Middleware | oh-my-claudecode | Evaluating | Auto-inject model parameters based on task tier |
| Swarm Topologies | claude-flow | Not adopted | Adds complexity beyond hierarchical orchestration |
| Consensus Algorithms | claude-flow | Not adopted | Byzantine/Raft overkill for single-user autonomous operation |
| Shortcut Commands | claude-code-guide | Evaluating | QNEW/QCODE/QCHECK for rapid task switching |

v5.9.0 Additions

Cross-Project Learning Memory System

The Cross-Project Learning feature (v5.9.0) incorporates research from the following sources:

| Resource | Contribution |
| --- | --- |
| A-MEM | Zettelkasten atomic note pattern - each learning is self-contained with keywords and tags |
| MemGPT | Tiered memory architecture (hot/warm/cold) for efficient retrieval |
| Zep | Temporal validity tracking (valid_from, valid_until, superseded_by) |
| SimpleMem | MD5 hash-based deduplication at write time |
| SAGE | Usage tracking with access counts and decay |
| Anthropic Contextual Retrieval | Contextual prefixes for improved retrieval |
| Agent Memory Paper List | Comprehensive survey of memory architectures |
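
The temporal validity fields named in the Zep row (valid_from, valid_until, superseded_by) suggest a record shape along these lines. This is a sketch under assumptions: only the three field names come from the text above, while the `Learning` class, its other fields, and the `is_valid` and `supersede` helpers are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Learning:
    id: str                                   # hypothetical identifier
    text: str                                 # hypothetical payload
    valid_from: datetime
    valid_until: Optional[datetime] = None    # None means "still valid"
    superseded_by: Optional[str] = None       # id of the replacing learning


def is_valid(learning: Learning, at: datetime) -> bool:
    """True if `at` falls inside the half-open window [valid_from, valid_until)."""
    if at < learning.valid_from:
        return False
    return learning.valid_until is None or at < learning.valid_until


def supersede(old: Learning, new: Learning, at: datetime) -> None:
    """Close the old learning's window and point it at its replacement."""
    old.valid_until = at
    old.superseded_by = new.id


if __name__ == "__main__":
    t0 = datetime(2026, 1, 1, tzinfo=timezone.utc)
    t1 = datetime(2026, 2, 1, tzinfo=timezone.utc)
    old = Learning("a", "use flag X", valid_from=t0)
    new = Learning("b", "flag X was removed", valid_from=t1)
    supersede(old, new, t1)
    print(is_valid(old, t0), is_valid(old, t1), old.superseded_by)
```

Closing the window instead of deleting the record keeps history queryable: the old learning still reads as valid for any timestamp before it was superseded.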

Key Patterns Incorporated (v5.9.0)

| Pattern | Source | Implementation |
| --- | --- | --- |
| JSONL Append-Only Storage | SimpleMem | ~/.loki/learnings/*.jsonl for efficient writes |
| MD5 Hash Deduplication | SimpleMem | Prevent duplicate entries at write time |
| Keyword/Tag Extraction | A-MEM | Auto-generated tags for filtering (planned v5.10) |
| Usage Tracking | SAGE | Access counts and timestamps (planned v5.10) |
| Temporal Validity | Zep | Track when learnings become outdated (planned v5.11) |
| Cross-Learning Links | A-MEM | Bidirectional knowledge graph (planned v6.0) |
| Memory Consolidation | MemGPT | Periodic deduplication and abstraction (planned v6.0) |
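
The first two rows (append-only JSONL storage plus MD5 deduplication at write time) combine naturally, and the sketch below shows one way they might fit together. It is not Loki Mode's actual code: the record layout (`hash` and `data` keys), the function names, and the linear scan for existing hashes are assumptions made to keep the example short.

```python
import hashlib
import json
from pathlib import Path


def entry_hash(entry: dict) -> str:
    """MD5 of a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def append_learning(path: Path, entry: dict) -> bool:
    """Append `entry` as one JSONL line unless its hash is already stored.

    Returns True if a line was written, False if it was a duplicate.
    """
    digest = entry_hash(entry)
    if path.exists():
        with path.open("r", encoding="utf-8") as fh:
            for line in fh:
                if line.strip() and json.loads(line).get("hash") == digest:
                    return False  # duplicate: skip the write
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"hash": digest, "data": entry}) + "\n")
    return True


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp) / "learnings" / "patterns.jsonl"
        print(append_learning(store, {"tip": "run tests first"}))
        print(append_learning(store, {"tip": "run tests first"}))
```

MD5 serves here purely as a content fingerprint, not for security. A larger store would likely keep the seen hashes in memory or in an index instead of rescanning the file on every write.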

Implementation Roadmap

Based on research synthesis, the following improvements are planned:

Phase 1 (v5.10.x): Deduplication improvements, usage tracking, keyword extraction

Phase 2 (v5.11.x): BM25 search, contextual prefixes, temporal validity

Phase 3 (v6.0.x): Zettelkasten-style links, memory tiering

Phase 4 (v7.0.x): Hierarchical abstraction, consolidation pipeline


License

This acknowledgements file documents the research and resources that influenced Loki Mode's design. All referenced works retain their original licenses and copyrights.

Loki Mode itself is released under the MIT License.


Last updated: v5.9.0