
vault-graphrag

22% better retrieval than any single search strategy — a hybrid GraphRAG server that makes your second brain actually searchable.

If you've built a knowledge base in Obsidian — notes, projects, research, meeting logs — you've probably hit the same wall everyone hits: you know the answer is in there somewhere, but search can't find it. You spend more time looking for notes than using them. The system that was supposed to make you smarter starts feeling like a cluttered drawer.

vault-graphrag fixes this by running four search strategies in parallel and fusing the results. It's exposed as a single MCP tool — one call, best answer, regardless of which strategy found it.

Why Second Brains Break Down

The promise of a "second brain" is that you capture knowledge once and retrieve it when you need it. The reality is that retrieval is where these systems fail. You save hundreds of notes, build links between them, and then can't find the one you need because:

  • You search for "motor settings" but the note is titled "VFD Configuration" — keyword search misses it
  • You search semantically for "project planning" and get conceptually similar notes, but not the specific project note that's two links away from a related note
  • You browse backlinks manually, but there's no way to rank which linked note is actually relevant to your question

Each search strategy works for some queries and fails for others. The problem isn't that your notes are disorganized — it's that no single search method can handle the variety of ways you need to find things.

This problem gets worse with AI agents. An agent searching your vault on your behalf has to guess which strategy to use, make multiple tool calls, and piece together results from different tools. Pick wrong and the answer doesn't come back at all.

How vault-graphrag Solves This

The vault

[Image: Obsidian vault graph view]

Obsidian stores notes as plain Markdown files locally. What makes it powerful for knowledge management is linking — any note can reference another with [[WikiLinks]], and those links are bidirectional. Over time, your notes form a knowledge graph: a web of connections that you authored, where the links themselves carry meaning. A link from a project note to [[stepper motors]] is a claim: "this concept is relevant here."

This is the principle behind the Zettelkasten method — the idea that knowledge lives in the relationships between notes, not just in the notes themselves. vault-graphrag is built to exploit that structure.

The search

vault_search runs four retrieval strategies in parallel, weights them based on what kind of answer the query needs, and returns one fused result set. The caller declares an intent — what kind of search this is — and the server handles the routing:

Intent            When to use it                                              Strategies emphasized
factual_lookup    Looking for a specific fact, name, or setting               Keyword search + memory recall
context_load      Loading everything related to a project or topic            WikiLink graph traversal
conceptual        Exploring ideas or finding thematically related notes       Semantic similarity
backlink          Finding what links to a specific note                       Reverse WikiLink scan
serendipity       Open-ended exploration, discovering unexpected connections  Semantic + graph equally

If the caller doesn't specify an intent, a local LLM classifies the query automatically.
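Conceptually, each intent maps to a weight vector over the four channels. The dictionary below is a sketch with made-up weights, not the server's actual routing table:

# Hypothetical intent -> channel weight vectors (values are illustrative only)
INTENT_WEIGHTS = {
    "factual_lookup": {"bm25": 0.45, "semantic": 0.10, "graph": 0.10, "hindsight": 0.35},
    "context_load":   {"bm25": 0.10, "semantic": 0.20, "graph": 0.60, "hindsight": 0.10},
    "conceptual":     {"bm25": 0.10, "semantic": 0.70, "graph": 0.10, "hindsight": 0.10},
    "backlink":       {"bm25": 0.05, "semantic": 0.05, "graph": 0.85, "hindsight": 0.05},
    "serendipity":    {"bm25": 0.10, "semantic": 0.40, "graph": 0.40, "hindsight": 0.10},
}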

Long-term memory (optional)

Hindsight is a long-term memory service for AI agents. It ingests conversations over time, extracts structured facts (decisions made, preferences stated, events that happened), and builds a persistent knowledge graph from them. vault-graphrag can optionally query Hindsight as a fourth retrieval channel, so a single search returns both your vault notes and facts the agent remembers from past conversations — even if those facts were never written down as notes.

Does it actually work better?

Yes. The server was evaluated against 15 gold-standard queries across all 5 intent types on a ~1,500-note vault. Each query has known correct notes; the eval measures whether vault_search finds them and how high they rank.

Note on running evals yourself: The gold queries in eval/gold.json are tied to the specific vault structure used during development. Before running eval/run_eval.py against your own vault, replace the contents of eval/gold.json with queries and expected note paths that match your vault.

Overall retrieval quality

[Image: retrieval quality results]

  • Hit@5 = 93.3% — 14 out of 15 queries had the correct note in the top 5 results
  • Hit@10 = 100% — every query found the correct note somewhere in the top 10
  • MRR = 0.77 — "Mean Reciprocal Rank," a standard information retrieval metric. An MRR of 0.77 means the correct answer typically appears at rank 1 or 2. (MRR = 1.0 would mean every query returns the correct note first.)

The single miss at Hit@5 was a serendipity query — intentionally open-ended, where the expected notes were only weakly connected to the query terms.
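For intuition about the MRR figure: it averages the reciprocal of the rank at which the first correct note appears, one term per query. A toy computation (these ranks are made up for illustration, not the eval's actual per-query ranks):

# Toy MRR: rank of the first correct note for each query (illustrative numbers)
ranks = [1, 1, 2, 1, 3, 1, 2, 1, 1, 4, 1, 2, 1, 1, 9]
mrr = sum(1.0 / r for r in ranks) / len(ranks)  # mean of reciprocal ranks
print(round(mrr, 2))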

Why not just use one search strategy?

[Image: fusion vs. single-channel MRR]

If you committed to a single strategy for all queries, the best you could do is BM25 at MRR 0.63. Semantic search scores 0.56. Graph traversal scores 0.27. Fused retrieval scores 0.77 — 22% better than the best single strategy, because different query types need different strategies, and fusion handles the routing automatically.

Architecture

         +----------------------------------+
         |          vault_search()          |
         |  intent routing -> weight vector |
         +-----------------+----------------+
                           |
+--------------------------+--------------------------+
|       asyncio.gather (all channels parallel)       |
+----+-------------+-------------+-------------+-----+
     |             |             |             |
+----v-----+  +----v-----+  +----v-----+  +----v-----+
|   BM25   |  | Semantic |  |  Graph   |  | Hindsight|
|  Okapi   |  |  cosine  |  |   BFS    |  |  recall  |
+----+-----+  +----+-----+  +----+-----+  +----+-----+
     |             |             |             |
     +-------------+-------+-----+-------------+
                           |
                  +--------v--------+
                  |   RRF Fusion    |
                  |   weighted by   |
                  |  intent vector  |
                  +--------+--------+
                           |
                  +--------v--------+
                  |  Deduplicate,   |
                  |  normalize,     |
                  |  threshold,     |
                  |  annotate       |
                  +-----------------+

Channels

BM25 — Indexes all .md files using BM25Okapi with mtime-based cache invalidation. Handles keyword and exact-term queries. Returns normalized scores with excerpt extraction.
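A minimal sketch of this channel using rank-bm25 (naive whitespace tokenization; the real channel adds mtime-based cache invalidation and excerpt extraction):

from rank_bm25 import BM25Okapi

notes = {  # stand-in for the vault's .md files
    "projects/cnc.md": "VFD configuration for the spindle motor",
    "reference/motors.md": "stepper motor wiring and current settings",
}
bm25 = BM25Okapi([text.lower().split() for text in notes.values()])

raw = bm25.get_scores("motor settings".split())
top = raw.max() or 1.0                                   # guard against an all-zero query
scores = {path: s / top for path, s in zip(notes, raw)}  # top hit normalized to 1.0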

Semantic — Reads the pre-built embedding index from the Smart Connections Obsidian plugin. Embeds queries in-process via sentence-transformers (TaylorAI/bge-micro-v2). No separate vector database — piggybacks on the plugin's existing index.
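A sketch of the query side with sentence-transformers and NumPy; here the note embeddings are generated inline as a stand-in for the matrix that would be read from the Smart Connections index:

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("TaylorAI/bge-micro-v2")

# Stand-in for the (n_notes, dim) matrix read from the plugin's index
note_vecs = np.asarray(model.encode(["VFD configuration", "meeting log 2024-03"]))

q = model.encode("motor settings")
sims = note_vecs @ q / (np.linalg.norm(note_vecs, axis=1) * np.linalg.norm(q))
ranked = sims.argsort()[::-1]  # note indices, most similar first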

WikiLink Graph — Parses [[WikiLinks]] across the vault and builds an adjacency graph. Forward BFS for context_load (relevance decays by hop depth), reverse scan for backlink. This is the channel that exploits the structure unique to linked-note systems.
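A sketch of the forward traversal; the adjacency map and the per-hop decay factor of 0.5 are illustrative assumptions:

from collections import deque

def bfs_scores(adj: dict[str, set[str]], root: str, hop_depth: int = 2,
               decay: float = 0.5) -> dict[str, float]:
    """Score notes reachable from root; relevance decays each hop (factor assumed)."""
    scores, frontier = {root: 1.0}, deque([(root, 0)])
    while frontier:
        note, depth = frontier.popleft()
        if depth == hop_depth:
            continue
        for linked in adj.get(note, set()):
            if linked not in scores:
                scores[linked] = decay ** (depth + 1)
                frontier.append((linked, depth + 1))
    return scores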

Hindsight — Queries a Hindsight memory service for facts retained from prior AI conversations. Optional — disabled when HINDSIGHT_URL is unset.

All channels degrade gracefully. If a dependency is missing or a service is unreachable, that channel returns an empty list instead of an error. The server works with whatever is available.
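The pattern can be as simple as wrapping each channel's coroutine, sketched here rather than quoted from the source:

async def safe_channel(coro) -> list:
    """Run one retrieval channel; any failure yields no results, never an error."""
    try:
        return await coro
    except Exception:
        return []  # missing dependency or unreachable service -> empty channel

# results = await asyncio.gather(*(safe_channel(c) for c in channel_coros))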

Fusion

Results from all channels are merged via Reciprocal Rank Fusion (k=60), weighted by the intent's channel vector. Scores are normalized so the top result is always 1.0, making the threshold parameter meaningful regardless of how many channels contributed.
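A sketch of weighted RRF as described; the function shape is illustrative, but k=60 and the max-normalization follow the text above:

def rrf_fuse(rankings: dict[str, list[str]], weights: dict[str, float],
             k: int = 60) -> dict[str, float]:
    """rankings: channel name -> note paths ordered best-first."""
    fused: dict[str, float] = {}
    for channel, ranked in rankings.items():
        w = weights.get(channel, 0.0)
        for rank, path in enumerate(ranked, start=1):
            fused[path] = fused.get(path, 0.0) + w / (k + rank)
    top = max(fused.values(), default=0.0) or 1.0
    return {path: score / top for path, score in fused.items()}  # best result = 1.0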

Every result includes a match_reason explaining why it was returned — enforced at the schema level.

Ollama dependency

Intent classification uses a local Ollama model (gemma2:2b by default, ~800 ms per classification). This is only needed when the caller doesn't pass an explicit intent. Query embedding runs in-process and does not require Ollama.

If Ollama is unreachable, intent defaults to conceptual — semantic search still works, you just lose automatic routing.
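A sketch of what the classification call might look like against Ollama's /api/generate endpoint, including the documented fallback; the prompt wording is invented for illustration:

import httpx

INTENTS = {"factual_lookup", "context_load", "conceptual", "backlink", "serendipity"}

async def classify_intent(query: str, base_url: str = "http://localhost:11434") -> str:
    prompt = f"Classify this search query as one of {sorted(INTENTS)}: {query}"
    try:
        async with httpx.AsyncClient(timeout=5.0) as client:
            r = await client.post(f"{base_url}/api/generate", json={
                "model": "gemma2:2b", "prompt": prompt, "stream": False,
            })
            answer = r.json()["response"].strip()
            return answer if answer in INTENTS else "conceptual"
    except Exception:
        return "conceptual"  # Ollama unreachable or malformed reply -> fallback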

Tool Schema

vault_search(
    query: str,              # Natural language query (required)
    intent: str | None,      # One of the 5 intents; auto-classified if omitted
    root_note: str | None,   # Vault-relative path; anchor for context_load
    max_results: int = 10,
    hop_depth: int = 2,      # WikiLink traversal depth
    threshold: float = 0.6,  # Minimum relevance score (0-1)
)

Each result includes: path, title, channel, relevance, match_reason, excerpt, depth, connected_via.
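Given that match_reason is enforced at the Pydantic level, the result model plausibly looks something like the sketch below; which fields are optional is an assumption:

from pydantic import BaseModel

class SearchResult(BaseModel):
    path: str                         # vault-relative note path
    title: str
    channel: str                      # which strategy surfaced it
    relevance: float                  # normalized, top result = 1.0
    match_reason: str                 # mandatory: why this note was returned
    excerpt: str | None = None
    depth: int | None = None          # hops from root_note, for graph results
    connected_via: str | None = None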

Setup

Local Development

pip install -e ".[dev]"
cp .env.example .env
# Edit .env — set VAULT_PATH at minimum
python -m vault_graphrag.server

Docker

docker build -t vault-graphrag .
docker run -d \
  -p 8765:8765 \
  -v /path/to/your/vault:/vault:ro \
  -e VAULT_PATH=/vault \
  vault-graphrag
curl http://localhost:8765/health

The vault is mounted read-only. When deployed alongside Hindsight, both services should share a Docker network. Set HINDSIGHT_URL to the container-internal address (e.g. http://hindsight:8000). Hindsight is optional — omit HINDSIGHT_URL to disable the channel.

MCP Client Configuration

SSE transport (Docker or remote):

{
  "vault-graphrag": {
    "type": "sse",
    "url": "http://localhost:8765/sse"
  }
}

Stdio transport (local):

{
  "vault-graphrag": {
    "type": "stdio",
    "command": "python",
    "args": ["-m", "vault_graphrag.server"]
  }
}

Configuration

Variable             Required  Default                 Description
VAULT_PATH           Yes       --                      Absolute path to the Obsidian vault
OLLAMA_URL           No        http://localhost:11434  Ollama base URL (intent classification only)
OLLAMA_INTENT_MODEL  No        gemma2:2b               Model for intent classification
HINDSIGHT_URL        No        (disabled)              Hindsight REST API base URL
MCP_TRANSPORT        No        sse                     sse or stdio
MCP_HOST             No        0.0.0.0                 Bind address
MCP_PORT             No        8765                    Bind port

Health Check

GET /health
{
  "status": "ok",
  "channels": {
    "bm25": true,
    "semantic": true,
    "graph": true,
    "hindsight": true
  },
  "vault_path": "/vault",
  "note_count": 1024
}

"ok" = all channels available. "degraded" = functional but partial.

vault_path is included in the response body to aid debugging of misconfigured deployments (e.g. wrong mount path in Docker). It is omitted if VAULT_PATH is unset.
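The status value follows mechanically from the channel map; as a one-line sketch:

# channels is the map shown in the payload above
status = "ok" if all(channels.values()) else "degraded"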

Design Decisions

  • No re-indexing. Reads the Smart Connections plugin's existing embedding index instead of maintaining a separate vector store.
  • Intent as the API surface. Internal routing can change without breaking callers.
  • Hindsight as a retrieval channel. Vault notes and conversation memories come back in one fused response.
  • match_reason is mandatory. Enforced at the Pydantic model level. No result is returned without an explanation.
  • serendipity is a first-class intent. Discovery gets its own weight vector optimized for latent connections at the semantic + graph intersection.
  • Graceful degradation. Any channel failure returns empty, never errors.

Research Basis

  • GraphRAG (Microsoft, 2024) — hybrid vector + graph retrieval with Reciprocal Rank Fusion
  • MIND-RAG — intent-aware routing dispatching to specialized retrieval agents
  • Hybrid RAG — dense semantic + sparse lexical (BM25) fusion
  • Luhmann (1981), "Communicating with Slip Boxes" — the zettelkasten as communication partner with emergent properties from link density

Stack

  • Python 3.11+ / FastMCP / Pydantic
  • rank-bm25 (lexical search)
  • sentence-transformers (query embedding, in-process)
  • NumPy (cosine similarity)
  • httpx (async HTTP for Hindsight + Ollama)
  • Ollama (intent classification only)

License

MIT
