22% better retrieval than any single search strategy — a hybrid GraphRAG server that makes your second brain actually searchable.
If you've built a knowledge base in Obsidian — notes, projects, research, meeting logs — you've probably hit the same wall everyone hits: you know the answer is in there somewhere, but search can't find it. You spend more time looking for notes than using them. The system that was supposed to make you smarter starts feeling like a cluttered drawer.
vault-graphrag fixes this by running four search strategies in parallel and fusing the results. It's exposed as a single MCP tool — one call, best answer, regardless of which strategy found it.
The promise of a "second brain" is that you capture knowledge once and retrieve it when you need it. The reality is that retrieval is where these systems fail. You save hundreds of notes, build links between them, and then can't find the one you need because:
- You search for "motor settings" but the note is titled "VFD Configuration" — keyword search misses it
- You search semantically for "project planning" and get conceptually similar notes, but not the specific project note that's two links away from a related note
- You browse backlinks manually, but there's no way to rank which linked note is actually relevant to your question
Each search strategy works for some queries and fails for others. The problem isn't that your notes are disorganized — it's that no single search method can handle the variety of ways you need to find things.
This problem gets worse with AI agents. An agent searching your vault on your behalf has to guess which strategy to use, make multiple tool calls, and piece together results from different tools. Pick wrong and the answer doesn't come back at all.
Obsidian stores notes as plain Markdown files locally. What makes it powerful for knowledge management is linking — any note can reference another with [[WikiLinks]], and those links are bidirectional. Over time, your notes form a knowledge graph: a web of connections that you authored, where the links themselves carry meaning. A link from a project note to [[stepper motors]] is a claim: "this concept is relevant here."
This is the principle behind the Zettelkasten method — the idea that knowledge lives in the relationships between notes, not just in the notes themselves. vault-graphrag is built to exploit that structure.
vault_search runs four retrieval strategies in parallel, weights them based on what kind of answer the query needs, and returns one fused result set. The caller declares an intent — what kind of search this is — and the server handles the routing:
| Intent | When to use it | Strategies emphasized |
|---|---|---|
| `factual_lookup` | Looking for a specific fact, name, or setting | Keyword search + memory recall |
| `context_load` | Loading everything related to a project or topic | WikiLink graph traversal |
| `conceptual` | Exploring ideas or finding thematically related notes | Semantic similarity |
| `backlink` | Finding what links to a specific note | Reverse WikiLink scan |
| `serendipity` | Open-ended exploration, discovering unexpected connections | Semantic + graph equally |
If the caller doesn't specify an intent, a local LLM classifies the query automatically.
Hindsight is a long-term memory service for AI agents. It ingests conversations over time, extracts structured facts (decisions made, preferences stated, events that happened), and builds a persistent knowledge graph from them. vault-graphrag can optionally query Hindsight as a fourth retrieval channel, so a single search returns both your vault notes and facts the agent remembers from past conversations — even if those facts were never written down as notes.
Yes. Evaluated against 15 gold-standard queries across all 5 intent types on a ~1,500-note vault. Each query has known correct notes — the eval measures whether vault_search finds them and how high they rank.
Note on running evals yourself: the gold queries in `eval/gold.json` are tied to the specific vault structure used during development. Before running `eval/run_eval.py` against your own vault, replace the contents of `eval/gold.json` with queries and expected note paths that match your vault.
- Hit@5 = 93.3% — 14 out of 15 queries had the correct note in the top 5 results
- Hit@10 = 100% — every query found the correct note somewhere in the top 10
- MRR = 0.77 — "Mean Reciprocal Rank," a standard information retrieval metric. An MRR of 0.77 means the correct answer typically appears at rank 1 or 2. (MRR = 1.0 would mean every query returns the correct note first.)
The single miss at Hit@5 was a serendipity query — intentionally open-ended, where the expected notes were only weakly connected to the query terms.
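For readers unfamiliar with the metric, MRR is simple to compute: take the 1-based rank at which the first correct note appears for each query, average the reciprocals. A minimal sketch with made-up ranks (not the eval's actual per-query data):

```python
def mean_reciprocal_rank(ranks):
    """ranks[i] = 1-based rank of the first correct note for query i,
    or None if it never appeared (contributing a reciprocal rank of 0)."""
    reciprocal = [0.0 if r is None else 1.0 / r for r in ranks]
    return sum(reciprocal) / len(reciprocal)

# Illustrative: correct note at rank 1, rank 1, rank 2
mean_reciprocal_rank([1, 1, 2])  # (1 + 1 + 0.5) / 3 = 0.8333...
```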
If you committed to a single strategy for all queries, the best you could do is BM25 at MRR 0.63. Semantic search scores 0.56. Graph traversal scores 0.27. Fused retrieval scores 0.77 — 22% better than the best single strategy, because different query types need different strategies, and fusion handles the routing automatically.
```
        +----------------------------------+
        |          vault_search()          |
        |  intent routing -> weight vector |
        +----------------+-----------------+
                         |
    +--------------------+--------------------+
    | asyncio.gather (all channels parallel)  |
    +--+----------+-----------+-----------+---+
       |          |           |           |
  +----v---+ +----v----+ +----v----+ +---v------+
  |  BM25  | |Semantic | |  Graph  | |Hindsight |
  | Okapi  | | cosine  | |   BFS   | | recall   |
  +----+---+ +----+----+ +----+----+ +---+------+
       |          |           |           |
       +----------+-----+-----+-----------+
                        |
               +--------v--------+
               |   RRF Fusion    |
               |   weighted by   |
               |  intent vector  |
               +--------+--------+
                        |
               +--------v--------+
               |  Deduplicate,   |
               |   normalize,    |
               |   threshold,    |
               |   annotate      |
               +-----------------+
```
BM25 — Indexes all .md files using BM25Okapi with mtime-based cache invalidation. Handles keyword and exact-term queries. Returns normalized scores with excerpt extraction.
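The server uses the rank-bm25 package for this channel; the stdlib-only sketch below shows the same Okapi scoring formula plus the mtime-based invalidation idea, with all names (`TinyBM25`, `index_is_stale`) being illustrative:

```python
import math
import os
from collections import Counter

class TinyBM25:
    """Minimal Okapi BM25 over pre-tokenized docs. Illustrative only;
    the server delegates this to the rank-bm25 package."""
    def __init__(self, corpus, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.n = len(corpus)
        self.avgdl = sum(len(d) for d in corpus) / self.n
        self.df = Counter(t for d in corpus for t in set(d))  # document frequency

    def idf(self, term):
        return math.log((self.n - self.df[term] + 0.5) / (self.df[term] + 0.5) + 1)

    def score(self, query, doc):
        tf = Counter(doc)
        norm = self.k1 * (1 - self.b + self.b * len(doc) / self.avgdl)
        return sum(
            self.idf(t) * tf[t] * (self.k1 + 1) / (tf[t] + norm)
            for t in query if t in tf
        )

def index_is_stale(index_mtime: float, md_paths) -> bool:
    """mtime-based invalidation: rebuild if any .md changed after the index."""
    return any(os.path.getmtime(p) > index_mtime for p in md_paths)
```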
Semantic — Reads the pre-built embedding index from the Smart Connections Obsidian plugin. Embeds queries in-process via sentence-transformers (TaylorAI/bge-micro-v2). No separate vector database — piggybacks on the plugin's existing index.
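Once query and note embeddings exist, the channel reduces to cosine ranking. A pure-Python sketch, assuming a `{note_path: embedding}` mapping already loaded from disk (the real Smart Connections on-disk format differs and is parsed by the server):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def semantic_channel(query_vec, index, top_k=10):
    """Rank every indexed note by cosine similarity to the query embedding.
    `index` is an assumed {path: vector} shape, not the plugin's raw format."""
    scored = sorted(
        ((cosine(query_vec, vec), path) for path, vec in index.items()),
        reverse=True,
    )
    return [(path, score) for score, path in scored[:top_k]]
```

In production the server uses NumPy for this; the loop form above is just easier to read.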
WikiLink Graph — Parses [[WikiLinks]] across the vault and builds an adjacency graph. Forward BFS for context_load (relevance decays by hop depth), reverse scan for backlink. This is the channel that exploits the structure unique to linked-note systems.
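A sketch of the two building blocks, link extraction and depth-decayed BFS, under assumptions: the regex handles the common `[[Target]]`, `[[Target|alias]]`, and `[[Target#heading]]` forms, and the 0.5-per-hop decay factor is illustrative rather than the server's exact falloff:

```python
import re
from collections import deque

# Capture the target before any "|alias" or "#heading" suffix.
LINK_RE = re.compile(r"\[\[([^\]|#]+)")

def parse_links(text):
    return [m.strip() for m in LINK_RE.findall(text)]

def bfs_with_decay(graph, root, hop_depth=2, decay=0.5):
    """Forward BFS from a root note; relevance halves per hop (assumed rate).
    `graph` maps note -> list of forward-linked notes."""
    scores = {root: 1.0}
    frontier = deque([(root, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hop_depth:
            continue  # don't expand beyond the requested depth
        for neighbor in graph.get(node, []):
            if neighbor not in scores:
                scores[neighbor] = decay ** (depth + 1)
                frontier.append((neighbor, depth + 1))
    return scores
```

The `backlink` intent is the same structure run in reverse: scan the adjacency map for notes whose forward links contain the root.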
Hindsight — Queries a Hindsight memory service for facts retained from prior AI conversations. Optional — disabled when HINDSIGHT_URL is unset.
All channels degrade gracefully. If a dependency is missing or a service is unreachable, that channel returns an empty list instead of an error. The server works with whatever is available.
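The parallel-fire-and-degrade pattern is a thin wrapper around `asyncio.gather`. A self-contained sketch with stand-in channel coroutines (the real channels are the four described above):

```python
import asyncio

async def safe_channel(coro):
    """Run one retrieval channel; a failure yields [] rather than an error."""
    try:
        return await coro
    except Exception:
        return []

async def run_channels(channels):
    # All channels fire in parallel; a broken one contributes nothing.
    return await asyncio.gather(*(safe_channel(c) for c in channels))

# Stand-in channels: one succeeds, one simulates an unreachable service.
async def bm25():
    return [("note.md", 0.9)]

async def hindsight():
    raise ConnectionError("service unreachable")

results = asyncio.run(run_channels([bm25(), hindsight()]))
# results == [[("note.md", 0.9)], []]
```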
Results from all channels are merged via Reciprocal Rank Fusion (k=60), weighted by the intent's channel vector. Scores are normalized so the top result is always 1.0, making the threshold parameter meaningful regardless of how many channels contributed.
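Weighted RRF is compact enough to show in full. A sketch of the fusion step described above, with the function name and input shapes assumed for illustration:

```python
def weighted_rrf(rankings, weights, k=60):
    """rankings: {channel: [path, ...]} in best-first order.
    weights: {channel: weight} from the intent's channel vector.
    Fused score = sum over channels of weight / (k + rank), then
    normalized so the top result is exactly 1.0."""
    fused = {}
    for channel, ranked in rankings.items():
        w = weights.get(channel, 0.0)
        for rank, path in enumerate(ranked, start=1):
            fused[path] = fused.get(path, 0.0) + w / (k + rank)
    if fused:
        top = max(fused.values())
        fused = {path: score / top for path, score in fused.items()}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

Because only ranks matter, channels with incomparable raw scores (BM25 vs. cosine vs. graph depth) fuse cleanly, and the final normalization is what makes a fixed `threshold` meaningful.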
Every result includes a match_reason explaining why it was returned — enforced at the schema level.
Intent classification uses a local Ollama model (gemma2:2b by default, ~800ms). This is only needed when the caller doesn't pass an explicit intent. Query embedding runs in-process and does not require Ollama.
If Ollama is unreachable, intent defaults to conceptual — semantic search still works, you just lose automatic routing.
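The classification call with its fallback can be sketched as follows. The server uses httpx; this stdlib version keeps the same shape, and the prompt wording is illustrative (the `/api/generate` request and `response` field match Ollama's public REST API):

```python
import json
import urllib.error
import urllib.request

VALID_INTENTS = {"factual_lookup", "context_load", "conceptual",
                 "backlink", "serendipity"}

def classify_intent(query, ollama_url="http://localhost:11434",
                    model="gemma2:2b", timeout=5.0):
    """Ask a local Ollama model for an intent label; fall back to
    'conceptual' if Ollama is unreachable or replies with junk."""
    prompt = ("Classify this vault query as one of: "
              + ", ".join(sorted(VALID_INTENTS))
              + f". Reply with the label only.\n{query}")
    body = json.dumps({"model": model, "prompt": prompt,
                       "stream": False}).encode()
    req = urllib.request.Request(
        f"{ollama_url}/api/generate", data=body,
        headers={"Content-Type": "application/json"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            label = json.load(resp)["response"].strip()
            return label if label in VALID_INTENTS else "conceptual"
    except (urllib.error.URLError, OSError, KeyError, ValueError):
        return "conceptual"
```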
```python
vault_search(
    query: str,             # Natural language query (required)
    intent: str | None,     # One of the 5 intents; auto-classified if omitted
    root_note: str | None,  # Vault-relative path; anchor for context_load
    max_results: int = 10,
    hop_depth: int = 2,     # WikiLink traversal depth
    threshold: float = 0.6, # Minimum relevance score (0-1)
)
```
Each result includes: path, title, channel, relevance, match_reason, excerpt, depth, connected_via.
Local development:

```shell
pip install -e ".[dev]"
cp .env.example .env
# Edit .env — set VAULT_PATH at minimum
python -m vault_graphrag.server
```

Docker:

```shell
docker build -t vault-graphrag .
docker run -d \
  -p 8765:8765 \
  -v /path/to/your/vault:/vault:ro \
  -e VAULT_PATH=/vault \
  vault-graphrag

curl http://localhost:8765/health
```

The vault is mounted read-only. When deployed alongside Hindsight, both services should share a Docker network. Set HINDSIGHT_URL to the container-internal address (e.g. http://hindsight:8000). Hindsight is optional — omit HINDSIGHT_URL to disable the channel.
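A Compose file makes the shared-network setup concrete. This is a hypothetical sketch: the service names, network name, and `hindsight:latest` image are placeholders; only `VAULT_PATH` and `HINDSIGHT_URL` come from this README.

```yaml
# docker-compose.yml — illustrative, not shipped with the repo
services:
  vault-graphrag:
    build: .
    ports: ["8765:8765"]
    volumes:
      - /path/to/your/vault:/vault:ro   # vault stays read-only
    environment:
      VAULT_PATH: /vault
      HINDSIGHT_URL: http://hindsight:8000  # container-internal address
    networks: [brain]
  hindsight:
    image: hindsight:latest   # placeholder image name
    networks: [brain]
networks:
  brain: {}
```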
SSE transport (Docker or remote):

```json
{
  "vault-graphrag": {
    "type": "sse",
    "url": "http://localhost:8765/sse"
  }
}
```

Stdio transport (local):
```json
{
  "vault-graphrag": {
    "type": "stdio",
    "command": "python",
    "args": ["-m", "vault_graphrag.server"]
  }
}
```

| Variable | Required | Default | Description |
|---|---|---|---|
| `VAULT_PATH` | Yes | -- | Absolute path to the Obsidian vault |
| `OLLAMA_URL` | No | `http://localhost:11434` | Ollama base URL (intent classification only) |
| `OLLAMA_INTENT_MODEL` | No | `gemma2:2b` | Model for intent classification |
| `HINDSIGHT_URL` | No | (disabled) | Hindsight REST API base URL |
| `MCP_TRANSPORT` | No | `sse` | `sse` or `stdio` |
| `MCP_HOST` | No | `0.0.0.0` | Bind address |
| `MCP_PORT` | No | `8765` | Bind port |
`GET /health`

```json
{
  "status": "ok",
  "channels": {
    "bm25": true,
    "semantic": true,
    "graph": true,
    "hindsight": true
  },
  "vault_path": "/vault",
  "note_count": 1024
}
```

`"ok"` = all channels available. `"degraded"` = functional but partial.
vault_path is included in the response body to aid debugging of misconfigured deployments (e.g. wrong mount path in Docker). It is omitted if VAULT_PATH is unset.
- No re-indexing. Reads the Smart Connections plugin's existing embedding index instead of maintaining a separate vector store.
- Intent as the API surface. Internal routing can change without breaking callers.
- Hindsight as a retrieval channel. Vault notes and conversation memories come back in one fused response.
- `match_reason` is mandatory. Enforced at the Pydantic model level. No result is returned without an explanation.
- `serendipity` is a first-class intent. Discovery gets its own weight vector optimized for latent connections at the semantic + graph intersection.
- Graceful degradation. Any channel failure returns empty, never errors.
- GraphRAG (Microsoft, 2024) — hybrid vector + graph retrieval with Reciprocal Rank Fusion
- MIND-RAG — intent-aware routing dispatching to specialized retrieval agents
- Hybrid RAG — dense semantic + sparse lexical (BM25) fusion
- Luhmann (1981), "Communicating with Slip Boxes" — the zettelkasten as communication partner with emergent properties from link density
- Python 3.11+ / FastMCP / Pydantic
- rank-bm25 (lexical search)
- sentence-transformers (query embedding, in-process)
- NumPy (cosine similarity)
- httpx (async HTTP for Hindsight + Ollama)
- Ollama (intent classification only)
MIT


