
Commit 3c258c7

RecoDemo and claude committed
Add CPython benchmarks (1.1M lines), protect registry tokens
- Update benchmarks section with CPython results: 59,620 functions, 9,037 classes indexed in 55.9s / 197 MB
- find_symbol returns 67 chars against 41M source (99.9998% reduction)
- get_change_impact finds 154 direct + 492 transitive dependents in 0.45ms
- Add server.json for MCP registry publishing
- Add .mcpregistry_* to gitignore to protect auth tokens

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent a0732c5 commit 3c258c7

3 files changed

Lines changed: 48 additions & 16 deletions


.gitignore

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@
 .env
 .env.*
 !.env.example
+.mcpregistry_*
 *.pem
 *.key
 *.crt
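To illustrate what the added `.mcpregistry_*` pattern covers, here is a minimal sketch that approximates gitignore matching for bare file names with `fnmatch.fnmatchcase`. It is not Git's actual matcher (it ignores directories and path anchoring), and the token file name in the usage note is a hypothetical example, not taken from the commit.

```python
from fnmatch import fnmatchcase

# Patterns visible in the hunk above, in file order.
# A leading "!" re-includes a file that an earlier pattern ignored.
PATTERNS = [
    ".env",
    ".env.*",
    "!.env.example",
    ".mcpregistry_*",
    "*.pem",
    "*.key",
    "*.crt",
]


def is_ignored(filename: str) -> bool:
    """Approximate gitignore for bare file names: the last matching pattern wins."""
    ignored = False
    for pattern in PATTERNS:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        if fnmatchcase(filename, glob):
            ignored = not negated
    return ignored
```

Under these assumptions, `is_ignored(".mcpregistry_github_token")` is true, while `is_ignored(".env.example")` is false because the negated pattern comes after `.env.*`.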

README.md

Lines changed: 20 additions & 16 deletions
@@ -170,7 +170,7 @@ This ensures the AI reaches for surgical indexed queries first, which saves toke
 
 ## Benchmarks
 
-Tested across three real-world projects on an M-series MacBook Pro:
+Tested across four real-world projects on an M-series MacBook Pro, from a small project to CPython itself (1.1 million lines):
 
 ### Index Build Performance
 
@@ -179,30 +179,34 @@ Tested across three real-world projects on an M-series MacBook Pro:
 | RMLPlus | 36 | 7,762 | 237 | 55 | 0.9s | 2.4 MB |
 | FastAPI | 2,556 | 332,160 | 4,139 | 617 | 5.7s | 55 MB |
 | Django | 3,714 | 707,493 | 29,995 | 7,371 | 36.2s | 126 MB |
+| **CPython** | **2,464** | **1,115,334** | **59,620** | **9,037** | **55.9s** | **197 MB** |
 
 ### Query Response Size vs Total Source
 
-| Query | RMLPlus (292K source) | FastAPI (12.2M source) | Django (26.3M source) |
-|-------|---:|---:|---:|
-| `find_symbol` | 61 chars | 68 chars | 62 chars |
-| `get_dependencies` | 94 chars | 56 chars | 327 chars |
-| `get_change_impact` | 1,255 chars | 61 chars | 195,640 chars |
-| `get_function_source` | 3,313 chars | 4,612 chars | 682 chars |
+Querying CPython — 41 million characters of source code:
 
-`find_symbol` returns 61-68 characters regardless of whether the project is 7K lines or 707K lines. Response size scales with the answer, not the codebase.
+| Query | Response | Total Source | Reduction |
+|-------|-------:|------------:|----------:|
+| `find_symbol("TestCase")` | 67 chars | 41,077,561 chars | **99.9998%** |
+| `get_dependencies("compile")` | 115 chars | 41,077,561 chars | **99.9997%** |
+| `get_change_impact("TestCase")` | 16,812 chars | 41,077,561 chars | **99.96%** |
+| `get_function_source("compile")` | 4,531 chars | 41,077,561 chars | **99.99%** |
+| `get_function_source("run_unittest")` | 439 chars | 41,077,561 chars | **99.999%** |
 
-Django's `get_change_impact("AppConfig")` found 65 direct dependents and 5,078 transitive dependents — the kind of query that's impossible without a dependency graph. Use `max_direct` and `max_transitive` to cap output to your token budget.
+`find_symbol` returns 54-67 characters regardless of whether the project is 7K lines or 1.1M lines. Response size scales with the answer, not the codebase.
+
+`get_change_impact("TestCase")` on CPython found **154 direct dependents and 492 transitive dependents** in 0.45ms — the kind of query that's impossible without a dependency graph. Use `max_direct` and `max_transitive` to cap output to your token budget.
 
 ### Query Response Time
 
-All targeted queries return in sub-millisecond time, even on Django's 707K lines:
+All targeted queries return in sub-millisecond time, even on CPython's 1.1M lines:
 
-| Query | RMLPlus | FastAPI | Django |
-|-------|--------:|--------:|-------:|
-| `find_symbol` | 0.01ms | 0.01ms | 0.03ms |
-| `get_dependencies` | 0.00ms | 0.00ms | 0.00ms |
-| `get_change_impact` | 0.02ms | 0.00ms | 2.81ms |
-| `get_function_source` | 0.01ms | 0.02ms | 0.03ms |
+| Query | RMLPlus | FastAPI | Django | CPython |
+|-------|--------:|--------:|-------:|--------:|
+| `find_symbol` | 0.01ms | 0.01ms | 0.03ms | 0.08ms |
+| `get_dependencies` | 0.00ms | 0.00ms | 0.00ms | 0.01ms |
+| `get_change_impact` | 0.02ms | 0.00ms | 2.81ms | 0.45ms |
+| `get_function_source` | 0.01ms | 0.02ms | 0.03ms | 0.10ms |
 
 Run the benchmarks yourself: `python benchmarks/benchmark.py`
 
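The Reduction column in the new CPython table can be checked with simple arithmetic. The sketch below assumes reduction means `1 - response / total`, expressed as a percentage; the README does not state the formula, and the name `reduction_percent` is hypothetical. The character counts are copied from the added table rows.

```python
# Total CPython source size, in characters, as reported in the added table.
TOTAL_SOURCE_CHARS = 41_077_561

# Response sizes (in characters) for each query row of the added table.
RESPONSES = {
    'find_symbol("TestCase")': 67,
    'get_dependencies("compile")': 115,
    'get_change_impact("TestCase")': 16_812,
    'get_function_source("compile")': 4_531,
    'get_function_source("run_unittest")': 439,
}


def reduction_percent(response_chars: int, total_chars: int) -> float:
    """Share of the source that the response did NOT have to include, in percent."""
    return 100.0 * (1.0 - response_chars / total_chars)


if __name__ == "__main__":
    for query, chars in RESPONSES.items():
        print(f"{query}: {reduction_percent(chars, TOTAL_SOURCE_CHARS):.4f}%")
```

Under that assumption the computed values agree with the table once rounded to the precision shown in each row, which varies from two to four decimal places.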

server.json

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,27 @@
1+
{
2+
"$schema": "https://static.modelcontextprotocol.io/schemas/2025-12-11/server.schema.json",
3+
"name": "io.github.MikeRecognex/mcp-codebase-index",
4+
"description": "Structural codebase indexer with 17 query tools. 87% token reduction. Zero dependencies.",
5+
"repository": {
6+
"url": "https://github.com/MikeRecognex/mcp-codebase-index",
7+
"source": "github"
8+
},
9+
"version": "0.2.2",
10+
"packages": [
11+
{
12+
"registryType": "pypi",
13+
"identifier": "mcp-codebase-index",
14+
"version": "0.2.2",
15+
"transport": {
16+
"type": "stdio"
17+
},
18+
"environment_variables": [
19+
{
20+
"name": "PROJECT_ROOT",
21+
"description": "Root directory of the project to index",
22+
"required": true
23+
}
24+
]
25+
}
26+
]
27+
}
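Because the manifest repeats the version at the top level and inside each package entry, a small pre-publish check can catch drift between the two. The following is a minimal sketch and not the registry's own validation. The helper names `version_mismatches` and `required_env_vars` are hypothetical, and the field names are read exactly as they appear in this commit's `server.json`.

```python
import json

# The manifest added by this commit, embedded so the sketch is self-contained.
MANIFEST_TEXT = """
{
  "$schema": "https://static.modelcontextprotocol.io/schemas/2025-12-11/server.schema.json",
  "name": "io.github.MikeRecognex/mcp-codebase-index",
  "description": "Structural codebase indexer with 17 query tools. 87% token reduction. Zero dependencies.",
  "repository": {
    "url": "https://github.com/MikeRecognex/mcp-codebase-index",
    "source": "github"
  },
  "version": "0.2.2",
  "packages": [
    {
      "registryType": "pypi",
      "identifier": "mcp-codebase-index",
      "version": "0.2.2",
      "transport": {"type": "stdio"},
      "environment_variables": [
        {
          "name": "PROJECT_ROOT",
          "description": "Root directory of the project to index",
          "required": true
        }
      ]
    }
  ]
}
"""


def version_mismatches(manifest: dict) -> list:
    """Return identifiers of packages whose version differs from the top-level one."""
    top_level = manifest.get("version")
    return [
        package.get("identifier", "<unnamed>")
        for package in manifest.get("packages", [])
        if package.get("version") != top_level
    ]


def required_env_vars(manifest: dict) -> list:
    """Collect names of environment variables marked as required in any package."""
    return [
        variable["name"]
        for package in manifest.get("packages", [])
        for variable in package.get("environment_variables", [])
        if variable.get("required")
    ]


manifest = json.loads(MANIFEST_TEXT)
```

For the manifest in this commit, `version_mismatches(manifest)` is empty and `required_env_vars(manifest)` contains only `PROJECT_ROOT`.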
