Inherits from ../CLAUDE.md. Read that first.
Decompose is the deterministic foundation of the echology system. It classifies any text into structured semantic units — authority, risk, attention, entities — without an LLM. No probability. No hallucination. No cost.
This is the most important project in echology. Every other system depends on it. Changes here propagate everywhere.
Five modules, one pipeline:
text → chunker → classifier → entities → irreducibility → DecomposeResult
| Module | What It Does |
|---|---|
| `chunker.py` | Semantic chunking, Markdown-aware, sentence-boundary splitting |
| `classifier.py` | Regex-based authority + risk + content type classification. Attention scoring. |
| `entities.py` | Regex entity extraction: standards, dates, financial values, legal references |
| `irreducibility.py` | Detects content that must be preserved verbatim (specs, limits, formulas) |
| `core.py` | Orchestration. `decompose_text()` and `filter_for_llm()` |
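
The module split above can be pictured with a toy, standard-library-only sketch. Nothing here is the library's code: `ToyUnit`, `toy_decompose`, and every pattern are hypothetical stand-ins chosen for illustration, and the real patterns in `classifier.py`, `entities.py`, and `irreducibility.py` are assumed to be far more thorough.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns, for illustration only. First match wins.
AUTHORITY = [
    ("mandatory", re.compile(r"\b(shall|must)\b", re.IGNORECASE)),
    ("advisory", re.compile(r"\b(should|recommended)\b", re.IGNORECASE)),
    ("permissive", re.compile(r"\bmay\b", re.IGNORECASE)),
]
STANDARD = re.compile(r"\b(?:ASTM|ISO|ACI)\s?[A-Z]?\d+(?:-\d+)?\b")
MEASUREMENT = re.compile(r"\b\d+(?:\.\d+)?\s?(?:psi|mm|kg|%)")


@dataclass
class ToyUnit:
    """Hypothetical result record; not the library's Unit type."""
    text: str
    authority: str
    entities: list
    irreducible: bool


def chunk(text):
    # Split after sentence-ending punctuation that is followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def classify(sentence):
    # Return the label of the first authority pattern that matches.
    for label, pattern in AUTHORITY:
        if pattern.search(sentence):
            return label
    return "informational"


def toy_decompose(text):
    # chunker -> classifier -> entities -> irreducibility, one unit per sentence.
    return [
        ToyUnit(
            text=sentence,
            authority=classify(sentence),
            entities=STANDARD.findall(sentence),
            irreducible=bool(MEASUREMENT.search(sentence)),
        )
        for sentence in chunk(text)
    ]


if __name__ == "__main__":
    sample = (
        "The contractor shall use cement per ASTM C150-20. "
        "Concrete strength should reach 4000 psi. "
        "Delivery may vary."
    )
    for unit in toy_decompose(sample):
        print(unit)
```

Every stage is a pure function of its input, which is what makes the result repeatable: no sampling, no model call, no network.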
- Python library: `from decompose import decompose_text, filter_for_llm`
- CLI: `decompose --text "..." --pretty` or piped stdin
- MCP server: `decompose-mcp --serve` (exposes `decompose_text` and `decompose_url` tools)
- Zero runtime dependencies. The library itself imports nothing outside the standard library. `mcp` is only required for MCP server mode. Do not add dependencies.
- Deterministic. Same input always produces same output. No randomness. No LLM calls. No network calls (except `decompose_url`, which fetches the URL, then classifies deterministically).
- Published on PyPI as `decompose-mcp`. Changes must not break the public API (`decompose_text`, `filter_for_llm`, `DecomposeResult`, `Unit`).
- 63 tests. Run `pytest` before any commit to a core module. Do not reduce test coverage.
- Regex patterns are the core IP. When modifying classifier patterns, test against real documents from multiple domains (AEC, insurance, legal, general). A pattern that improves one domain but breaks another is rejected.
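
The determinism and cross-domain rules above can be checked mechanically. The sketch below is one possible shape for such a check, not the project's test suite: `MANDATORY`, `SAMPLES`, `failing_domains`, and `is_deterministic` are hypothetical names, and the one-line samples stand in for the real documents the rule calls for.

```python
import re

# Hypothetical pattern under review; stands in for a real classifier pattern.
MANDATORY = re.compile(r"\b(shall|must|is required to)\b", re.IGNORECASE)

# Hypothetical samples: domain -> (text, expected match).
# A real check would run against full documents from each domain.
SAMPLES = {
    "AEC": ("The contractor shall submit shop drawings.", True),
    "insurance": ("The insured must notify the carrier within 30 days.", True),
    "legal": ("The tenant is required to maintain the premises.", True),
    "general": ("The weather was pleasant on Tuesday.", False),
}


def is_mandatory(text):
    return bool(MANDATORY.search(text))


def failing_domains(samples):
    # Domains where the pattern disagrees with the expected label.
    return sorted(
        domain
        for domain, (text, expected) in samples.items()
        if is_mandatory(text) != expected
    )


def is_deterministic(func, text, runs=5):
    # Same input must give the same output on every call.
    first = func(text)
    return all(func(text) == first for _ in range(runs - 1))


if __name__ == "__main__":
    broken = failing_domains(SAMPLES)
    if broken:
        raise SystemExit(f"pattern rejected, fails in: {', '.join(broken)}")
    print("pattern holds across all sampled domains")
```

A pattern change is acceptable only when `failing_domains` stays empty for every domain, not just the one the change was written for.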
- `src/decompose/core.py`: entry point, `decompose_text()`, `filter_for_llm()`
- `src/decompose/classifier.py`: authority/risk/content patterns and scoring
- `src/decompose/entities.py`: entity extraction patterns
- `src/decompose/irreducibility.py`: verbatim preservation detection
- `src/decompose/chunker.py`: semantic text chunking
- `src/decompose/mcp_server.py`: MCP server implementation
- `src/decompose/cli.py`: CLI entry point
- `tests/`: 63 tests across 6 modules
cd ~/echology/decompose
source .venv/bin/activate
pytest # run tests
ruff check src/ tests/ # lint
ruff format src/ tests/ # format