A distributed, modular evolutionary framework where specialist agents collaboratively generate, train, evaluate, mutate, and refine candidate learning systems represented as genomes.
```bash
# 1. Create venv and install deps
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# 2. Run the evolution loop (mock training, math benchmark)
python3 main.py

# 3. Inspect the SQLite registry
python3 scripts/inspect_registry.py evo_swarm.db
```

```bash
# Batch-convert PDFs to text
python3 scripts/pdf_to_txt.py --input papers/ --output papers_txt/

# Or ingest directly (PDFs handled automatically)
python3 -m evo_swarm.offline.cli ingest papers/evo_swarm/
```
```
core/          # Interfaces, events, registry, scheduler
agents/        # Architect, Trainer, Evaluator, Curator, Critic
evolution/     # Generation manager, mutation, crossover
training/      # Pluggable training backends (mock, local_llm)
offline/       # Offline swarm with LLM roles, knowledge store, tools
benchmarks/    # Domain-specific evaluation (math, neuro, philosophy)
models/        # Pluggable model implementations
memory/        # Fast, episodic, knowledge memory
tracking/      # Experiments, metrics, lineage
infra/         # Local, distributed, edge compute
api/           # REST API (future)
dashboard/     # Web dashboard (future)
ai/local_llm/  # From-scratch GPT trainer (PyTorch)
scripts/       # Utility scripts (PDF converter, registry inspector)
```
The system uses an event-driven architecture where specialist agents communicate through typed events:
- Curator prepares datasets and kicks off generations
- Architect proposes candidate genome configurations
- Trainer trains candidates (mock or real backends)
- Evaluator benchmarks candidates and computes fitness
- Critic/Mutator analyzes failures and proposes mutations
- GenerationManager tracks lineage in SQLite
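The event flow above can be sketched as a minimal publish/subscribe bus keyed on event type. The event and handler names below are hypothetical illustrations of the pattern, not the project's real API:

```python
from dataclasses import dataclass
from collections import defaultdict
from typing import Callable

# Hypothetical typed events mirroring the roles above (not the real classes).
@dataclass
class CandidateProposed:
    genome_id: str
    config: dict

@dataclass
class CandidateTrained:
    genome_id: str
    loss: float

class EventBus:
    """Minimal typed event bus: handlers subscribe per event class."""
    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type: type, handler: Callable) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event) -> None:
        for handler in self._handlers[type(event)]:
            handler(event)

bus = EventBus()
trained = []
# A "Trainer" reacting to the Architect's proposal:
bus.subscribe(CandidateProposed,
              lambda e: bus.publish(CandidateTrained(e.genome_id, loss=1.23)))
bus.subscribe(CandidateTrained, trained.append)
bus.publish(CandidateProposed("g-001", {"layers": 4}))
```

Because agents only depend on event types, new specialists can be added without touching existing ones.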
| Env Var | Default | Description |
|---|---|---|
| `EVO_SWARM_TRAIN_BACKEND` | `mock` | Training backend (`mock`, `local_llm`) |
| `LOCAL_LLM_DATA_DIR` | `ai/local_llm/data/processed` | Data dir for `local_llm` backend |
| `LOCAL_LLM_MAX_STEPS` | `200` | Max training steps per candidate |
| `LOCAL_LLM_EVAL_EVERY` | `100` | Eval interval (steps) for `local_llm` backend |
| `LOCAL_LLM_SAVE_EVERY` | `LOCAL_LLM_MAX_STEPS` | Checkpoint interval (steps) for `local_llm` backend |
| `LOCAL_LLM_RUNS_DIR` | `ai/local_llm/runs/evo_swarm` | Output runs dir for `local_llm` backend |
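As a sketch of how these variables resolve (the loader function here is illustrative, not the project's actual config code), note that `LOCAL_LLM_SAVE_EVERY` falls back to whatever `LOCAL_LLM_MAX_STEPS` resolves to:

```python
import os

def load_local_llm_config(env=None) -> dict:
    """Illustrative loader using the names/defaults from the table above."""
    env = os.environ if env is None else env
    max_steps = int(env.get("LOCAL_LLM_MAX_STEPS", "200"))
    return {
        "backend": env.get("EVO_SWARM_TRAIN_BACKEND", "mock"),
        "data_dir": env.get("LOCAL_LLM_DATA_DIR", "ai/local_llm/data/processed"),
        "max_steps": max_steps,
        "eval_every": int(env.get("LOCAL_LLM_EVAL_EVERY", "100")),
        # Checkpoint interval defaults to the (possibly overridden) max steps.
        "save_every": int(env.get("LOCAL_LLM_SAVE_EVERY", str(max_steps))),
        "runs_dir": env.get("LOCAL_LLM_RUNS_DIR", "ai/local_llm/runs/evo_swarm"),
    }
```

So setting only `LOCAL_LLM_MAX_STEPS=50` also moves the checkpoint interval to 50 unless `LOCAL_LLM_SAVE_EVERY` is set explicitly.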
The default swarm trainer is mock (fast and dependency-free). To train a small GPT-from-scratch model
inside the swarm loop, wire the ai/local_llm trainer as the backend.
- Install local trainer deps:

  ```bash
  python3 -m pip install -r ai/local_llm/requirements.txt
  ```

- Put your text data in `ai/local_llm/data/text/**/*.txt`
- Build tokenizer + token dataset:

  ```bash
  python3 ai/local_llm/scripts/prepare_data.py \
    --text_dir ai/local_llm/data/text \
    --out_dir ai/local_llm/data/processed \
    --vocab_size 16000
  ```

- Run the swarm with `local_llm` training:

  ```bash
  EVO_SWARM_TRAIN_BACKEND=local_llm python3 main.py
  ```

Artifacts are written under `ai/local_llm/runs/evo_swarm/` (per generation/candidate).
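Conceptually, a pluggable training backend selected by `EVO_SWARM_TRAIN_BACKEND` can be as simple as a name-to-function registry. The interface below (a genome config in, a metrics dict out) is a hypothetical sketch, not the project's actual `training/` API:

```python
import os
from typing import Callable, Dict

# Registry of training backends; keys match EVO_SWARM_TRAIN_BACKEND values.
BACKENDS: Dict[str, Callable[[dict], dict]] = {}

def register(name: str):
    def deco(fn):
        BACKENDS[name] = fn
        return fn
    return deco

@register("mock")
def mock_train(genome: dict) -> dict:
    # Fast, dependency-free stand-in: derives a "loss" from the config alone.
    return {"loss": 1.0 / (1 + genome.get("layers", 1))}

def train(genome: dict) -> dict:
    backend = os.environ.get("EVO_SWARM_TRAIN_BACKEND", "mock")
    return BACKENDS[backend](genome)
```

A real `local_llm` backend would register the same way and shell out to the PyTorch trainer instead.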
The evo_swarm.offline flow is designed for offline “ingest → retrieve → plan/critic → log for later fine-tuning”.
```bash
# Ingest a folder of .txt/.md papers
python3 -m evo_swarm.offline.cli ingest papers_txt/

# Ask a question grounded in ingested notes
python3 -m evo_swarm.offline.cli ask "Summarize the key findings about X"

# Interactive chat; optionally auto-train every N replies (training is a stub by default)
python3 -m evo_swarm.offline.cli chat --auto-train-every 20 --train-out offline_training_out
```

If you're browsing scientific datasets/LLM resources (e.g. the awesome list InternScience/Awesome-Scientific-Datasets-and-LLMs), use them to pick sources you can legally download and store locally, then:

- For RAG-style Q&A: ingest extracted `.txt`/`.md` into the offline CLI.
- For training `ai/local_llm`: convert the corpus into `.txt` files under `ai/local_llm/data/text/`, then run `prepare_data.py`.
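The "ingest → retrieve" half of the offline flow boils down to scoring stored chunks against a query. A minimal sketch using plain token overlap (the real knowledge store's scoring is likely more sophisticated; all names here are illustrative):

```python
import re
from collections import Counter

def tokenize(text: str) -> list:
    return re.findall(r"[a-z0-9]+", text.lower())

def retrieve(query: str, chunks: list, k: int = 2) -> list:
    """Rank ingested text chunks by token overlap with the query."""
    q = Counter(tokenize(query))
    return sorted(chunks,
                  key=lambda c: sum((Counter(tokenize(c)) & q).values()),
                  reverse=True)[:k]

chunks = [
    "Mutation rates control exploration in evolutionary search.",
    "SQLite stores the lineage of every candidate genome.",
    "Tokenizers map text to integer ids for GPT training.",
]
top = retrieve("how is genome lineage stored", chunks, k=1)
```

The retrieved chunks are then handed to the plan/critic roles as grounding context.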
If your editor shows "red lines" from lint/type diagnostics, these two commands usually reproduce them:

```bash
ruff check . --fix
pyright
```

Note: pyright will report missing imports for `ai/local_llm/scripts/*` unless you install `ai/local_llm/requirements.txt` into the same Python environment your editor uses.