Craft AI souls from humanity's greatest minds.
SoulCraft is a data-driven persona extraction pipeline that transforms real human data (interviews, speeches, letters, social media) into structured AI personality definitions compatible with multi-agent systems like OpenClaw, Claude Code, and Codex.
"We don't write personas. We extract them from reality."
Imagine a world where the greatest minds in human history can collaborate:
- Cao Cao (曹操) routes and reviews all decisions as your CEO
- Warren Buffett reviews your investment thesis
- Elon Musk challenges your product with first-principles thinking
- Linus Torvalds reviews your code architecture
- Zhuge Liang (诸葛亮) architects your system design
SoulCraft makes this possible, not through shallow role-play prompts, but through deep, evidence-based persona extraction from real human data. Each soul is an atomic, reusable unit. Mix and match freely across teams.
L0 Adapter → L1 Extraction → L1.5 Aggregation → L2 Profile → +1 System Prompt
- L0 Adapter: heterogeneous data sources (interviews, letters, tweets)
- L1 Extraction: structured extraction (per-file)
- L1.5 Aggregation: knowledge bible (unified)
- L2 Profile: persona profile (ABCDE layers)
- +1 System Prompt: `soul.md` / system prompt (deployable)
| Layer | Name | What it captures |
|---|---|---|
| A | Soul | Core identity, worldview, beliefs |
| B | Cognition | Knowledge systems, theories, intellectual sources |
| C | Expression | Speech patterns, catchphrases, rhetorical style |
| D | Behavior | Situational responses, teaching style, preferences |
| E | Meta | Blind spots (explicit + inferred), conflict resolution patterns |
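
To make the five layers concrete, here is a minimal sketch of how a profile might be held in memory. The class name `PersonaProfile`, its field names, and the helper `missing_layers` are illustrative assumptions that mirror the table above; they are not taken from SoulCraft's actual schema.

```python
from dataclasses import dataclass, field, asdict

# Hypothetical in-memory shape for an ABCDE profile (not SoulCraft's real schema).
@dataclass
class PersonaProfile:
    name: str
    soul: list[str] = field(default_factory=list)        # A: identity, worldview, beliefs
    cognition: list[str] = field(default_factory=list)   # B: knowledge systems, theories
    expression: list[str] = field(default_factory=list)  # C: speech patterns, catchphrases
    behavior: list[str] = field(default_factory=list)    # D: situational responses
    meta: list[str] = field(default_factory=list)        # E: blind spots, conflict patterns

# Map each layer letter to the (assumed) field that stores it.
LAYER_KEYS = {"A": "soul", "B": "cognition", "C": "expression", "D": "behavior", "E": "meta"}

def missing_layers(profile: PersonaProfile) -> list[str]:
    """Return the layer letters that have no entries yet."""
    data = asdict(profile)
    return [letter for letter, key in LAYER_KEYS.items() if not data[key]]

profile = PersonaProfile(
    name="Example Person",
    soul=["placeholder belief"],
    expression=["placeholder catchphrase"],
)
print(missing_layers(profile))  # layers B, D and E are still empty
```

A check like this could flag profiles that are too thin to compile, before any system prompt is generated.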
Converts heterogeneous data into a unified transcript format. Zero LLM dependency: pure structural conversion.

    # Interview subtitles → dialogue format
    python -m l0_adapter --type DLG \
      --input podcast_interview.srt \
      --output transcripts/ \
      --target-speaker "Elon Musk"

    # Shareholder letters → monologue format
    python -m l0_adapter --type MON \
      --input "buffett_letters/*.pdf" \
      --output transcripts/ \
      --target-speaker "Warren Buffett"

    # Tweets → grouped micro-burst format
    python -m l0_adapter --type MIC \
      --input tweets.json \
      --output transcripts/ \
      --target-speaker "Elon Musk" \
      --group-by thread

Source types by communication mode:
| Type | Mode | Sources |
|---|---|---|
| `DLG` | Dialogue | Interviews, podcasts, email threads |
| `MON` | Monologue | Letters, speeches, blog posts |
| `MIC` | Micro-burst | Tweets, social media posts |
| `ATT` | Attributed | Biographies, news articles |
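
As an illustration of what "pure structural conversion" can look like for the `DLG` type, the sketch below turns SRT cues into speaker turns using only a regular expression. It is not the actual `l0_adapter` code: the function `srt_to_turns`, the output dictionary keys, and the assumption that each cue starts with a `Speaker: ` prefix are all hypothetical.

```python
import re
from enum import Enum

class SourceType(Enum):
    DLG = "dialogue"
    MON = "monologue"
    MIC = "micro-burst"
    ATT = "attributed"

# One SRT cue: index line, timing line, then one or more text lines.
CUE_RE = re.compile(
    r"(\d+)\s*\n(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})\s*\n(.+?)(?=\n\s*\n|\Z)",
    re.DOTALL,
)

def srt_to_turns(srt_text: str, target_speaker: str) -> list[dict]:
    """Convert SRT text into a list of turns (hypothetical record shape)."""
    turns = []
    for _index, start, _end, body in CUE_RE.findall(srt_text.strip()):
        text = " ".join(line.strip() for line in body.splitlines())
        # Assumption: cues look like "Speaker: words". Otherwise mark as UNKNOWN.
        speaker, sep, rest = text.partition(": ")
        if not sep:
            speaker, rest = "UNKNOWN", text
        turns.append({
            "type": SourceType.DLG.name,
            "start": start,
            "speaker": speaker,
            "is_target": speaker == target_speaker,
            "text": rest,
        })
    return turns

sample = (
    "1\n00:00:01,000 --> 00:00:04,000\nHost: What drives you?\n\n"
    "2\n00:00:04,500 --> 00:00:09,000\nExample Person: Curiosity,\nmostly.\n"
)
for turn in srt_to_turns(sample, "Example Person"):
    print(turn["start"], turn["speaker"], "->", turn["text"])
```

Marking the target speaker at this stage means later extraction steps can focus on that person's words while keeping the interviewer's questions as context.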
A 4-stage pipeline driven by a master prompt template:
- L1 Knowledge Extraction: Per-file structured extraction with source tracing
- L1.5 Aggregation: Knowledge graph + persona data consolidation
- L2 Persona Profile: ABCDE five-layer persona decomposition
- +1 System Prompt: Compile into a deployable `soul.md` / System Prompt
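
The four stages above can be read as a chain of functions. The following is a rough sketch under stated assumptions, not the project's implementation: in SoulCraft the stages are driven by a prompt template, whereas here every function name (`l1_extract`, `l15_aggregate`, `l2_profile`, `compile_prompt`) and every data shape is invented for illustration, and the model call is replaced by a stub so the example runs offline.

```python
from typing import Callable

# `LLM` stands in for whatever model call fills the master prompt template.
LLM = Callable[[str], str]

def l1_extract(transcripts: dict[str, str], llm: LLM) -> list[dict]:
    """L1: one structured extraction per file, tagged with its source."""
    return [
        {"source": name, "extraction": llm(f"[L1] {text}")}
        for name, text in sorted(transcripts.items())
    ]

def l15_aggregate(extractions: list[dict]) -> dict:
    """L1.5: merge per-file results into one unified record."""
    return {
        "sources": [e["source"] for e in extractions],
        "knowledge": [e["extraction"] for e in extractions],
    }

def l2_profile(bible: dict, llm: LLM) -> dict[str, str]:
    """L2: derive one entry per ABCDE layer from the aggregated record."""
    joined = "\n".join(bible["knowledge"])
    return {layer: llm(f"[L2:{layer}] {joined}") for layer in "ABCDE"}

def compile_prompt(name: str, profile: dict[str, str]) -> str:
    """+1: render the profile as markdown text."""
    lines = [f"# {name}"]
    for layer, content in profile.items():
        lines += [f"## Layer {layer}", content]
    return "\n".join(lines)

def run_pipeline(name: str, transcripts: dict[str, str], llm: LLM) -> str:
    extractions = l1_extract(transcripts, llm)
    bible = l15_aggregate(extractions)
    return compile_prompt(name, l2_profile(bible, llm))

def echo_llm(prompt: str) -> str:
    """Stub model: returns the first line of the prompt, truncated."""
    return prompt.splitlines()[0][:40]

print(run_pipeline("Example Person", {"a.txt": "hello", "b.txt": "world"}, echo_llm))
```

Keeping L1 per-file and deferring merging to L1.5 is what lets each extracted claim stay traceable to a single source.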

    soulcraft/
    ├── souls/                          ← Atomic units
    │   ├── cao-cao/
    │   │   ├── soul.md                 ← Base soul: pure personality
    │   │   └── teams/three-kingdoms/
    │   │       └── soul.md             ← Team-tuned: personality + orchestration directives
    │   ├── zhuge-liang/
    │   │   ├── soul.md
    │   │   └── teams/three-kingdoms/soul.md
    │   ├── warren-buffett/soul.md
    │   └── elon-musk/soul.md
    │
    ├── teams/                          ← Team manifest files
    │   ├── three-kingdoms.yaml         ← References team-tuned souls
    │   └── dream-company.yaml
    │
    ├── l0_adapter/                     ← Data source converter (DLG/MON/MIC/ATT)
    └── docs/                           ← Pipeline templates & documentation

Like open-source LLMs, SoulCraft provides two modes:
| Mode | Analogy | What it contains |
|---|---|---|
| Base Soul | Qwen-base | Pure personality. No orchestration. Use as-is or fine-tune yourself. |
| Team-Tuned Soul | Qwen-instruct | Personality + orchestration directives (who reports to whom, conflict rules, trust levels). |
Orchestration is embedded in the persona file, not in a separate routing engine. The host framework (OpenClaw, CrewAI, AutoGen) handles execution.
A team-tuned soul adds:
- Chain of command (who is the superior, who are subordinates)
- Request routing rules (what to handle, what to delegate)
- Conflict arbitration strategy
- Trust level and review policies
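
One way to picture the difference between a base soul and a team-tuned soul is as a text transformation: the base `soul.md` stays untouched and a directives section is appended. The sketch below assumes exactly that. The `TeamDirectives` fields, the `tune_for_team` function, the section wording, and the trust-level values are hypothetical and do not describe the real team compiler.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical container for the four kinds of directive listed above.
@dataclass
class TeamDirectives:
    team: str
    superior: Optional[str]                                     # chain of command, upward
    subordinates: list[str] = field(default_factory=list)       # chain of command, downward
    handles: list[str] = field(default_factory=list)            # routing: handle directly
    delegates: dict[str, str] = field(default_factory=dict)     # routing: topic -> soul
    conflict_strategy: str = "escalate to superior"
    trust_level: str = "medium"
    review_policy: str = "review all outputs"

def tune_for_team(base_soul_md: str, d: TeamDirectives) -> str:
    """Append orchestration directives to an unchanged base soul."""
    section = [
        f"## Team directives: {d.team}",
        f"- Reports to: {d.superior or 'nobody (top of chain)'}",
        f"- Subordinates: {', '.join(d.subordinates) or 'none'}",
        f"- Handles directly: {', '.join(d.handles) or 'nothing'}",
        *[f"- Delegate {topic} to {soul}" for topic, soul in d.delegates.items()],
        f"- On conflict: {d.conflict_strategy}",
        f"- Trust level: {d.trust_level}; review policy: {d.review_policy}",
    ]
    return base_soul_md.rstrip() + "\n\n" + "\n".join(section) + "\n"

base = "# Cao Cao\n(placeholder for the extracted personality)\n"
ceo = TeamDirectives(
    team="three-kingdoms",
    superior=None,
    subordinates=["zhuge-liang"],
    handles=["final decisions"],
    delegates={"system architecture": "zhuge-liang"},
    conflict_strategy="act as final arbiter",
)
tuned = tune_for_team(base, ceo)
print(tuned)
```

Because the directives live in the persona text itself, the host framework needs no extra routing configuration to respect them.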
| Role | Soul | Why |
|---|---|---|
| CEO / Router | Cao Cao (曹操) | Cross-validates outputs, trust-level based delegation, fast course-correction |
| COO | Xun Yu (荀彧) | Resource allocation, operational planning |
| CSO | Guo Jia (郭嘉) | Risk assessment, competitive strategy |
| CTO | Zhuge Liang (诸葛亮) | System architecture, technical decisions |
| VP Engineering | Zhang Liao (张辽) | Execution, implementation, delivery |
| Red Team | Sima Yi (司马懿) | Adversarial review, find weaknesses |
Built-in conflict dynamics: CTO vs Red Team (healthy adversarial tension), CSO (aggressive) vs COO (conservative) strategic balance, CEO as final arbiter.
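
The roadmap describes team execution as sequential routing where array order equals execution order, with handoff between souls. A minimal sketch of that idea follows. It is a toy under loud assumptions: each soul is reduced to a plain function, `run_team` and `make_stub` are invented names, and the soul id `sima-yi` is hypothetical. In practice the host framework, not SoulCraft, executes the souls.

```python
from typing import Callable, List

# A soul is reduced here to a function: (task, notes so far) -> new note.
SoulFn = Callable[[str, List[str]], str]

def run_team(order: List[str], souls: dict, task: str) -> List[str]:
    """Run souls one after another; list order is execution order."""
    notes: List[str] = []
    for name in order:
        # Handoff: each soul sees a copy of everything written before it.
        reply = souls[name](task, list(notes))
        notes.append(f"{name}: {reply}")
    return notes

def make_stub(label: str) -> SoulFn:
    """Build a placeholder soul that only reports what it was given."""
    def soul(task: str, notes: List[str]) -> str:
        return f"{label} on '{task}' after {len(notes)} prior note(s)"
    return soul

souls = {
    "zhuge-liang": make_stub("architecture proposal"),
    "sima-yi": make_stub("adversarial review"),
    "cao-cao": make_stub("final arbitration"),
}
order = ["zhuge-liang", "sima-yi", "cao-cao"]

for note in run_team(order, souls, "design the cache layer"):
    print(note)
```

Placing the arbiter last in the list is what gives it the full record of the proposal and the adversarial review before it decides.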
SoulCraft is a compiler, not a runtime. It produces soul artifacts; host frameworks execute them.

    # Export to different frameworks
    soulcraft export warren-buffett --target openclaw   # → .openclaw/soul.md
    soulcraft export warren-buffett --target crewai     # → Agent(role, backstory, ...)
    soulcraft export warren-buffett --target autogen    # → AssistantAgent config

    .openclaw/
    ├── soul.md        ← Generated by SoulCraft
    ├── identity.md
    └── agents.md

Under the hood, SoulCraft maintains a `soul.json` (or `.yaml`) file as the source of truth; `soul.md` is a human-readable compilation target. This intermediate representation (IR) includes the ABCDE layers, provenance (source quotes, confidence scores, parser type), and version metadata.
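
To show how a structured source of truth could compile down to `soul.md`, here is a small sketch. The JSON layout, the key names (`layers`, `claim`, `provenance`, `confidence`, `parser`), the `compile_soul_md` function, and the confidence threshold are all assumptions made for illustration; the real schema may differ.

```python
import json

LAYER_NAMES = {"A": "Soul", "B": "Cognition", "C": "Expression", "D": "Behavior", "E": "Meta"}

# Hypothetical IR: every claim carries its own provenance record.
soul_ir = {
    "name": "Example Person",
    "version": "0.1.0",
    "layers": {
        "A": [
            {"claim": "Placeholder core belief",
             "provenance": {"quote": "placeholder quote", "confidence": 0.9, "parser": "DLG"}},
        ],
        "C": [
            {"claim": "Placeholder catchphrase",
             "provenance": {"quote": "placeholder quote", "confidence": 0.3, "parser": "MIC"}},
        ],
    },
}

def compile_soul_md(ir: dict, min_confidence: float = 0.5) -> str:
    """Render the IR as markdown, keeping only sufficiently confident claims."""
    lines = [f"# {ir['name']} (v{ir['version']})"]
    for letter, title in LAYER_NAMES.items():
        claims = [
            c["claim"]
            for c in ir["layers"].get(letter, [])
            if c["provenance"]["confidence"] >= min_confidence
        ]
        if claims:  # skip layers with nothing to say
            lines.append(f"## {letter}. {title}")
            lines.extend(f"- {claim}" for claim in claims)
    return "\n".join(lines) + "\n"

# The IR survives a JSON round trip, so it can live on disk as soul.json.
restored = json.loads(json.dumps(soul_ir))
print(compile_soul_md(restored))
```

Provenance stays in the IR and is dropped from the compiled text, which keeps the prompt short while leaving the evidence chain available for review.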

    git clone https://github.com/who96/soulcraft.git
    cd soulcraft

    # Convert interview subtitles to unified format
    python -m l0_adapter --type DLG \
      --input examples/sample_interview.srt \
      --output examples/transcripts/ \
      --target-speaker "Example Person"

    # Then use the 3+1 pipeline prompt template to extract persona
    # (See docs/pipeline_prompt_template.md)

- L0 Adapter Usage
- 3+1 Pipeline Prompt Template
- ABCDE Persona Model
- Integration with OpenClaw
- Team Templates Roadmap
- 3+1 Pipeline v5 with ABCDE model
- L0 Adapter (Dialogue, Monologue, Micro-burst, Attributed parsers)
- Canonical soul schema (JSON/YAML IR with provenance)
- Example base souls: Linus, Buffett (end-to-end)
- Verification v0 (holdout set + automated eval script)
- OpenClaw soul.md compiler + runnable demo
- Team schema (`team_schema.json`): sequential routing, array order = execution order
- Team compiler (`team_compile.py`): team.yaml → team-tuned soul.md
- First team: Code Review (Linus + Buffett), end-to-end demo
- `demo.py --team`: sequential multi-soul pipeline with handoff
- 84 tests passing (53 Phase 1 + 31 Phase 2)
- More base souls: Cao Cao, Munger, Zhuge Liang
- Three Kingdoms team-tuned souls
- OpenClaw team-aware packaging
- CrewAI / AutoGen exporters
- Contributor submission template
- Web UI for soul browsing + evidence chain
- More team templates (Dream Company, China Business)
- Multi-agent interaction playground
We welcome contributions! Key areas:
- New base souls: Extract personas and submit `soul.md` with evidence
- New L0 parsers: Support more data source formats
- Team-tuned variants: Create orchestration-aware soul variants
- Evaluation metrics: Better ways to measure persona fidelity
- Framework exporters: Connect with more agent frameworks (OpenClaw, CrewAI, AutoGen)
MIT License
- Character-LLM: Experience Reconstruction inspiration
- Microsoft TinyTroupe: Multi-agent persona simulation
- OpenClaw: `soul.md` ecosystem
"Bad programmers worry about the code. Good programmers worry about data structures and their relationships." - Linus Torvalds
SoulCraft worries about the soul.