Releases: spboyer/sensei
v1.4.0 — GEPA Integration
What's New
GEPA Integration — Evolutionary Skill Optimization 🧬
Sensei now supports GEPA (Genetic-Pareto) evolutionary optimization as an optional enhancement to the Ralph loop.
When invoked with --gepa, sensei replaces its template-based improve step with LLM-driven evolutionary optimization that:
- Auto-discovers the skill's test harness at runtime
- Builds an evaluator that scores candidates on content quality + trigger accuracy
- Proposes improvements via LLM (GitHub Models), keeping only versions that score higher
- Feeds test failures back as Actionable Side Information (ASI) so the LLM knows why a candidate failed
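The propose, score, keep-if-better cycle described above can be sketched roughly as follows. This is a simplified illustration, not GEPA's API or Sensei's actual code: it keeps a single best candidate greedily, whereas GEPA (Genetic-Pareto) is understood to track a Pareto set of candidates. The names `optimize`, `toy_evaluate`, and `toy_propose` are hypothetical, and the toy functions stand in for the real test harness and the LLM call.

```python
import re
from typing import Callable, List, Tuple

# (score in [0, 1], list of failure messages fed back to the proposer)
Evaluation = Tuple[float, List[str]]


def optimize(seed: str,
             evaluate: Callable[[str], Evaluation],
             propose: Callable[[str, List[str]], str],
             max_rounds: int = 10) -> Tuple[str, float]:
    """Greedy accept-if-better loop (a simplification, not GEPA itself)."""
    best = seed
    best_score, feedback = evaluate(best)
    for _ in range(max_rounds):
        if best_score >= 1.0:
            break  # nothing left to improve
        candidate = propose(best, feedback)  # an LLM call in practice
        score, candidate_feedback = evaluate(candidate)
        if score > best_score:  # keep only strict improvements
            best, best_score, feedback = candidate, score, candidate_feedback
    return best, best_score


# Toy stand-ins so the sketch runs without a model or a test harness.
# The lookbehind stops "DO NOT USE FOR:" from also counting as "USE FOR:".
REQUIRED = {
    "USE FOR:": r"(?<!DO NOT )USE FOR:",
    "DO NOT USE FOR:": r"DO NOT USE FOR:",
}


def toy_evaluate(text: str) -> Evaluation:
    missing = [name for name, pattern in REQUIRED.items()
               if not re.search(pattern, text)]
    score = 1.0 - len(missing) / len(REQUIRED)
    return score, [f"missing marker: {name}" for name in missing]


def toy_propose(text: str, feedback: List[str]) -> str:
    # Address the first reported failure, as an LLM would be prompted to do.
    marker = feedback[0].split(": ", 1)[1]
    return f"{text}\n{marker} <phrases>"


if __name__ == "__main__":
    best, score = optimize("Manages storage.", toy_evaluate, toy_propose)
    print(score)
    print(best)
```

Passing the failure messages into `propose` is the part that mirrors the ASI idea: the proposer sees why the previous candidate lost points rather than only its score.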
Usage
Run sensei on my-skill --gepa
Run sensei score my-skill
Results on GitHub Copilot for Azure skills
| Skill | Before | After |
|---|---|---|
| azure-storage | 0.16 | 1.00 |
| entra-app-registration | 0.38 | 1.00 |
| microsoft-foundry | 0.50 | 1.00 |
| azure-deploy | 0.62 | 1.00 |
Other Improvements
- Docs alignment — Scoring criteria, anti-trigger guidance, and examples now consistent across SKILL.md, README.md, and AGENTS.md
- Safer frontmatter parsing — Graceful handling of malformed YAML (no more crashes)
- Better trigger extraction — Handles apostrophes, backtick template literals, and strips JS comments
- CI-friendly exit codes — Non-zero exit on errors for pipeline reliability
- Python dependency spec — Added requirements.txt with pinned GEPA version
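To illustrate what graceful frontmatter handling combined with CI-friendly exit codes might look like, here is a minimal sketch. It is an assumption-laden stand-in, not Sensei's implementation: it handles only flat `key: value` lines between `---` delimiters, where a real parser would presumably use a YAML library. The function names `parse_frontmatter` and `main` are hypothetical.

```python
from typing import Dict, Optional, Tuple


def parse_frontmatter(text: str) -> Tuple[Optional[Dict[str, str]], Optional[str]]:
    """Return (fields, error). Reports malformed input instead of raising."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, "no opening '---' delimiter"
    try:
        # Index of the closing delimiter within `lines`.
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        return None, "no closing '---' delimiter"

    fields: Dict[str, str] = {}
    for number, line in enumerate(lines[1:end], start=2):  # 1-based line numbers
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            return None, f"line {number}: expected 'key: value', got {line!r}"
        fields[key.strip()] = value.strip()
    return fields, None


def main(text: str) -> int:
    """Return a process exit code: 0 on success, 1 on a parse error."""
    fields, error = parse_frontmatter(text)
    if error is not None:
        print(f"error: {error}")
        return 1  # non-zero so a CI step fails
    print(f"ok: {sorted(fields)}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main("---\nname: my-skill\ndescription: USE FOR: demos\n---\n"))
```

Returning an error value instead of raising lets the caller decide whether to continue with other files, while the non-zero return from `main` still fails a pipeline step.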
Acknowledgments
Thanks to Pamela Fox for the pointer to GEPA and optimize_anything — a great fit for iterative skill improvement.
Full Changelog: v1.3.0...v1.4.0
v1.0.0
Sensei is a skill for improving agent skill frontmatter compliance. It scores, diagnoses, and iteratively fixes SKILL.md files so agents route to the right skill at the right time.
Features
- Ralph loop pattern — Iterative read → score → improve → verify → check tokens → summarize cycle. Runs up to 5 iterations per skill until Medium-High compliance is reached, then prompts: commit, create issue, or skip.
- Compliance scoring — Rule-based evaluation across four levels: Low → Medium → Medium-High → High. Checks description length, trigger phrases (USE FOR:), anti-triggers (DO NOT USE FOR:), and routing clarity (INVOKES:/FOR SINGLE OPERATIONS:).
- Token CLI — Five subcommands for managing skill token budgets:
  - count — Token counts for any markdown files
  - check — Validate files against configurable soft/hard limits
  - suggest — Optimization recommendations
  - compare — Git-based token delta between refs
  - score — Advisory scoring checks on a skill directory
- Advisory scoring checks — Five built-in intelligence checks that surface quality risks before they hurt agent performance:
  - Module count — Flags when reference files exceed the 2–3 optimal range
  - Complexity classification — Warns when token count + module count enters the harmful comprehensive zone
  - Procedural content quality — Detects declarative-only descriptions missing action verbs and workflow structure
  - Over-specificity — Catches hardcoded paths, IPs, port numbers, and instance-specific content that prevents generalization
  - Negative delta risk — Identifies conflicting procedures, duplicate step blocks, and excessive constraints that empirically degrade performance
- MCP integration checks — When a skill declares INVOKES:, Sensei verifies MCP Tools Used tables, Prerequisites sections, CLI fallback patterns, and skill-tool name collision risks.
- Reference documentation system — Progressive disclosure via references/ directory. Detailed scoring criteria, MCP integration patterns, loop workflow docs, before/after examples, configuration guides, and test templates loaded on-demand — not stuffed into the skill body.
- Test template support — Scaffolds framework-specific trigger tests (Jest, pytest, Waza) with shouldTriggerPrompts and shouldNotTriggerPrompts arrays matched to frontmatter.
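As a rough illustration of how the compliance levels and the Ralph loop fit together, the sketch below scores a description and iterates until Medium-High is reached or five iterations have run. The mapping from checks to levels and the `MIN_LENGTH` value are assumptions made for this example; the actual criteria live in Sensei's reference docs. The verify, check tokens, and summarize steps are not modeled, and `score`, `ralph_loop`, and `toy_improve` are hypothetical names.

```python
import re
from typing import Callable, Tuple

LEVELS = ["Low", "Medium", "Medium-High", "High"]
MIN_LENGTH = 150  # assumed threshold; not taken from the release notes


def score(description: str) -> str:
    """Map marker checks to a level (the mapping itself is an assumption)."""
    # The lookbehind keeps "DO NOT USE FOR:" from counting as "USE FOR:".
    has_triggers = re.search(r"(?<!DO NOT )USE FOR:", description) is not None
    has_anti_triggers = "DO NOT USE FOR:" in description
    has_routing = ("INVOKES:" in description
                   or "FOR SINGLE OPERATIONS:" in description)
    if len(description) < MIN_LENGTH or not has_triggers:
        return "Low"
    if not has_anti_triggers:
        return "Medium"
    if not has_routing:
        return "Medium-High"
    return "High"


def ralph_loop(description: str,
               improve: Callable[[str, str], str],
               target: str = "Medium-High",
               max_iterations: int = 5) -> Tuple[str, str, int]:
    """Score, improve, and re-score until the target level or the cap."""
    level = score(description)
    iterations = 0
    while (LEVELS.index(level) < LEVELS.index(target)
           and iterations < max_iterations):
        description = improve(description, level)
        level = score(description)
        iterations += 1
    return description, level, iterations


def toy_improve(description: str, level: str) -> str:
    """Stand-in for the improve step; pads only so the demo passes the length check."""
    if level == "Low":
        return description.ljust(MIN_LENGTH) + " USE FOR: <trigger phrases>"
    return description + " DO NOT USE FOR: <anti-triggers>"


if __name__ == "__main__":
    text, level, used = ralph_loop("Manages storage accounts.", toy_improve)
    print(level, used)
```

Capping the loop at a fixed iteration count means an improve step that never helps still terminates, which is why the cap is checked alongside the target level.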
Quick Start
# Install
git clone https://github.com/spboyer/sensei.git ~/.copilot/skills/sensei
cd ~/.copilot/skills/sensei/scripts && npm install && cd ..
# Score a skill directory
npm run tokens -- score .
# Check token budgets
npm run tokens -- check
# Run it
# In Copilot CLI: "Run sensei on my-skill-name"
Notes
- Follows the Anthropic skill specification for SKILL.md frontmatter format
- MIT licensed