Releases: spboyer/sensei

v1.4.0 — GEPA Integration

25 Mar 21:06

What's New

GEPA Integration — Evolutionary Skill Optimization 🧬

Sensei now supports GEPA (Genetic-Pareto) evolutionary optimization as an optional enhancement to the Ralph loop.

When invoked with --gepa, sensei replaces its template-based improve step with LLM-driven evolutionary optimization that:

  • Auto-discovers the skill's test harness at runtime
  • Builds an evaluator that scores candidates on content quality + trigger accuracy
  • Proposes improvements via LLM (GitHub Models), keeping only versions that score higher
  • Feeds test failures as ASI so the LLM knows why a candidate failed

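The accept-only-if-better loop can be sketched as follows. This is a minimal illustration, not Sensei's actual implementation: `evaluate` and `propose_improvement` are hypothetical stand-ins for the auto-discovered test harness and the LLM call.

```python
def optimize(skill_text, evaluate, propose_improvement, max_rounds=5):
    """Keep a candidate only when it scores strictly higher than the best so far."""
    best, best_score = skill_text, evaluate(skill_text)
    for _ in range(max_rounds):
        # In the real flow this would call the LLM with the current best
        # candidate plus failure feedback from the last evaluation.
        candidate = propose_improvement(best)
        score = evaluate(candidate)
        if score > best_score:  # reject regressions and ties
            best, best_score = candidate, score
    return best, best_score

# Toy usage: "evaluate" rewards length, "propose" appends a character.
result, score = optimize("seed", evaluate=len, propose_improvement=lambda s: s + "!")
```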
Usage

In Copilot CLI:
"Run sensei on my-skill --gepa"
"Run sensei score my-skill"

Results on GitHub Copilot for Azure skills

| Skill | Before | After |
| --- | --- | --- |
| azure-storage | 0.16 | 1.00 |
| entra-app-registration | 0.38 | 1.00 |
| microsoft-foundry | 0.50 | 1.00 |
| azure-deploy | 0.62 | 1.00 |

Other Improvements

  • Docs alignment — Scoring criteria, anti-trigger guidance, and examples now consistent across SKILL.md, README.md, and AGENTS.md
  • Safer frontmatter parsing — Graceful handling of malformed YAML (no more crashes)
  • Better trigger extraction — Handles apostrophes, backtick template literals, and strips JS comments
  • CI-friendly exit codes — Non-zero exit on errors for pipeline reliability
  • Python dependency spec — Added requirements.txt with pinned GEPA version

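The graceful-parsing idea can be illustrated with a minimal frontmatter reader. This is an assumed sketch, not the shipped parser: it returns an empty dict instead of raising on malformed or unterminated input.

```python
def parse_frontmatter(text):
    """Read simple key: value pairs from a '---'-delimited frontmatter block.

    Returns {} rather than raising when the block is missing, malformed,
    or never closed.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}                      # no frontmatter at all
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":      # closing delimiter: success
            return fields
        if ":" not in line:            # malformed line: give up gracefully
            return {}
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return {}                          # unterminated block: treat as absent

parse_frontmatter("---\nname: sensei\ndescription: Scores skills\n---\nbody")
# -> {'name': 'sensei', 'description': 'Scores skills'}
```
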
Acknowledgments

Thanks to Pamela Fox for the pointer to GEPA and optimize_anything — a great fit for iterative skill improvement.

Full Changelog: v1.3.0...v1.4.0

v1.0.0

18 Feb 18:22
e5eac2f


Sensei is a skill for improving agent skill frontmatter compliance. It scores, diagnoses, and iteratively fixes SKILL.md files so agents route to the right skill at the right time.

Features

  • Ralph loop pattern — Iterative read → score → improve → verify → check tokens → summarize cycle. Runs up to 5 iterations per skill until Medium-High compliance is reached, then prompts: commit, create issue, or skip.

  • Compliance scoring — Rule-based evaluation across four levels: Low → Medium → Medium-High → High. Checks description length, trigger phrases (USE FOR:), anti-triggers (DO NOT USE FOR:), and routing clarity (INVOKES: / FOR SINGLE OPERATIONS:).

  • Token CLI — Five subcommands for managing skill token budgets:

    • count — Token counts for any markdown files
    • check — Validate files against configurable soft/hard limits
    • suggest — Optimization recommendations
    • compare — Git-based token delta between refs
    • score — Advisory scoring checks on a skill directory
  • Advisory scoring checks — Five built-in intelligence checks that surface quality risks before they hurt agent performance:

    • Module count — Flags when reference files exceed the 2–3 optimal range
    • Complexity classification — Warns when token count plus module count enters the harmful "comprehensive" zone
    • Procedural content quality — Detects declarative-only descriptions missing action verbs and workflow structure
    • Over-specificity — Catches hardcoded paths, IPs, port numbers, and instance-specific content that prevents generalization
    • Negative delta risk — Identifies conflicting procedures, duplicate step blocks, and excessive constraints that empirically degrade performance
  • MCP integration checks — When a skill declares INVOKES:, Sensei verifies MCP Tools Used tables, Prerequisites sections, CLI fallback patterns, and skill-tool name collision risks.

  • Reference documentation system — Progressive disclosure via references/ directory. Detailed scoring criteria, MCP integration patterns, loop workflow docs, before/after examples, configuration guides, and test templates loaded on-demand — not stuffed into the skill body.

  • Test template support — Scaffolds framework-specific trigger tests (Jest, pytest, Waza) with shouldTriggerPrompts and shouldNotTriggerPrompts arrays matched to frontmatter.
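
The four-level compliance ladder could be approximated with a signal count like the one below. This is an illustrative sketch of the idea only; Sensei's actual criteria live in its references/ directory and are more nuanced.

```python
LEVELS = ["Low", "Medium", "Medium-High", "High"]

def score_description(description):
    """Map a frontmatter description to a compliance level by counting
    which structural signals are present (illustrative, not Sensei's rules)."""
    signals = [
        len(description) >= 60,                        # substantive length
        "USE FOR:" in description,                     # trigger phrases
        "DO NOT USE FOR:" in description,              # anti-triggers
        "INVOKES:" in description
            or "FOR SINGLE OPERATIONS:" in description,  # routing clarity
    ]
    return LEVELS[min(sum(signals), 3)]
```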
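
The soft/hard budget check behind `check` can be sketched with the common chars-per-token heuristic. The ~4 chars/token ratio and the default limits here are assumptions for illustration; the real CLI may use an actual tokenizer and its own configured budgets.

```python
def estimate_tokens(markdown_text):
    """Approximate tokens with the rough 4-characters-per-token heuristic."""
    return len(markdown_text) // 4

def check_budget(files, soft=500, hard=1000):
    """Classify each file's estimated token count against soft/hard limits."""
    report = {}
    for path, text in files.items():
        tokens = estimate_tokens(text)
        if tokens <= soft:
            status = "ok"
        elif tokens <= hard:
            status = "warn"        # over the soft limit: trim soon
        else:
            status = "fail"        # over the hard limit: non-zero exit in CI
        report[path] = (tokens, status)
    return report
```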
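
For instance, an over-specificity check of the kind described might hunt for hardcoded values with regular expressions, roughly as below. This is an assumed simplification; the pattern set is illustrative, not the shipped one.

```python
import re

# Illustrative subset of patterns that suggest instance-specific content.
OVERLY_SPECIFIC = {
    "ip_address": re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"),
    "port": re.compile(r"(?:localhost|127\.0\.0\.1):\d{2,5}"),
    "home_path": re.compile(r"/Users/[\w.-]+|C:\\Users\\[\w.-]+"),
}

def over_specificity_findings(text):
    """Name the patterns that matched, i.e. likely hardcoded details
    that keep a skill from generalizing."""
    return [name for name, pattern in OVERLY_SPECIFIC.items() if pattern.search(text)]

over_specificity_findings("Deploy to 10.0.0.5 then open localhost:3000")
# -> ['ip_address', 'port']
```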
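
A scaffolded trigger test might look roughly like this pytest-style sketch. `routes_to_sensei` is a hypothetical keyword stand-in for whatever router the harness wires in (real routing is model-driven), and the prompt lists mirror the shouldTriggerPrompts / shouldNotTriggerPrompts shape.

```python
# Hypothetical stand-in router: fires when a trigger phrase appears in the prompt.
TRIGGERS = ["skill frontmatter", "score my skill"]

def routes_to_sensei(prompt):
    return any(trigger in prompt.lower() for trigger in TRIGGERS)

should_trigger_prompts = [
    "Can you score my skill for compliance?",
    "Fix the skill frontmatter in my-skill",
]
should_not_trigger_prompts = [
    "Write a unit test for my parser",
]

def test_triggers():
    assert all(routes_to_sensei(p) for p in should_trigger_prompts)

def test_anti_triggers():
    assert not any(routes_to_sensei(p) for p in should_not_trigger_prompts)
```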

Quick Start

# Install
git clone https://github.com/spboyer/sensei.git ~/.copilot/skills/sensei
cd ~/.copilot/skills/sensei/scripts && npm install && cd ..

# Score a skill directory
npm run tokens -- score .

# Check token budgets
npm run tokens -- check

# Run it
# In Copilot CLI: "Run sensei on my-skill-name"

Notes