Releases: spboyer/sensei

v1.4.0 — GEPA Integration

25 Mar 21:06

What's New

GEPA Integration — Evolutionary Skill Optimization 🧬

Sensei now supports GEPA (Genetic-Pareto) evolutionary optimization as an optional enhancement to the Ralph loop.

When invoked with --gepa, sensei replaces its template-based improve step with LLM-driven evolutionary optimization that:

  • Auto-discovers the skill's test harness at runtime
  • Builds an evaluator that scores candidates on content quality + trigger accuracy
  • Proposes improvements via LLM (GitHub Models), keeping only versions that score higher
  • Feeds test failures as ASI so the LLM knows why a candidate failed

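The accept-only-if-better loop can be sketched as follows. This is a minimal illustration, not Sensei's actual implementation: `evaluate` and `propose_improvement` are hypothetical stand-ins for the auto-discovered test harness and the LLM call.

```python
def optimize(skill_text, evaluate, propose_improvement, max_rounds=5):
    """Keep a candidate only when it scores strictly higher than the best so far."""
    best, best_score = skill_text, evaluate(skill_text)
    for _ in range(max_rounds):
        # In the real flow this would call the LLM with the current best
        # candidate plus failure feedback from the last evaluation.
        candidate = propose_improvement(best)
        score = evaluate(candidate)
        if score > best_score:  # reject regressions and ties
            best, best_score = candidate, score
    return best, best_score

# Toy usage: "evaluate" rewards length, "propose" appends a character.
result, score = optimize("seed", evaluate=len, propose_improvement=lambda s: s + "!")
```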
Usage

In Copilot CLI:
"Run sensei on my-skill --gepa"
"Run sensei score my-skill"

Results on GitHub Copilot for Azure skills

| Skill | Before | After |
| --- | --- | --- |
| azure-storage | 0.16 | 1.00 |
| entra-app-registration | 0.38 | 1.00 |
| microsoft-foundry | 0.50 | 1.00 |
| azure-deploy | 0.62 | 1.00 |

Other Improvements

  • Docs alignment — Scoring criteria, anti-trigger guidance, and examples now consistent across SKILL.md, README.md, and AGENTS.md
  • Safer frontmatter parsing — Graceful handling of malformed YAML (no more crashes)
  • Better trigger extraction — Handles apostrophes, backtick template literals, and strips JS comments
  • CI-friendly exit codes — Non-zero exit on errors for pipeline reliability
  • Python dependency spec — Added requirements.txt with pinned GEPA version

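The graceful-parsing idea can be illustrated with a minimal frontmatter reader. This is an assumed sketch, not the shipped parser: it returns an empty dict instead of raising on malformed or unterminated input.

```python
def parse_frontmatter(text):
    """Read simple key: value pairs from a '---'-delimited frontmatter block.

    Returns {} rather than raising when the block is missing, malformed,
    or never closed.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}                      # no frontmatter at all
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":      # closing delimiter: success
            return fields
        if ":" not in line:            # malformed line: give up gracefully
            return {}
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return {}                          # unterminated block: treat as absent

parse_frontmatter("---\nname: sensei\ndescription: Scores skills\n---\nbody")
# -> {'name': 'sensei', 'description': 'Scores skills'}
```
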
Acknowledgments

Thanks to Pamela Fox for the pointer to GEPA and optimize_anything — a great fit for iterative skill improvement.

Full Changelog: v1.3.0...v1.4.0

v1.0.0

18 Feb 18:22
e5eac2f


Sensei is a skill for improving agent skill frontmatter compliance. It scores, diagnoses, and iteratively fixes SKILL.md files so agents route to the right skill at the right time.

Features

  • Ralph loop pattern — Iterative read → score → improve → verify → check tokens → summarize cycle. Runs up to 5 iterations per skill until Medium-High compliance is reached, then prompts: commit, create issue, or skip.

  • Compliance scoring — Rule-based evaluation across four levels: Low → Medium → Medium-High → High. Checks description length, trigger phrases (USE FOR:), anti-triggers (DO NOT USE FOR:), and routing clarity (INVOKES: / FOR SINGLE OPERATIONS:).

  • Token CLI — Five subcommands for managing skill token budgets:

    • count — Token counts for any markdown files
    • check — Validate files against configurable soft/hard limits
    • suggest — Optimization recommendations
    • compare — Git-based token delta between refs
    • score — Advisory scoring checks on a skill directory
  • Advisory scoring checks — Five built-in intelligence checks that surface quality risks before they hurt agent performance:

    • Module count — Flags when reference files exceed the 2–3 optimal range
    • Complexity classification — Warns when token count plus module count enters the harmful "comprehensive" zone
    • Procedural content quality — Detects declarative-only descriptions missing action verbs and workflow structure
    • Over-specificity — Catches hardcoded paths, IPs, port numbers, and instance-specific content that prevents generalization
    • Negative delta risk — Identifies conflicting procedures, duplicate step blocks, and excessive constraints that empirically degrade performance
  • MCP integration checks — When a skill declares INVOKES:, Sensei verifies MCP Tools Used tables, Prerequisites sections, CLI fallback patterns, and skill-tool name collision risks.

  • Reference documentation system — Progressive disclosure via references/ directory. Detailed scoring criteria, MCP integration patterns, loop workflow docs, before/after examples, configuration guides, and test templates loaded on-demand — not stuffed into the skill body.

  • Test template support — Scaffolds framework-specific trigger tests (Jest, pytest, Waza) with shouldTriggerPrompts and shouldNotTriggerPrompts arrays matched to frontmatter.
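
The four-level compliance ladder could be approximated with a signal count like the one below. This is an illustrative sketch of the idea only; Sensei's actual criteria live in its references/ directory and are more nuanced.

```python
LEVELS = ["Low", "Medium", "Medium-High", "High"]

def score_description(description):
    """Map a frontmatter description to a compliance level by counting
    which structural signals are present (illustrative, not Sensei's rules)."""
    signals = [
        len(description) >= 60,                        # substantive length
        "USE FOR:" in description,                     # trigger phrases
        "DO NOT USE FOR:" in description,              # anti-triggers
        "INVOKES:" in description
            or "FOR SINGLE OPERATIONS:" in description,  # routing clarity
    ]
    return LEVELS[min(sum(signals), 3)]
```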
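
The soft/hard budget check behind `check` can be sketched with the common chars-per-token heuristic. The ~4 chars/token ratio and the default limits here are assumptions for illustration; the real CLI may use an actual tokenizer and its own configured budgets.

```python
def estimate_tokens(markdown_text):
    """Approximate tokens with the rough 4-characters-per-token heuristic."""
    return len(markdown_text) // 4

def check_budget(files, soft=500, hard=1000):
    """Classify each file's estimated token count against soft/hard limits."""
    report = {}
    for path, text in files.items():
        tokens = estimate_tokens(text)
        if tokens <= soft:
            status = "ok"
        elif tokens <= hard:
            status = "warn"        # over the soft limit: trim soon
        else:
            status = "fail"        # over the hard limit: non-zero exit in CI
        report[path] = (tokens, status)
    return report
```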
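
For instance, an over-specificity check of the kind described might hunt for hardcoded values with regular expressions, roughly as below. This is an assumed simplification; the pattern set is illustrative, not the shipped one.

```python
import re

# Illustrative subset of patterns that suggest instance-specific content.
OVERLY_SPECIFIC = {
    "ip_address": re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"),
    "port": re.compile(r"(?:localhost|127\.0\.0\.1):\d{2,5}"),
    "home_path": re.compile(r"/Users/[\w.-]+|C:\\Users\\[\w.-]+"),
}

def over_specificity_findings(text):
    """Name the patterns that matched, i.e. likely hardcoded details
    that keep a skill from generalizing."""
    return [name for name, pattern in OVERLY_SPECIFIC.items() if pattern.search(text)]

over_specificity_findings("Deploy to 10.0.0.5 then open localhost:3000")
# -> ['ip_address', 'port']
```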
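
A scaffolded trigger test might look roughly like this pytest-style sketch. `routes_to_sensei` is a hypothetical keyword stand-in for whatever router the harness wires in (real routing is model-driven), and the prompt lists mirror the shouldTriggerPrompts / shouldNotTriggerPrompts shape.

```python
# Hypothetical stand-in router: fires when a trigger phrase appears in the prompt.
TRIGGERS = ["skill frontmatter", "score my skill"]

def routes_to_sensei(prompt):
    return any(trigger in prompt.lower() for trigger in TRIGGERS)

should_trigger_prompts = [
    "Can you score my skill for compliance?",
    "Fix the skill frontmatter in my-skill",
]
should_not_trigger_prompts = [
    "Write a unit test for my parser",
]

def test_triggers():
    assert all(routes_to_sensei(p) for p in should_trigger_prompts)

def test_anti_triggers():
    assert not any(routes_to_sensei(p) for p in should_not_trigger_prompts)
```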

Quick Start

# Install
git clone https://github.com/spboyer/sensei.git ~/.copilot/skills/sensei
cd ~/.copilot/skills/sensei/scripts && npm install && cd ..

# Score a skill directory
npm run tokens -- score .

# Check token budgets
npm run tokens -- check

# Run it
# In Copilot CLI: "Run sensei on my-skill-name"

Notes