Commit 500d834

Add skill marketplace and refresh README
1 parent bba846a commit 500d834

15 files changed

Lines changed: 543 additions & 111 deletions

README.md

Lines changed: 225 additions & 104 deletions
Large diffs are not rendered by default.

docs/architecture.md

Lines changed: 1 addition & 0 deletions
@@ -241,6 +241,7 @@ class Operator(Protocol):
 | `state_view` | the state slice visible to this node |
 | `inbox` | incoming A2A messages |
 | `artifacts` | visible artifacts |
+| `skills` | skill names loaded for this operator execution |
 | `working_dir` | working directory |
 | `session_policy` | whether to create a new session, reuse one, or force resume |
 | `tool_policy` | allowed tools and permission level |
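
Review note: the table rows in this hunk can be read as fields on the operator request. The sketch below is a hypothetical stand-in, not the real AgentWorld type: the class name `OperatorRequestSketch`, every field type, and the default values are assumptions made for illustration. Only the field names and their descriptions come from the table.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class OperatorRequestSketch:
    """Hypothetical stand-in for the operator request (not the real type)."""

    # State slice visible to this node.
    state_view: dict[str, Any] = field(default_factory=dict)
    # Incoming A2A messages.
    inbox: list[Any] = field(default_factory=list)
    # Visible artifacts.
    artifacts: list[Any] = field(default_factory=list)
    # Skill names loaded for this operator execution (the row added here).
    skills: list[str] = field(default_factory=list)
    # Working directory.
    working_dir: str = "."
    # Assumed encoding: "new", "reuse", or "resume".
    session_policy: str = "new"
    # Allowed tools and permission level (shape is an assumption).
    tool_policy: dict[str, Any] = field(default_factory=dict)


# A node configured with two skills would see only their names here.
request = OperatorRequestSketch(skills=["citation-audit", "result-audit"])
print(request.skills)
```

Defaulting `skills` to an empty list keeps nodes without skills working unchanged, which matches the optional `skills` argument added to `add_node` in this commit.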

examples/planner_coder_reviewer.py

Lines changed: 23 additions & 4 deletions
@@ -29,12 +29,16 @@ def planner_script(_request):
 
 
 def coder_script(request):
+    payload = json.loads(request.instruction)
     return [
         ControllerEvent(
             kind="message_completed",
             payload={
                 "kind": "observation",
-                "text": f"Coder received {len(request.metadata)} metadata keys and produced a draft.",
+                "text": (
+                    f"Coder received {len(payload['skills'])} skills "
+                    f"and {len(request.metadata)} metadata keys and produced a draft."
+                ),
             },
         ),
         ControllerEvent(
@@ -84,9 +88,24 @@ def main() -> None:
     graph.add_operator("coder_op", DefaultOperator("coder_op", StaticController(coder_script), role="coder"))
     graph.add_operator("reviewer_op", DefaultOperator("reviewer_op", StaticController(reviewer_script), role="reviewer"))
 
-    graph.add_node("plan", operator="planner_op", objective="Plan the work")
-    graph.add_node("implement", operator="coder_op", objective="Implement the work")
-    graph.add_node("review", operator="reviewer_op", objective="Review the work")
+    graph.add_node(
+        "plan",
+        operator="planner_op",
+        objective="Plan the work",
+        skills=["research-paper-search", "literature-synthesis"],
+    )
+    graph.add_node(
+        "implement",
+        operator="coder_op",
+        objective="Implement the work",
+        skills=["experiment-planning", "dataset-triage"],
+    )
+    graph.add_node(
+        "review",
+        operator="reviewer_op",
+        objective="Review the work",
+        skills=["citation-audit", "result-audit"],
+    )
     graph.add_edge("plan", "implement")
     graph.add_edge("implement", "review")
 
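
Review note: `coder_script` now parses `request.instruction` as JSON and indexes `payload['skills']`. The sketch below reproduces that read path in isolation. `FakeRequest` and `describe_request` are hypothetical names, and the assumption that the instruction is a JSON object carrying a `skills` list is inferred from this hunk only.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class FakeRequest:
    """Hypothetical stand-in for the request handed to a controller script."""

    instruction: str
    metadata: dict[str, Any] = field(default_factory=dict)


def describe_request(request: FakeRequest) -> str:
    # Assumption: the instruction is a JSON object with a "skills" list.
    payload = json.loads(request.instruction)
    # .get() tolerates a missing key; the diff indexes payload['skills'] directly.
    skills = payload.get("skills", [])
    return (
        f"Coder received {len(skills)} skills "
        f"and {len(request.metadata)} metadata keys and produced a draft."
    )


request = FakeRequest(
    instruction=json.dumps(
        {
            "objective": "Implement the work",
            "skills": ["experiment-planning", "dataset-triage"],
        }
    ),
    metadata={"node": "implement"},
)
message = describe_request(request)
print(message)
# Coder received 2 skills and 1 metadata keys and produced a draft.
```

Direct indexing, as in the diff, raises `KeyError` when the payload has no `skills` key and `json.JSONDecodeError` when the instruction is not JSON. That may be acceptable in an example script, but it is worth confirming that the runtime always emits both.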

skills/README.md

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
+# AgentWorld Skills
+
+AgentWorld supports a repository-local skill marketplace under `skills/`.
+
+Each skill is a reusable package of execution guidance that can be loaded by one operator node without forcing the same behavior onto every other node in the graph.
+
+## Structure
+
+Each skill usually lives in its own folder:
+
+```text
+skills/
+└── skill-name/
+    ├── SKILL.md
+    ├── references/
+    ├── scripts/
+    └── assets/
+```
+
+Only `SKILL.md` is required. The other folders are optional.
+
+## Included Skills
+
+| Skill | Purpose |
+| --- | --- |
+| `research-paper-search` | paper discovery, identifier collection, source triage |
+| `literature-synthesis` | theme extraction, contradiction mapping, gap analysis |
+| `citation-audit` | bibliography validation, metadata checks, citation hygiene |
+| `experiment-planning` | executable research plans, deliverables, validation gates |
+| `result-audit` | output review, unsupported claim detection, evidence checks |
+
+## Usage
+
+Attach skills directly to a graph node:
+
+```python
+graph.add_node(
+    "plan",
+    operator="planner",
+    objective="Search the literature and frame the study",
+    skills=["research-paper-search", "literature-synthesis"],
+)
+```
+
+The runtime injects the selected skill names into the operator request, so the operator can build prompts and execution context for that specific node.
+
+## Notes
+
+- Skills are public project assets and should stay in English
+- Skills should be specific enough to be reusable, but not so narrow that they only fit one task
+- If a skill needs code or references, keep them inside the skill folder
+- Private working notes belong in ignored local files, not in `skills/`
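
Review note: the README says the runtime injects skill names and leaves prompt building to the operator. One way an operator might turn those names into prompt context is sketched below. `read_skill` and `build_skill_context` are hypothetical helpers, not part of AgentWorld, and the front-matter parsing is a deliberately minimal assumption (flat `key: value` pairs between `---` lines, as in the SKILL.md files in this commit).

```python
import tempfile
from pathlib import Path


def read_skill(skills_root: Path, name: str) -> dict[str, str]:
    """Read skills/<name>/SKILL.md and split front matter from the body."""
    text = (skills_root / name / "SKILL.md").read_text(encoding="utf-8")
    meta: dict[str, str] = {}
    body = text
    if text.startswith("---\n"):
        # Front matter sits between the first two '---' lines.
        header, _, body = text[4:].partition("\n---\n")
        for line in header.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                meta[key.strip()] = value.strip()
    return {
        "name": meta.get("name", name),
        "description": meta.get("description", ""),
        "body": body.strip(),
    }


def build_skill_context(skills_root: Path, skill_names: list[str]) -> str:
    """Concatenate the selected skills into one prompt section."""
    sections = []
    for name in skill_names:
        skill = read_skill(skills_root, name)
        sections.append(
            f"## Skill: {skill['name']}\n{skill['description']}\n\n{skill['body']}"
        )
    return "\n\n".join(sections)


# Demo against a throwaway skills/ folder with one tiny skill.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "citation-audit").mkdir()
    (root / "citation-audit" / "SKILL.md").write_text(
        "---\nname: citation-audit\ndescription: Verify citations.\n---\n\n# Citation Audit\n",
        encoding="utf-8",
    )
    context = build_skill_context(root, ["citation-audit"])

print(context)
```

Resolving names at execution time keeps the graph definition free of file paths. A skill name with no matching folder would raise `FileNotFoundError` in this sketch; the commit does not show how the runtime handles that case.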

skills/citation-audit/SKILL.md

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+---
+name: citation-audit
+description: Use when an operator must verify citation accuracy, bibliography consistency, metadata quality, and whether claims are actually supported by the cited sources.
+---
+
+# Citation Audit
+
+## When to Use This Skill
+
+Use this skill when the node needs to:
+
+- verify that citations support the written claim
+- check metadata such as title, authors, venue, year, and DOI
+- detect missing or duplicate references
+- clean up a bibliography before release
+- review whether a writeup overstates what a cited paper proves
+
+## Workflow
+
+1. Check whether every major claim has a citation.
+2. For each citation, confirm that the source actually supports the stated claim.
+3. Normalize key metadata fields.
+4. Remove duplicates and weak citations.
+5. Flag unsupported or overstated claims.
+
+## Expected Outputs
+
+- a citation issue list
+- corrected metadata suggestions
+- unsupported claim warnings
+- a cleaned bibliography checklist
+
+## Quality Rules
+
+- citation presence is not enough; support must be real
+- prefer primary papers over derivative references
+- flag ambiguous or second-hand citations
+- keep corrections explicit and actionable
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+---
+name: experiment-planning
+description: Use when an operator needs to convert a research objective into an executable plan with deliverables, checkpoints, dependencies, and validation criteria.
+---
+
+# Experiment Planning
+
+## When to Use This Skill
+
+Use this skill when the node needs to:
+
+- turn a vague research goal into a concrete plan
+- define inputs, outputs, metrics, and success criteria
+- break work into stages for planner, coder, and reviewer nodes
+- identify risky steps before execution begins
+- decide what evidence is needed to claim success
+
+## Workflow
+
+1. Rewrite the objective as a measurable question.
+2. List required inputs, tools, datasets, and dependencies.
+3. Break the task into stages with expected deliverables.
+4. Add validation gates for each stage.
+5. Mark high-risk assumptions and fallback paths.
+
+## Expected Outputs
+
+- a staged execution plan
+- a deliverable checklist
+- a risk register
+- validation checkpoints
+
+## Quality Rules
+
+- plans should be executable, not aspirational
+- every stage should produce an inspectable artifact
+- note blockers early instead of hiding them in later steps
+- define what counts as success before running the work
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+---
+name: literature-synthesis
+description: Use when an operator already has a paper set and needs to extract themes, claims, contradictions, methods, limitations, and open gaps.
+---
+
+# Literature Synthesis
+
+## When to Use This Skill
+
+Use this skill when the node needs to:
+
+- summarize a set of related papers
+- compare methods and assumptions
+- identify agreement and disagreement across studies
+- map limitations and open questions
+- produce a structured literature review instead of isolated summaries
+
+## Workflow
+
+1. Group papers by method, dataset, or research question.
+2. Extract the claim, method, evidence, and limitation for each paper.
+3. Merge repeated findings into themes.
+4. Highlight contradictions explicitly and explain what differs.
+5. End with a gap map: what is still untested, weakly supported, or missing.
+
+## Expected Outputs
+
+- a theme table
+- a method comparison block
+- a contradiction list
+- a research gap summary
+
+## Quality Rules
+
+- avoid turning one paper into a field-wide claim
+- always separate result from interpretation
+- keep limitations close to the claim they weaken
+- if evidence is thin, say so directly
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+---
+name: research-paper-search
+description: Use when an operator needs to find relevant papers, databases, identifiers, benchmark references, or primary-source evidence before planning or implementation.
+---
+
+# Research Paper Search
+
+## When to Use This Skill
+
+Use this skill when the node needs to:
+
+- find primary papers for a topic
+- identify canonical baselines or benchmark papers
+- collect DOI, arXiv, PMID, or project URLs
+- narrow a broad problem into a tractable evidence set
+- distinguish primary sources from commentary or summaries
+
+## Workflow
+
+1. Start from the task objective and convert it into 2-4 search queries.
+2. Prefer primary sources: papers, official datasets, benchmark repos, and technical documentation.
+3. Capture identifiers early: title, year, venue, DOI, arXiv id, project URL.
+4. Separate "must-read" papers from "background only" papers.
+5. Return a short evidence map instead of a loose list.
+
+## Expected Outputs
+
+- a ranked paper list
+- a baseline list
+- a source table with identifiers
+- unresolved search gaps that require another pass
+
+## Quality Rules
+
+- prefer primary sources over blog summaries
+- note publication year and venue whenever available
+- call out uncertainty when a result is only weakly relevant
+- do not claim a benchmark or baseline is standard without evidence

skills/result-audit/SKILL.md

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+---
+name: result-audit
+description: Use when an operator must review outputs for unsupported conclusions, weak evidence, missing ablations, broken assumptions, or incomplete reporting.
+---
+
+# Result Audit
+
+## When to Use This Skill
+
+Use this skill when the node needs to:
+
+- review a report, notebook, or artifact bundle
+- detect unsupported conclusions
+- check whether the evidence matches the claimed contribution
+- flag missing baselines, controls, or ablations
+- judge whether the final output is ready for handoff or publication
+
+## Workflow
+
+1. Read the stated claim before the evidence.
+2. Map each important claim to the supporting artifact or result.
+3. Check for missing comparisons, controls, or caveats.
+4. Identify where the output overstates certainty.
+5. Return findings as prioritized review items.
+
+## Expected Outputs
+
+- a severity-ranked review list
+- unsupported claim flags
+- missing evidence or baseline flags
+- final release recommendation
+
+## Quality Rules
+
+- focus on evidence, not writing style
+- do not accept "looks plausible" as support
+- call out missing baselines explicitly
+- separate hard failures from optional improvements

src/agentworld/graph/builder.py

Lines changed: 5 additions & 1 deletion
@@ -55,10 +55,13 @@ def add_node(
         objective: str | None = None,
         role: str | None = None,
         input_selector: StateSelector | None = None,
+        skills: Sequence[str] | None = None,
         metadata: dict[str, Any] | None = None,
     ) -> "AgentGraph":
         if name in self.nodes:
             raise ValueError(f"Node already exists: {name}")
+        resolved_metadata = dict(metadata or {})
+        resolved_skills = [str(skill) for skill in (skills or resolved_metadata.pop("skills", []))]
         self.nodes[name] = GraphNode(
             name=name,
             kind=kind,
@@ -67,7 +70,8 @@ def add_node(
             objective=objective,
             role=role,
             input_selector=input_selector,
-            metadata=dict(metadata or {}),
+            skills=resolved_skills,
+            metadata=resolved_metadata,
         )
         return self
 
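
Review note: the two added lines in `add_node` resolve skills from either the `skills` argument or a `"skills"` key in `metadata`. The sketch below lifts that expression into a standalone function so the edge cases can be checked. `resolve_skills` is a hypothetical name; the two-line body is copied from the diff.

```python
from typing import Any, Optional, Sequence


def resolve_skills(
    skills: Optional[Sequence[str]] = None,
    metadata: Optional[dict[str, Any]] = None,
) -> tuple[list[str], dict[str, Any]]:
    # Copy so the caller's metadata dict is never mutated.
    resolved_metadata = dict(metadata or {})
    # `or` short-circuits: metadata["skills"] is popped only when the
    # explicit argument is None or empty.
    resolved_skills = [
        str(skill) for skill in (skills or resolved_metadata.pop("skills", []))
    ]
    return resolved_skills, resolved_metadata


# Explicit argument only.
explicit = resolve_skills(["citation-audit"], {"owner": "qa"})
# Legacy style: skills passed through metadata are moved out of it.
legacy = resolve_skills(None, {"skills": ["result-audit"], "owner": "qa"})
# Both given: the argument wins, and the metadata key is left in place.
both = resolve_skills(["citation-audit"], {"skills": ["result-audit"]})
# An explicit empty list is falsy, so metadata still supplies the skills.
empty = resolve_skills([], {"skills": ["result-audit"]})

print(explicit, legacy, both, empty, sep="\n")
```

Two behaviours may deserve a follow-up. When both sources are set, `metadata` keeps its `"skills"` key, so the node carries two different skill lists. And `skills=[]` cannot be used to override skills supplied through metadata. Popping the key unconditionally and comparing against `None` instead of truthiness would address both, if that is the intended contract.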
