Add behavioral evals for the memory subagent #22805
Closed
Labels
area/agent: Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality
workstream-rollup: Label used to tag epics and features that are associated with one of the three primary workstreams
🔒 maintainer only: ⛔ Do not contribute. Internal roadmap item.
The memory subagent (from #22716) needs evals to make sure it actually does the right thing and doesn't regress as we iterate. The design doc calls out several specific behaviors we need to lock in:
~/.gemini/, project knowledge goes to .gemini/. The doc specifically flags this as needing evals.

These should live in evals/ alongside existing behavioral evals. At least one eval per category.
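
As a rough illustration of the routing behavior mentioned above (global memory under ~/.gemini/, project knowledge under .gemini/), an eval could classify each file path the subagent writes to and assert on the result. This is only a sketch: `classifyMemoryWrite`, `MemoryScope`, and the file name `notes.md` are hypothetical and are not taken from the existing eval harness or the design doc.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical scope labels for this sketch; not part of the real codebase.
type MemoryScope = "global" | "project" | "outside";

// True when `target` is `dir` itself or is located somewhere beneath it.
function isInside(dir: string, target: string): boolean {
  const rel = path.relative(dir, target);
  if (rel === "") return true;
  if (rel === ".." || rel.startsWith(".." + path.sep)) return false;
  return !path.isAbsolute(rel);
}

// Classifies a path written by the memory subagent.
// Assumption: global memory lives in <home>/.gemini and project memory in
// <projectRoot>/.gemini, as described in the issue text.
function classifyMemoryWrite(
  filePath: string,
  projectRoot: string,
  homeDir: string = os.homedir(),
): MemoryScope {
  // Expand a leading "~/" because path.resolve does not do this itself.
  const expanded = filePath.startsWith("~/")
    ? path.join(homeDir, filePath.slice(2))
    : filePath;
  // Relative paths are interpreted relative to the project root.
  const target = path.resolve(projectRoot, expanded);

  if (isInside(path.resolve(projectRoot, ".gemini"), target)) return "project";
  if (isInside(path.resolve(homeDir, ".gemini"), target)) return "global";
  return "outside";
}
```

An eval for this category could then record the paths touched during a run and check, for example, that a prompt stating a personal preference yields `"global"` while a prompt describing a repository convention yields `"project"`. How the written paths are captured depends on the existing harness in evals/, which is not shown here.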