Summary of Changes
Hello @joshualitt, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a new evaluation suite for the generalist agent, ensuring it correctly delegates complex tasks while handling simpler ones directly. It also significantly enhances the ...
Size Change: +1.35 kB (+0.01%) Total Size: 25.7 MB
Code Review
The pull request successfully enables the generalist agent and updates the system prompts to encourage strategic orchestration and delegation. It also introduces significant improvements to the AppRig test utility for better state tracking and deterministic event waiting. However, there are several debug console.log statements and commented-out code blocks in the core packages that should be removed before merging. Additionally, a small cleanup is needed in the test rig to prevent a potential memory leak in the test environment.
3ddd835 to 6ddf758
6ddf758 to 05050f2
05050f2 to b4244b1
Sub-agents are specialized expert agents. Each sub-agent is available as a tool of the same name. You MUST delegate tasks to the sub-agent with the most relevant expertise.

### Strategic Orchestration & Delegation
Operate as a **strategic orchestrator**. Your own context window is your most precious resource. Every turn you take adds to the permanent session history. To keep the session fast and efficient, use sub-agents to "compress" complex or repetitive work.
strategic orchestrator
One concern I'd have is that we're giving the agent two slightly unrelated roles. It's both a specialized SDE agent and an orchestrator. This sort of competing priority can lead to inconsistent behavior.
This is definitely something to watch closely. We can easily revert this though if it gives us any grief.
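For anyone skimming this thread, the "compress" idea in the prompt text quoted above can be sketched roughly as follows. This is only an illustration: the `HistoryEntry` type, the `runInline` and `runDelegated` helpers, and the `generalist` tool name are hypothetical and are not taken from this PR or the codebase. The point is that the main history grows with every inline step, but by a fixed amount when the same steps are delegated.

```typescript
// Hypothetical types and helpers for illustration only; not from the PR.
type HistoryEntry = { role: "user" | "model" | "tool"; text: string };

// Run every step in the main loop: each step adds a call and a result.
function runInline(steps: string[]): HistoryEntry[] {
  const history: HistoryEntry[] = [];
  for (const step of steps) {
    history.push({ role: "model", text: `call: ${step}` });
    history.push({ role: "tool", text: `result of ${step}` });
  }
  return history;
}

// Delegate the same steps: the sub-agent keeps its own private history,
// and the main loop records only one call and one summary.
function runDelegated(steps: string[]): HistoryEntry[] {
  const subAgentHistory = runInline(steps); // discarded after summarizing
  const summary =
    `sub-agent completed ${steps.length} steps in ${subAgentHistory.length} turns`;
  return [
    { role: "model", text: "call: generalist" }, // assumed tool name
    { role: "tool", text: summary },
  ];
}

const steps = ["fix a.ts", "fix b.ts", "fix c.ts", "fix d.ts"];
const inlineLength = runInline(steps).length;
const delegatedLength = runDelegated(steps).length;
```

Under these assumptions the inline history grows by two entries per step, while the delegated history stays at two entries regardless of how many steps the sub-agent takes.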
When you delegate, the sub-agent's entire execution is consolidated into a single summary in your history, keeping your main loop lean.

**High-Impact Delegation Candidates:**
- **Repetitive Batch Tasks:** Tasks involving more than 3 files or repeated steps (e.g., "Add license headers to all files in src/", "Fix all lint errors in the project").
Is it worth clarifying that we might want to break large bodies of work into several batches, or is that a refinement for later?
I think the benefits of decomposition really come out when subagents can run in parallel. Perhaps it makes sense to tackle that aspect holistically and on its own vs. trying to do it alongside just enabling the generalist agent? But yes, we should circle back after enabling this.
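To make the batching question concrete, here is a minimal sketch of what such a refinement might look like. Everything in it is an assumption for discussion: `DELEGATION_FILE_THRESHOLD`, `shouldDelegate`, and `toBatches` are hypothetical names, and the threshold of 3 simply mirrors the "more than 3 files" wording in the prompt text. It does not reflect how the agent actually decides to delegate.

```typescript
// Assumption: mirrors the "more than 3 files" wording in the prompt text.
const DELEGATION_FILE_THRESHOLD = 3;

// Hypothetical predicate: delegate only when the task touches many files.
function shouldDelegate(files: string[]): boolean {
  return files.length > DELEGATION_FILE_THRESHOLD;
}

// Split a list into consecutive batches of at most `size` items.
function toBatches<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const files = ["a.ts", "b.ts", "c.ts", "d.ts", "e.ts", "f.ts", "g.ts"];
// Seven files exceed the threshold, so they are split into batches of three.
const batches = shouldDelegate(files) ? toBatches(files, 3) : [files];
```

Each batch could then be handed to its own sub-agent call. Whether those calls run sequentially or in parallel is exactly the open question in this thread, so the sketch leaves it out.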
description:
"A general-purpose AI agent with access to all tools. Use it for complex tasks that don't fit into other specialized agents.",
experimental: true,
'A general-purpose AI agent with access to all tools. Highly recommended for tasks that are turn-intensive or involve processing large amounts of data. Use this to keep the main session history lean and efficient. Excellent for: batch refactoring/error fixing across multiple files, running commands with high-volume output, and speculative investigations.',
non-blocking: One thing we should consider is the impact on the gradual accumulation of context. For example, I frequently use Gemini CLI to do data analysis by having it explore the logs and gradually build up an understanding before finally answering the question. I wonder if we want to somehow preserve that use case.
Agreed, maybe this falls out of the memory work? Let's think this through carefully.
b4244b1 to 7d69a7c
4a78a96 to 0bf3a01
7d69a7c to 987b9d7
Fixes #16858