(feat): Add a ToDo tool to track ongoing task lists #8761
This is a tool for the model to track what it needs to do. The model provides the todos and uses this tool to communicate the updated status of that list. The list doesn't need to be persisted anywhere, since it is already in the history. Yes, this will be part of the core tools.
scidomino
left a comment
It's weird that a noop tool would improve performance but I assume you have run evals and shown that this improves things.
I think it's primarily a way to: I wonder if just adding instructions for a) and b) to the system prompt could yield a similar performance impact. @anj-s wdyt?
It's not a noop tool, as explained above. It helps the model create a list of items and track it. Yes, this improves evals, and it is a known method for doing so.
We have this in the system prompt, but it's not something the model does consistently, and it does not involve the model updating the plan list at every turn. Ideally, we want the todo list to be the only plan list that the model is tracking.
Co-authored-by: gemini-cli-robot <gemini-cli-robot@google.com>
Co-authored-by: joshualitt <joshualitt@google.com> Co-authored-by: Tommaso Sciortino <sciortino@gmail.com> Co-authored-by: matt korwel <matt.korwel@gmail.com> Co-authored-by: gemini-cli-robot <gemini-cli-robot@google.com> Co-authored-by: Jacob MacDonald <jakemac@google.com> Co-authored-by: Shreya Keshive <skeshive@gmail.com>
TLDR

This PR introduces a new write_todos_list tool that allows the agent to create and manage a checklist of tasks for complex user requests. This helps the agent track its progress, organize its work, and provides the user with visibility into the agent's plan.

Dive Deeper

The write_todos_list tool is a declarative tool that enables the agent to manage a list of tasks with the following statuses: pending, in_progress, completed, and cancelled. The agent is guided by an updated system prompt on when and how to use this tool, with a focus on using it for complex, multi-step tasks and avoiding it for simple requests.

The tool is enabled by a useWriteTodos flag in the configuration. The implementation includes the tool itself, along with comprehensive unit tests to ensure its functionality and validation logic are working correctly.

Reviewer Test Plan

To test this feature, you can enable the useWriteTodos flag in your settings and give the agent a complex task. Here are a few examples:

Create a new feature: add a new feature to the CLI that allows users to configure the output format of the response. (add a new configuration option, implement the logic to format the output, add tests for the new feature, etc.)

Build a simple application: create a simple web app that uses the Gemini API to answer questions.

Debug an issue: The application is crashing when I try to upload a file. Can you help me debug and fix the issue? (reproduce the crash, examine the logs, identify the root cause, implement a fix, and verify the fix.)

Fixes #4580
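
For reviewers who want a concrete picture of what "declarative" means for write_todos_list, here is a minimal sketch. It is not the code in this PR. The four statuses come from the description above; the Todo shape, the validateTodos and writeTodos helpers, and the output format are assumptions made for illustration. Each call carries the complete list, and the returned text is what would land in the conversation history, so nothing is stored by the tool.

```typescript
// Hypothetical sketch: only the four status names come from the PR description.
type TodoStatus = 'pending' | 'in_progress' | 'completed' | 'cancelled';

// Assumed shape of one list entry (not taken from the PR).
interface Todo {
  description: string;
  status: TodoStatus;
}

const VALID_STATUSES: readonly TodoStatus[] = [
  'pending',
  'in_progress',
  'completed',
  'cancelled',
];

// Returns an error message for a malformed list, or null when it is valid.
function validateTodos(todos: unknown): string | null {
  if (!Array.isArray(todos)) {
    return 'todos must be an array';
  }
  for (let i = 0; i < todos.length; i++) {
    const item = todos[i];
    if (typeof item !== 'object' || item === null) {
      return 'todo ' + i + ' must be an object';
    }
    if (typeof item.description !== 'string' || item.description.trim() === '') {
      return 'todo ' + i + ' needs a non-empty description';
    }
    if (VALID_STATUSES.indexOf(item.status) === -1) {
      return 'todo ' + i + ' has an invalid status';
    }
  }
  return null;
}

// Declarative update: the caller always sends the full list, and the tool
// only echoes it back as text. No state is kept between calls.
function writeTodos(todos: unknown): string {
  const error = validateTodos(todos);
  if (error !== null) {
    return 'Error: ' + error;
  }
  const list = todos as Todo[];
  if (list.length === 0) {
    return 'Todo list cleared.';
  }
  return list
    .map((t, i) => i + 1 + '. [' + t.status + '] ' + t.description)
    .join('\n');
}
```

With the debugging example above, a call such as writeTodos([{ description: 'reproduce the crash', status: 'completed' }, { description: 'examine the logs', status: 'in_progress' }]) would return a two-line checklist, and the next call would replace it wholesale with the updated statuses.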
Testing Matrix
Linked issues / bugs