fix(evals): correct misplaced params in answer-vs-act and tool_output_masking by ishaan-arora-1 · Pull Request #23788 · google-gemini/gemini-cli

ishaan-arora-1 · 2026-03-25T13:56:32Z

Found two eval files where config properties were at the wrong nesting level, causing them to be silently ignored.

answer-vs-act.eval.ts — timeout was inside params instead of being a top-level EvalCase property. evalTest passes evalCase.timeout to rig.run(), not anything from params, so the 20s timeout was never applied (defaulted to 300s).

tool_output_masking.eval.ts — security was directly in params instead of under settings. TestRig.setup() passes options.settings to _createSettingsFile, so the folder trust config was never written. (Compare with hierarchical_memory.eval.ts which does this correctly.)

Both slipped through because EvalCase.params had [key: string]: unknown, accepting any key without complaint. Replaced the index signature with the specific keys TestRig.setup() actually reads: settings, state, fakeResponsesPath.

google-cla · 2026-03-25T13:56:43Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

gemini-code-assist · 2026-03-25T13:56:51Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses critical configuration issues in two evaluation files, answer-vs-act.eval.ts and tool_output_masking.eval.ts, where key properties were misplaced, leading to their silent omission during test execution. By correcting the nesting of timeout and security parameters and refining the EvalCase.params type definition, the changes ensure that evaluation configurations are correctly applied and prevent future misconfigurations from going unnoticed, improving the reliability of the evaluation suite.

Highlights

Corrected timeout placement in answer-vs-act.eval.ts: The timeout property was moved from being incorrectly nested within params to its proper top-level EvalCase position, ensuring the 20-second timeout is now applied instead of defaulting to 300 seconds.
Adjusted security nesting in tool_output_masking.eval.ts: The security configuration, specifically folderTrust, was moved to be correctly nested under params.settings, resolving an issue where folder trust settings were not being applied during test setup.
Refined EvalCase.params type definition: The generic [key: string]: unknown index signature was replaced with specific keys (settings, state, fakeResponsesPath) in evals/test-helper.ts to prevent similar misconfigurations from being silently ignored in the future by enforcing stricter type checking.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request refactors the structure of evaluation parameters within the evals directory. Key changes include moving the timeout parameter to the top-level of EvalCase, adding state and fakeResponsesPath to the params object, and nesting security settings under a new settings property within params. A high-severity concern was raised regarding the settings type within EvalCase's params, which is currently too loose (Record<string, unknown>). It is recommended to tighten this type, potentially by importing Settings from packages/cli/src/config/settingsSchema.ts, to enhance type safety and prevent future configuration-related bugs.

gemini-code-assist · 2026-03-25T13:59:44Z

evals/test-helper.ts

@@ -266,7 +266,8 @@ export interface EvalCase {
  name: string;
  params?: {
    settings?: ForbiddenToolSettings & Record<string, unknown>;


The type for settings is ForbiddenToolSettings & Record<string, unknown>, which is very loose. This is the same kind of issue that allowed the bugs this PR is fixing. The Record<string, unknown> allows any property, which can lead to typos or misplaced configuration going unnoticed by the type checker.

To improve type safety and prevent future issues, consider tightening this type. Ideally, it should be based on the canonical settings schema. If it's possible to import the Settings type from packages/cli/src/config/settingsSchema.ts, changing the type to ForbiddenToolSettings & Settings would provide much stronger validation at compile time. This would ensure that the settings objects used in evaluations are structurally correct.

References

The comment suggests tightening a type to ForbiddenToolSettings & Settings based on settingsSchema.ts, which directly ensures the TypeScript type aligns with the canonical settings schema for stronger validation.

The comment advocates for using the canonical settings schema as the source of truth for defining types, which prevents issues like loose Record<string, unknown> types and reinforces the schema's role as the single source of truth for configuration structure.

ishaan-arora-1 · 2026-03-26T00:13:46Z

The Settings type lives in packages/cli/src/config/settingsSchema.ts — a package the evals don't depend on (evals/test-helper.ts only imports from @google/gemini-cli-core and @google/gemini-cli-test-utils). Importing it directly would create a cross-package dependency that doesn't exist today.

Additionally, TestRig.setup() in @google/gemini-cli-test-utils itself defines settings as Record<string, unknown>, so our type here is already as strict as the downstream consumer.

The right path is a follow-up that re-exports Settings from @google/gemini-cli-test-utils (which already depends on core), then tightens both TestRig.setup() and EvalCase.params.settings together. Happy to file that issue and pick it up.

…_masking `answer-vs-act.eval.ts` had `timeout` inside `params` where it's ignored — moved it to the top-level `EvalCase` property. `tool_output_masking.eval.ts` had `security` directly in `params` instead of nested under `settings`, so folder trust config was never written to the settings file. Also tightened the `EvalCase.params` type to only accept keys that `TestRig.setup()` actually reads (`settings`, `state`, `fakeResponsesPath`), replacing the permissive index signature that allowed these bugs to compile silently. Fixes google-gemini#23787

gemini-cli · 2026-04-09T03:05:03Z

Hi there! Thank you for your interest in contributing to Gemini CLI.

To ensure we maintain high code quality and focus on our prioritized roadmap, we have updated our contribution policy (see Discussion #17383).

We only guarantee review and consideration of pull requests for issues that are explicitly labeled as 'help wanted'. All other community pull requests are subject to closure after 14 days if they do not align with our current focus areas. For this reason, we strongly recommend that contributors only submit pull requests against issues explicitly labeled as 'help-wanted'.

This pull request is being closed as it has been open for 14 days without a 'help wanted' designation. We encourage you to find and contribute to existing 'help wanted' issues in our backlog! Thank you for your understanding and for being part of our community!

ishaan-arora-1 requested a review from a team as a code owner March 25, 2026 13:56

gemini-code-assist bot reviewed Mar 25, 2026

View reviewed changes

github-actions bot mentioned this pull request Mar 26, 2026

📊 AI CLI 工具社区动态日报 2026-03-26 gsscsd/big_model_radar#96

Open

ishaan-arora-1 force-pushed the fix/eval-misplaced-params branch from 556f8e9 to 359bc03 Compare March 29, 2026 11:15

gemini-cli bot closed this Apr 9, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(evals): correct misplaced params in answer-vs-act and tool_output_masking#23788

fix(evals): correct misplaced params in answer-vs-act and tool_output_masking#23788
ishaan-arora-1 wants to merge 1 commit intogoogle-gemini:mainfrom
ishaan-arora-1:fix/eval-misplaced-params

ishaan-arora-1 commented Mar 25, 2026

Uh oh!

google-cla bot commented Mar 25, 2026

Uh oh!

gemini-code-assist bot commented Mar 25, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Mar 25, 2026

Uh oh!

ishaan-arora-1 commented Mar 26, 2026 •

edited

Loading

Uh oh!

gemini-cli bot commented Apr 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ishaan-arora-1 commented Mar 25, 2026

Uh oh!

google-cla bot commented Mar 25, 2026

Uh oh!

gemini-code-assist bot commented Mar 25, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

ishaan-arora-1 commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gemini-cli bot commented Apr 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ishaan-arora-1 commented Mar 26, 2026 •

edited

Loading