feat: add pulsating blue border automation overlay to browser agent #21173

Merged
gsquared94 merged 12 commits into google-gemini:main from kunal-10-cloud:feat/browser-automation-overlay on Mar 10, 2026

Conversation

@kunal-10-cloud
Contributor

@kunal-10-cloud kunal-10-cloud commented Mar 4, 2026

Summary

Implement a visual automation indicator (pulsating blue border) that appears when the browser agent is actively controlling a Chrome instance. This gives clear feedback to the user that the browser is under AI control.

Details

In non-headless mode, the user needs a visual cue so they do not interfere with automated actions. This PR injects a subtle, pulsating blue border around the browser viewport whenever the agent connects or navigates to a new page. The overlay uses pointer-events: none so it never intercepts clicks, and it stays out of the accessibility tree via aria-hidden="true" and role="presentation". It is not injected when the browser runs in headless mode.

Related Issues

Closes #21097

How to Validate

  1. Verify Visibility: Start the Gemini CLI and run a task that triggers the browser agent without headless mode. A blue border should visibly pulsate around the edges of the Chrome window.
  2. Verify Navigation: Provide a specific URL navigation command via the browser agent. The border should disappear momentarily and immediately reappear once the page loads.
  3. Verify Disabling in Headless: Configure settings to start the browser in headless mode. No overlay styles should be evaluated.
  4. A screencast of the overlay is attached below:
Screen.Recording.2026-03-07.at.1.04.22.PM.mov

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted breaking changes (if any)
  • Validated on required platforms/methods:
    • macOS
      • npm run
      • npx
      • Docker
      • Podman
      • Seatbelt
    • Windows
      • npm run
      • npx
      • Docker
    • Linux
      • npm run
      • npx
      • Docker

@kunal-10-cloud kunal-10-cloud requested a review from a team as a code owner March 4, 2026 22:11
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the user experience for the browser agent by providing a clear visual indicator when the agent is in control of a Chrome instance. By injecting a pulsating blue border, users can easily discern when automated actions are occurring, preventing accidental interference and improving transparency during non-headless operations. The implementation prioritizes user experience and accessibility by ensuring the overlay is non-intrusive.

Highlights

  • Visual Automation Indicator: Introduced a pulsating blue border overlay to visually indicate when the browser agent is actively controlling a Chrome instance in non-headless mode.
  • Accessibility and Non-Interference: Ensured the overlay is non-interactive (pointer-events: none) and does not affect accessibility (aria-hidden="true", role="presentation") to prevent user interference and maintain a clean accessibility tree.
  • Dynamic Overlay Management: Implemented automatic injection of the overlay upon browser agent connection, re-injection after navigation actions, and removal upon browser agent closure.
  • Comprehensive Testing: Added extensive unit tests to verify the overlay's behavior, including its conditional display based on headless mode and proper lifecycle management.

Changelog

  • packages/core/src/agents/browser/automationOverlay.ts
    • Added a new file defining functions to inject and remove a pulsating blue border overlay.
    • Included JavaScript snippets for dynamically adding/removing the overlay and its CSS animation to the DOM.
    • Configured the overlay with pointer-events: none, aria-hidden="true", and role="presentation" for non-interference and accessibility.
  • packages/core/src/agents/browser/browserAgentFactory.test.ts
    • Imported and mocked injectAutomationOverlay for testing purposes.
    • Added tests to confirm injectAutomationOverlay is called when not in headless mode and not called when in headless mode.
  • packages/core/src/agents/browser/browserAgentFactory.ts
    • Imported injectAutomationOverlay.
    • Added logic to conditionally call injectAutomationOverlay after browser connection if the browser is not in headless mode.
    • Updated the call to createMcpDeclarativeTools to pass the config object.
  • packages/core/src/agents/browser/browserManager.test.ts
    • Imported and mocked removeAutomationOverlay for testing.
    • Added a test to verify removeAutomationOverlay is invoked during the browser manager's cleanup process.
  • packages/core/src/agents/browser/browserManager.ts
    • Imported removeAutomationOverlay.
    • Integrated a call to removeAutomationOverlay within the close() method to ensure the overlay is cleaned up when the browser manager closes.
  • packages/core/src/agents/browser/mcpToolWrapper.test.ts
    • Imported makeFakeConfig, Config, and injectAutomationOverlay.
    • Updated calls to createMcpDeclarativeTools to pass the mockConfig object.
    • Added tests to confirm injectAutomationOverlay is re-called after navigation tools execute and is skipped in headless mode.
  • packages/core/src/agents/browser/mcpToolWrapper.ts
    • Imported Config and injectAutomationOverlay.
    • Modified createMcpDeclarativeTools to accept a config argument.
    • Implemented wrapping for navigation tools (navigate_page, new_page) to re-inject the automation overlay after successful page changes, if not in headless mode.
  • packages/core/src/agents/browser/mcpToolWrapperConfirmation.test.ts
    • Imported makeFakeConfig and Config.
    • Updated calls to createMcpDeclarativeTools to pass the mockConfig object.
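
The headless gating described in the changelog can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `OverlayConfig` and `injectOverlayIfVisible` are hypothetical names, and the real Config type in gemini-cli is assumed to be considerably richer.

```typescript
// Hypothetical, simplified config shape (assumption for illustration only).
interface OverlayConfig {
  headless: boolean;
}

/**
 * Runs the injector only when the browser window is visible.
 * Returns true if the injector was called, false if it was skipped.
 */
function injectOverlayIfVisible(
  config: OverlayConfig,
  inject: () => void,
): boolean {
  if (config.headless) {
    // Nobody can see a headless window, so skip the DOM work entirely.
    return false;
  }
  inject();
  return true;
}
```

Keeping the check in one small function like this would let the factory and any later re-injection path share the same rule.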

@gemini-cli gemini-cli bot added the area/agent label (Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality) on Mar 4, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a visual overlay to indicate when the browser is under automation. The implementation is well-structured, adding the overlay on agent start and after navigations, and removing it on cleanup. The changes are supported by new tests. My review includes two high-severity suggestions for improvement. The first addresses a potential robustness issue in the overlay injection script to ensure it works reliably across different web page structures. The second suggests refactoring the tool-wrapping logic to improve maintainability and avoid monkey-patching, which can be brittle.

kunal-10-cloud and others added 2 commits March 5, 2026 03:48
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@kunal-10-cloud
Contributor Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a visual overlay to indicate when the browser is under automation control. The implementation is clean, with the overlay logic encapsulated in a new automationOverlay.ts file. The overlay is correctly injected upon agent startup and after navigation events, and it is properly removed when the browser session is closed. The feature is also correctly disabled in headless mode. The changes are well-tested and follow good practices. I have no specific concerns with this implementation.

@kunal-10-cloud
Contributor Author

Hi @gsquared94, please review the PR. If any changes are required, let me know and I will look into them.

… evaluate_script args

- Use 'function' parameter instead of 'code' for evaluate_script MCP tool
- Replace IIFE syntax with plain arrow function (required by chrome-devtools-mcp)
- Replace <style> tag injection with Web Animations API to bypass CSP
- Use string array join instead of nested template literals for reliability
- Remove duplicate appendChild call that detached the animation
- Fix removal script to use arrow function syntax instead of broken IIFE
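
As a rough sketch of the first two bullets above: the script is passed under a `function` key rather than `code`, and it is a plain function expression that the tool invokes itself. `ToolCall` and `buildEvaluateScriptCall` are hypothetical names introduced for illustration, and the IIFE check is only a crude heuristic, not something taken from chrome-devtools-mcp.

```typescript
// Hypothetical request shape (assumption for illustration only).
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

/** Wraps a function-expression string in an evaluate_script tool call. */
function buildEvaluateScriptCall(fnSource: string): ToolCall {
  const trimmed = fnSource.trim();
  // Crude heuristic: source ending in ")()" looks like an IIFE. The tool is
  // expected to invoke the function itself, so reject self-invoking source.
  if (/\)\s*\(\s*\)\s*;?$/.test(trimmed)) {
    throw new Error('Pass a plain function expression, not an IIFE');
  }
  // The script goes under `function`, not `code`, per the commit message.
  return { name: 'evaluate_script', arguments: { function: trimmed } };
}
```

For example, `buildEvaluateScriptCall("() => { return 'ok'; }")` yields a call whose arguments contain only the `function` key.
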
@kunal-10-cloud
Contributor Author

Hi @gsquared94, I have attached a screencast to the PR description as a proof of concept. Please take a look and let me know if any changes are required.

@gsquared94
Contributor

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a pulsating blue border overlay to visually indicate when the browser agent is actively controlling a Chrome instance in non-headless mode. The implementation is robust, considering Content Security Policies (CSP) and accessibility concerns by using the Web Animations API, pointer-events: none, aria-hidden="true", and role="presentation". The feature is well-integrated with the existing browser agent lifecycle, ensuring the overlay is injected upon connection and removed during cleanup. Comprehensive tests have been added to cover the new functionality and its conditional behavior based on the headless configuration.

@kunal-10-cloud
Contributor Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a pulsating blue border as a visual indicator for when the browser agent is active. The implementation involves creating a new automationOverlay.ts module to inject and remove the overlay, and integrating it into the browser agent's lifecycle. The overlay is correctly injected on agent startup and removed on cleanup. The logic is also extended to re-inject the overlay after explicit navigation tool calls like navigate_page.

The code is well-structured and includes corresponding tests for the new functionality. However, I've identified a significant limitation in the current approach for re-injecting the overlay after navigation, which I've detailed in a specific comment. The current implementation only handles explicit navigation commands and misses navigations triggered by other actions, such as clicking a link.

Move overlay re-injection from NavigationalMcpDeclarativeTool wrapper
in mcpToolWrapper.ts into BrowserManager.callTool(). This fixes implicit
navigations (e.g. clicking an <a href> link) which previously caused the
overlay to disappear without being re-injected.

The previous approach only wrapped navigate_page and new_page. Clicking
a link also causes a full-page navigation, wiping the overlay, but was
not handled.

chrome-devtools-mcp is a pure request/response server and emits no MCP
notifications, so listening for page-load events via the protocol is not
possible. Intercepting at callTool() after any potentially-navigating
tool is the correct architectural equivalent.

Tools that trigger re-injection: click, click_at, navigate_page,
new_page, press_key (Enter on link/form), handle_dialog.

Changes:
- browserManager.ts: add POTENTIALLY_NAVIGATING_TOOLS set and
  shouldInjectOverlay field; re-inject overlay in callTool() after
  successful calls to navigating tools; remove removeAutomationOverlay
  from close() (unnecessary when browser is terminating)
- mcpToolWrapper.ts: delete NavigationalMcpToolInvocation and
  NavigationalMcpDeclarativeTool classes; remove injectAutomationOverlay
  import; remove unused Config param from createMcpDeclarativeTools
- browserAgentFactory.ts: drop config arg from createMcpDeclarativeTools call
- Tests: update browserManager.test.ts with 6 overlay re-injection
  scenarios; clean up mcpToolWrapper tests to remove now-covered cases
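
The interception described in this commit message could look roughly like the sketch below. The tool names come from the message itself; everything else (`ToolResult`, `shouldReinject`, `callToolWithOverlay`, and the choice to swallow injection errors) is an assumption made for illustration and is not confirmed by the PR.

```typescript
// Tool names listed in the commit message above.
const POTENTIALLY_NAVIGATING_TOOLS: ReadonlySet<string> = new Set([
  'click',
  'click_at',
  'navigate_page',
  'new_page',
  'press_key',
  'handle_dialog',
]);

// Hypothetical, minimal result shape; real MCP results carry more fields.
interface ToolResult {
  isError?: boolean;
}

/** Decides whether the overlay should be re-injected after a tool call. */
function shouldReinject(
  toolName: string,
  result: ToolResult,
  shouldInjectOverlay: boolean,
): boolean {
  return (
    shouldInjectOverlay &&
    !result.isError &&
    POTENTIALLY_NAVIGATING_TOOLS.has(toolName)
  );
}

// Hypothetical wrapper: `rawCall` stands in for the underlying MCP client
// call and `inject` for injectAutomationOverlay.
async function callToolWithOverlay(
  toolName: string,
  args: Record<string, unknown>,
  shouldInjectOverlay: boolean,
  rawCall: (name: string, args: Record<string, unknown>) => Promise<ToolResult>,
  inject: () => Promise<void>,
): Promise<ToolResult> {
  const result = await rawCall(toolName, args);
  if (shouldReinject(toolName, result, shouldInjectOverlay)) {
    try {
      await inject();
    } catch {
      // Assumption: a failed overlay injection should not fail the tool call.
    }
  }
  return result;
}
```

Separating the pure decision (`shouldReinject`) from the async wrapper keeps the rule easy to unit test without a browser.
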
@kunal-10-cloud
Contributor Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a helpful visual indicator for browser automation. The implementation is clean, well-tested, and considers important aspects like Content Security Policy and accessibility. I've found one potential issue where the visual indicator might not appear when switching between browser tabs, which could be confusing for the user. My feedback includes a suggestion to address this.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@kunal-10-cloud
Contributor Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a helpful visual indicator for browser automation by adding a pulsating blue border. The implementation is solid, correctly identifying points where the overlay needs to be injected or re-injected, such as on agent startup and after potential page navigations. The use of the Web Animations API to avoid CSP issues is a thoughtful touch. My feedback focuses on improving the maintainability of the script injection logic by adopting more modern JavaScript syntax.

@kunal-10-cloud
Contributor Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a helpful visual indicator for browser automation, showing a pulsating blue border when the agent is active. The implementation is well-done, using the Web Animations API for CSP compatibility and including appropriate accessibility attributes. The logic for injecting the overlay on agent start and re-injecting it after potential page navigations is solid and well-tested.

I have one suggestion to improve the robustness of the overlay by using a randomized ID, which will make it more difficult for websites to detect and potentially block the agent. This change will enhance the reliability of the browser agent across different web environments.

Note: Security Review is unavailable for this PR.

Comment on lines +23 to +81
const OVERLAY_ELEMENT_ID = '__gemini_automation_overlay';

/**
 * Builds the JavaScript function string that injects the automation overlay.
 *
 * Returns a plain arrow-function expression (no trailing invocation) because
 * chrome-devtools-mcp's evaluate_script tool invokes it internally.
 *
 * Avoids nested template literals by using string concatenation for cssText.
 */
function buildInjectionScript(): string {
  return `() => {
    const id = '${OVERLAY_ELEMENT_ID}';
    const existing = document.getElementById(id);
    if (existing) existing.remove();

    const overlay = document.createElement('div');
    overlay.id = id;
    overlay.setAttribute('aria-hidden', 'true');
    overlay.setAttribute('role', 'presentation');

    Object.assign(overlay.style, {
      position: 'fixed',
      top: '0',
      left: '0',
      right: '0',
      bottom: '0',
      zIndex: '2147483647',
      pointerEvents: 'none',
      border: '6px solid rgba(66, 133, 244, 1.0)',
    });

    document.documentElement.appendChild(overlay);

    try {
      overlay.animate([
        { borderColor: 'rgba(66,133,244,0.3)', boxShadow: 'inset 0 0 8px rgba(66,133,244,0.15)' },
        { borderColor: 'rgba(66,133,244,1.0)', boxShadow: 'inset 0 0 16px rgba(66,133,244,0.5)' },
        { borderColor: 'rgba(66,133,244,0.3)', boxShadow: 'inset 0 0 8px rgba(66,133,244,0.15)' }
      ], { duration: 2000, iterations: Infinity, easing: 'ease-in-out' });
    } catch (e) {
      // Silently ignore animation errors, as they can happen on sites with strict CSP.
      // The border itself is the most important visual indicator.
    }

    return 'overlay-injected';
  }`;
}

/**
 * Builds the JavaScript function string that removes the automation overlay.
 */
function buildRemovalScript(): string {
  return `() => {
    const el = document.getElementById('${OVERLAY_ELEMENT_ID}');
    if (el) el.remove();
    return 'overlay-removed';
  }`;
}

high

Using a fixed ID for the automation overlay makes it easily detectable by websites, which could lead to the browser agent being blocked or its behavior altered. To improve robustness and make detection harder, I suggest using a randomized ID for the overlay. This can be achieved by using a constant prefix and appending a random string.

const OVERLAY_ID_PREFIX = '__gemini_automation_overlay_';

/**
 * Builds the JavaScript function string that injects the automation overlay.
 *
 * Returns a plain arrow-function expression (no trailing invocation) because
 * chrome-devtools-mcp's evaluate_script tool invokes it internally.
 *
 * Avoids nested template literals by using string concatenation for cssText.
 */
function buildInjectionScript(): string {
  return `() => {
    const prefix = '${OVERLAY_ID_PREFIX}';
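    // Remove any existing overlays to be safe. The selector is built with string
    // concatenation because a nested backtick would end the outer template literal.
    document.querySelectorAll('[id^="' + prefix + '"]').forEach(el => el.remove());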

    const overlay = document.createElement('div');
    overlay.id = prefix + Math.random().toString(36).slice(2);
    overlay.setAttribute('aria-hidden', 'true');
    overlay.setAttribute('role', 'presentation');

    Object.assign(overlay.style, {
      position: 'fixed',
      top: '0',
      left: '0',
      right: '0',
      bottom: '0',
      zIndex: '2147483647',
      pointerEvents: 'none',
      border: '6px solid rgba(66, 133, 244, 1.0)',
    });

    document.documentElement.appendChild(overlay);

    try {
      overlay.animate([
        { borderColor: 'rgba(66,133,244,0.3)', boxShadow: 'inset 0 0 8px rgba(66,133,244,0.15)' },
        { borderColor: 'rgba(66,133,244,1.0)', boxShadow: 'inset 0 0 16px rgba(66,133,244,0.5)' },
        { borderColor: 'rgba(66,133,244,0.3)', boxShadow: 'inset 0 0 8px rgba(66,133,244,0.15)' }
      ], { duration: 2000, iterations: Infinity, easing: 'ease-in-out' });
    } catch (e) {
      // Silently ignore animation errors, as they can happen on sites with strict CSP.
      // The border itself is the most important visual indicator.
    }

    return 'overlay-injected';
  }`;
}

/**
 * Builds the JavaScript function string that removes the automation overlay.
 */
function buildRemovalScript(): string {
  return `() => {
    const prefix = '${OVERLAY_ID_PREFIX}';
    document.querySelectorAll('[id^="' + prefix + '"]').forEach(el => el.remove());
    return 'overlay-removed';
  }`;
}
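
One detail worth checking in a builder like this is that the returned template literal contains no inner backticks, since an inner backtick would end the outer template early. A minimal sketch of the removal builder that builds the prefix selector with string concatenation is shown below. It reuses the names from the suggestion and is an illustration under that assumption, not the merged code.

```typescript
const OVERLAY_ID_PREFIX = '__gemini_automation_overlay_';

/**
 * Builds the removal script as a plain arrow-function string.
 * The attribute selector is assembled with "+" inside the generated code,
 * so the outer template literal never contains a backtick.
 */
function buildRemovalScript(): string {
  return `() => {
    const prefix = '${OVERLAY_ID_PREFIX}';
    document.querySelectorAll('[id^="' + prefix + '"]').forEach((el) => el.remove());
    return 'overlay-removed';
  }`;
}
```

A quick sanity check is to confirm that the generated string contains no backtick and parses as a function expression before it is handed to evaluate_script.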

Contributor

@gsquared94 gsquared94 left a comment

lgtm!

@gsquared94 gsquared94 enabled auto-merge March 10, 2026 19:47
@gsquared94 gsquared94 added this pull request to the merge queue Mar 10, 2026
Merged via the queue into google-gemini:main with commit 5caa192 Mar 10, 2026
27 checks passed
JaisalJain pushed a commit to JaisalJain/gemini-cli that referenced this pull request Mar 11, 2026
…oogle-gemini#21173)

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Gaurav <39389231+gsquared94@users.noreply.github.com>
liamhelmer pushed a commit to badal-io/gemini-cli that referenced this pull request Mar 12, 2026
…oogle-gemini#21173)

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Gaurav <39389231+gsquared94@users.noreply.github.com>
yashodipmore pushed a commit to yashodipmore/geemi-cli that referenced this pull request Mar 21, 2026
…oogle-gemini#21173)

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Gaurav <39389231+gsquared94@users.noreply.github.com>

Labels

area/agent Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality

Development

Successfully merging this pull request may close these issues.

[Browser Agent] Visible Automation Overlay — Pulsating Blue Border

2 participants