Closed
Adds a skill that gives the agent a real Chrome browser through a Docker sidecar running Chromium with CDP. The browser keeps cookies and logins across conversations. Includes an MCP server (browser-mcp.ts) that wraps chrome-agent as 17 browser tools: navigate, snapshot, click, type, select, check, screenshot, back/forward/reload, wait, tabs, and cookies. The sidecar also exposes noVNC on port 6080 for manual login when needed (CAPTCHAs, OAuth flows). Tested against a live sidecar with 16 passing integration tests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Collaborator
@gavrielc — Skill-only PR adding Chrome browser via Docker sidecar with CDP tools. Already labeled. Worth checking how it relates to the existing |
This was referenced Mar 4, 2026
What this does
Adds a skill that gives the agent a persistent Chrome browser through a Docker sidecar. The agent can navigate pages, read them via accessibility snapshots, click elements, fill forms, take screenshots, and manage tabs and cookies. Logins survive across conversations because the sidecar uses a Docker volume for Chrome's user-data-dir.
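To illustrate why logins can survive container restarts, here is a minimal sketch of how the sidecar's `docker run` arguments might be assembled. The image name, container name, volume name, and mount path are all hypothetical; the PR does not show the actual command. The sketch also assumes the image starts Chromium with its user-data-dir pointed at the mounted path.

```typescript
// Sketch only: every concrete name below is an assumption, not taken from the PR.
interface SidecarOptions {
  image: string;          // hypothetical sidecar image
  containerName: string;  // hypothetical container name
  profileVolume: string;  // named Docker volume that outlives the container
  userDataDir: string;    // path inside the container used as Chrome's user-data-dir
  novncPort: number;      // 6080 according to the PR description
}

// Build the argv for `docker run`, mounting the named volume at the profile path
// and publishing the noVNC port to the host.
function dockerRunArgs(o: SidecarOptions): string[] {
  return [
    "run",
    "-d",
    "--name", o.containerName,
    "-v", `${o.profileVolume}:${o.userDataDir}`,
    "-p", `${o.novncPort}:${o.novncPort}`,
    o.image,
  ];
}

const sidecarArgs = dockerRunArgs({
  image: "example/chrome-sidecar",
  containerName: "chrome-sidecar",
  profileVolume: "chrome-profile",
  userDataDir: "/data/chrome-profile",
  novncPort: 6080,
});

console.log(["docker", ...sidecarArgs].join(" "));
```

Because the profile lives in a named volume and not in the container's writable layer, removing and recreating the container would leave cookies and sessions in place.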
The skill creates an MCP server (browser-mcp.ts) inside the agent container that talks to the sidecar over CDP. It exposes 17 tools: navigate, snapshot, click, type, select, check, screenshot, back, forward, reload, wait, tabs (list/new/switch/close), and cookies (export/import).
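The count of 17 follows from 11 single-action tools plus four tab actions and two cookie actions. A small sketch of that breakdown is below; the `group_action` naming scheme is an assumption for illustration, since the PR lists the actions but not the exact tool identifiers used in browser-mcp.ts.

```typescript
// Actions listed in the PR description; the identifier format is hypothetical.
const singleTools: readonly string[] = [
  "navigate", "snapshot", "click", "type", "select", "check",
  "screenshot", "back", "forward", "reload", "wait",
];

const groupedTools: Record<string, readonly string[]> = {
  tabs: ["list", "new", "switch", "close"],
  cookies: ["export", "import"],
};

// Flatten the groups into one name per tool, e.g. "tabs_list".
function allToolNames(): string[] {
  const names: string[] = [...singleTools];
  for (const group of Object.keys(groupedTools)) {
    for (const action of groupedTools[group]) {
      names.push(`${group}_${action}`);
    }
  }
  return names;
}

console.log(allToolNames().length); // 17
```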
noVNC on port 6080 lets you see the browser or log in manually when the agent hits a CAPTCHA or OAuth wall.
What's in the PR
One file:
.claude/skills/add-chrome-browser/SKILL.md
No source changes. The skill contains the instructions Claude Code follows to add the feature, per CONTRIBUTING.md.
The library
chrome-agent is a standalone CDP library that produces text-based accessibility snapshots with clickable refs (@e1, @e2). It was tested against a live sidecar with 16 passing integration tests (8 for the library, 8 for the MCP server).
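As a rough illustration of the snapshot idea, the sketch below walks a simplified accessibility tree, prints one line per node, and tags interactive nodes with refs such as @e1 that map back to a node id. The node shape, role list, and function name are assumptions; chrome-agent's real types and output format are not shown in the PR.

```typescript
// Hypothetical node shape, loosely modelled on an accessibility tree.
interface AXNode {
  role: string;
  name: string;
  backendNodeId: number;
  children?: AXNode[];
}

interface Snapshot {
  text: string;               // indented, human-readable tree
  refs: Map<string, number>;  // "@e1" -> backendNodeId, used to resolve clicks
}

// Assumed set of roles that receive a clickable ref.
const INTERACTIVE_ROLES = new Set(["button", "link", "textbox", "checkbox", "combobox"]);

function renderSnapshot(root: AXNode): Snapshot {
  const lines: string[] = [];
  const refs = new Map<string, number>();

  const visit = (node: AXNode, depth: number): void => {
    let line = `${"  ".repeat(depth)}${node.role} "${node.name}"`;
    if (INTERACTIVE_ROLES.has(node.role)) {
      // Refs are numbered in document order, starting at @e1.
      const ref = `@e${refs.size + 1}`;
      refs.set(ref, node.backendNodeId);
      line += ` ${ref}`;
    }
    lines.push(line);
    for (const child of node.children ?? []) visit(child, depth + 1);
  };

  visit(root, 0);
  return { text: lines.join("\n"), refs };
}

const examplePage: AXNode = {
  role: "main",
  name: "Sign in",
  backendNodeId: 1,
  children: [
    { role: "textbox", name: "Email", backendNodeId: 7 },
    { role: "button", name: "Continue", backendNodeId: 9 },
  ],
};

const exampleSnapshot = renderSnapshot(examplePage);
console.log(exampleSnapshot.text);
```

A click tool could then accept a ref like @e2, look it up in `refs`, and act on the corresponding node, which keeps the agent's input short and stable within a single snapshot.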
Test plan
tsc --noEmit passes with zero errors