skill: add-chrome-browser #276

Closed
kxbnb wants to merge 1 commit into qwibitai:main from kxbnb:add-chrome-browser

@kxbnb commented Feb 16, 2026

What this does

Adds a skill that gives the agent a persistent Chrome browser through a Docker sidecar. The agent can navigate pages, read them via accessibility snapshots, click elements, fill forms, take screenshots, and manage tabs and cookies. Logins survive across conversations because the sidecar uses a Docker volume for Chrome's user-data-dir.

The skill creates an MCP server (browser-mcp.ts) inside the agent container that talks to the sidecar over CDP. It exposes 17 tools: navigate, snapshot, click, type, select, check, screenshot, back, forward, reload, wait, tabs (list/new/switch/close), and cookies (export/import).
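The tool count can be checked with a minimal sketch that expands the grouped entries into individual names. The `tabs_` and `cookies_` prefixes are hypothetical naming chosen for illustration; the actual tool names registered by browser-mcp.ts may differ.

```typescript
// Tools that map one-to-one to a single browser action.
const simpleTools = [
  "navigate", "snapshot", "click", "type", "select", "check",
  "screenshot", "back", "forward", "reload", "wait",
] as const;

// Grouped tools, expanded into one tool per action.
const tabActions = ["list", "new", "switch", "close"] as const;
const cookieActions = ["export", "import"] as const;

// Hypothetical naming scheme: the "tabs_" / "cookies_" prefixes are assumptions.
const toolNames: string[] = [
  ...simpleTools,
  ...tabActions.map((action) => `tabs_${action}`),
  ...cookieActions.map((action) => `cookies_${action}`),
];

console.log(toolNames.length); // prints 17
```

Eleven single-action tools plus four tab tools and two cookie tools give the 17 tools listed above.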

noVNC on port 6080 lets you see the browser or log in manually when the agent hits a CAPTCHA or OAuth wall.

What's in the PR

One file: .claude/skills/add-chrome-browser/SKILL.md

No source changes. The skill contains the instructions Claude Code follows to add the feature, per CONTRIBUTING.md.

The library

chrome-agent is a standalone CDP library that produces text-based accessibility snapshots with clickable refs (@e1, @e2). The library and the MCP server were tested against a live sidecar with 16 passing integration tests (8 for the library, 8 for the MCP server).
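To illustrate the idea of a text snapshot with clickable refs, here is a rough sketch that walks a simplified accessibility tree, prints one line per node, and assigns `@eN` refs to interactive nodes. The `AxNode` shape, the `renderSnapshot` function, and the output layout are assumptions made for this example, not chrome-agent's actual API or format.

```typescript
// Hypothetical node shape; chrome-agent's real types may differ.
interface AxNode {
  role: string;
  name: string;
  interactive: boolean;
  children?: AxNode[];
}

interface Snapshot {
  text: string;              // indented text the agent reads
  refs: Map<string, AxNode>; // ref (e.g. "@e1") -> node to act on
}

function renderSnapshot(root: AxNode): Snapshot {
  const lines: string[] = [];
  const refs = new Map<string, AxNode>();
  let counter = 0;

  const visit = (node: AxNode, depth: number): void => {
    let line = `${"  ".repeat(depth)}${node.role} "${node.name}"`;
    if (node.interactive) {
      // Only interactive nodes get a ref, numbered in document order.
      counter += 1;
      const ref = `@e${counter}`;
      refs.set(ref, node);
      line += ` ${ref}`;
    }
    lines.push(line);
    for (const child of node.children ?? []) visit(child, depth + 1);
  };

  visit(root, 0);
  return { text: lines.join("\n"), refs };
}

const page: AxNode = {
  role: "main",
  name: "Sign in",
  interactive: false,
  children: [
    { role: "textbox", name: "Email", interactive: true },
    { role: "button", name: "Continue", interactive: true },
  ],
};

const snapshot = renderSnapshot(page);
console.log(snapshot.text);
// main "Sign in"
//   textbox "Email" @e1
//   button "Continue" @e2
```

A click tool could then resolve a ref such as `@e2` through `snapshot.refs` instead of requiring a CSS selector, which keeps the agent's input compact and text-only.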

Test plan

  • chrome-agent: 8/8 tests pass against live Docker sidecar (connect, navigate, snapshot, screenshot, click, cookies, history, tabs)
  • browser-mcp: 8/8 tests pass via MCP protocol (list tools, navigate, snapshot, screenshot, tabs, click, cookies, error handling)
  • Skill applied to fresh nanoclaw clone: tsc --noEmit passes with zero errors
  • Run skill on a fresh fork and verify the agent can browse

Adds a skill that gives the agent a real Chrome browser through a Docker
sidecar running Chromium with CDP. The browser keeps cookies and logins
across conversations. Includes an MCP server (browser-mcp.ts) that wraps
chrome-agent as 17 browser tools: navigate, snapshot, click, type, select,
check, screenshot, back/forward/reload, wait, tabs, and cookies.

The sidecar also exposes noVNC on port 6080 for manual login when needed
(CAPTCHAs, OAuth flows).

Tested against a live sidecar with 16 passing integration tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@TomGranot added the "Type: Skill" label (Skill-only PR, no source code changes) on Feb 19, 2026
@TomGranot (Collaborator):

@gavrielc: skill-only PR adding a Chrome browser via a Docker sidecar with CDP tools. Already labeled. Worth checking how it relates to the existing agent-browser.md skill.
