fix(container): pre-install Gmail/Notion MCP, log MCP diagnostics #1810
Closed
topcoder1 wants to merge 611 commits into
Update telegram.test.ts expectations to match HTML parse_mode change, add browser module mocks to index.test.ts and routing.test.ts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds calendar_events (with time-range index), thread_links (composite PK + item index), and idx_tracked_thread on tracked_items. Includes tests for insert/retrieve and upsert/conflict enforcement. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add calendar-poller.ts with storeCalendarEvents, getUpcomingEvents, getEventsInRange, pollCalendar, startCalendarPoller, stopCalendarPoller, and cleanupOldEvents. Uses INSERT OR REPLACE upsert semantics, parses flexible time formats (epoch ms/s, ISO string, Google Calendar dateTime objects), and emits calendar.synced events via the event bus. 9 tests covering storage, upsert, range queries, and boundary conditions. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
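The flexible time parsing mentioned in the commit above could look roughly like the sketch below. The `CalendarTime` type, the `parseEventTime` name, and the seconds-versus-milliseconds cutoff are assumptions made for illustration; the commit names the accepted formats but not how calendar-poller.ts distinguishes them.

```typescript
// Hypothetical input shape: the commit lists the formats, not the types.
type CalendarTime = number | string | { dateTime?: string };

// Returns epoch milliseconds, or null when the value cannot be parsed.
function parseEventTime(value: CalendarTime): number | null {
  if (typeof value === "number") {
    // Assumption: numbers below 1e12 are epoch seconds, the rest epoch ms.
    return value < 1e12 ? value * 1000 : value;
  }
  // ISO string, or a Google Calendar style object carrying a dateTime string.
  const text = typeof value === "string" ? value : value.dateTime;
  if (!text) return null;
  const ms = Date.parse(text);
  return Number.isNaN(ms) ? null : ms;
}
```

Normalizing everything to epoch milliseconds at the boundary keeps range queries such as getEventsInRange to plain numeric comparisons.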
Adds thread-correlator module that links TrackedItems to calendar events via attendee matching and to existing threads via subject normalization (RE:/FWD: stripping), storing links in thread_links with INSERT OR IGNORE and emitting thread.correlated events. 11 tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements scheduling-advisor with findCalendarGaps, isInMeeting, scoreUrgency, suggestDeliveryTime, and getNextMeetingIn. All 13 tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
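As an illustration of what findCalendarGaps and isInMeeting might compute, here is a minimal sketch. The `Interval` type, the parameter lists, and the half-open interval convention are assumptions; the real scheduling-advisor signatures are not shown in this PR.

```typescript
// Hypothetical shape: start/end in epoch milliseconds, end exclusive.
interface Interval {
  start: number;
  end: number;
}

// Returns the free intervals inside [windowStart, windowEnd).
function findCalendarGaps(
  events: Interval[],
  windowStart: number,
  windowEnd: number,
): Interval[] {
  const sorted = [...events].sort((a, b) => a.start - b.start);
  const gaps: Interval[] = [];
  let cursor = windowStart; // earliest instant not yet covered by an event
  for (const ev of sorted) {
    if (ev.end <= cursor) continue; // already covered
    if (ev.start >= windowEnd) break; // past the window
    if (ev.start > cursor) gaps.push({ start: cursor, end: ev.start });
    cursor = Math.max(cursor, ev.end);
  }
  if (cursor < windowEnd) gaps.push({ start: cursor, end: windowEnd });
  return gaps;
}

// True when `now` falls inside any event.
function isInMeeting(events: Interval[], now: number): boolean {
  return events.some((ev) => ev.start <= now && now < ev.end);
}
```

Sorting by start time and sweeping a single cursor handles overlapping events without merging them first.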
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Items sharing a thread_id are now grouped together in the FYI section of the digest output. Multi-item threads show a normalized title with count; single-item threads and threadless items continue using existing source-grouped format. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…duling flow Exercises the full pipeline: store calendar events, insert TrackedItems, correlateByAttendee, isInMeeting, scoreUrgency, suggestDeliveryTime, and PushBuffer hold behavior — both during and outside a meeting. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Dual-runtime architecture: claude-agent-sdk for Claude, Vercel AI SDK for all other providers. Covers provider config, utility LLM service, auto-escalation, IPC/MCP tool bridging, and session persistence. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix normalizeSubject to use while loop for stripping unlimited RE:/FWD: prefixes. Fix digest-engine test to use valid 'digest' classification instead of 'fyi'. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
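The while-loop fix described above can be sketched as follows. This is an illustrative version, not the repository's code: the exact prefix pattern (here `re`, `fw`, `fwd`, case-insensitive) is an assumption.

```typescript
// Strips any number of leading reply/forward prefixes from a subject line.
function normalizeSubject(subject: string): string {
  // Assumed prefix set; the regex has no `g` flag, so test() is stateless.
  const prefix = /^\s*(re|fwd?)\s*:\s*/i;
  let result = subject;
  // A single replace() would only remove the first prefix; loop until none remain.
  while (prefix.test(result)) {
    result = result.replace(prefix, "");
  }
  return result.trim();
}
```

With this shape, "RE: FWD: Re: Budget" and "Budget" normalize to the same string, which is what lets thread correlation match them.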
Covers: provider resolution, auto-escalation, session store, IPC tool bridge, Vercel AI SDK agent runner, host wiring, utility LLM service, and container rebuild. TDD throughout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduces `LlmConfig` interface in `src/types.ts` and `resolveModel()` in `src/llm/provider.ts` for host-side LLM provider/model selection before container dispatch. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…PC classification Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements scoreComplexity() with heuristic signals (message length, code blocks, keywords, question count, file references) to decide whether to upgrade to a stronger model before agent dispatch. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
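A heuristic scorer built on the signals listed above might look like this sketch. The keyword list, the weights, the file-extension list, and the escalation threshold are all invented for illustration; only the five signal types come from the commit message.

```typescript
interface ComplexityScore {
  score: number;
  escalate: boolean;
}

const FENCE = "`".repeat(3); // code-fence marker, built this way to keep the example readable
const KEYWORDS = ["refactor", "architecture", "debug", "migrate"]; // hypothetical list
const ESCALATION_THRESHOLD = 3; // hypothetical cutoff

function scoreComplexity(message: string): ComplexityScore {
  let score = 0;
  if (message.length > 500) score += 1; // long message
  if (message.includes(FENCE)) score += 2; // contains a code block
  const lower = message.toLowerCase();
  score += KEYWORDS.filter((k) => lower.includes(k)).length; // keyword hits
  const questions = (message.match(/\?/g) ?? []).length;
  if (questions >= 3) score += 1; // several questions at once
  if (/\b[\w./-]+\.(ts|js|py|json|md)\b/.test(message)) score += 1; // file reference
  return { score, escalate: score >= ESCALATION_THRESHOLD };
}
```

Because the scorer is a pure function of the message text, it can run on the host before container dispatch without any model call.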
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements saveSession/loadSession for persisting CoreMessage[] arrays as JSON files, with 100-message trim and UUID-based session ID generation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tion Wire shouldRequireApproval/recordDelegation into handleEvaluate so that handle_* tools approved by the trust engine are held for user approval until the delegation counter reaches threshold. Add integration tests verifying classifyTool mapping and per-class counter independence. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nters Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Capture startMiniAppServer return to get the PendingSendRegistry, pass eventBus to the server, call registry.shutdown() before queue.shutdown() during graceful shutdown, and subscribe to email.draft.send_failed to notify the main Telegram group with Retry/Open-in-Gmail actions. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Email alerts remaining: tracks A + C + D
chore: move smoke scripts to scripts/dev/
Two related bugs caused the Telegram "Want me to forward X?" message to
show no action buttons, and Yes/No clicks (when they did appear) to be
cosmetic — the agent never learned the user's answer.
1. Agent-authored IPC messages bypassed classifyAndFormat. The container's
send_message / relay_message tools write type:"message" IPC files that
the host delivered via plain channel.sendMessage, skipping question
detection and inline-keyboard attachment. Add sendAgentMessage to
IpcDeps, implemented in index.ts to run formatOutbound →
classifyAndFormat → sendMessageWithActions (falling back to plain send
when the channel lacks keyboards or no actions were detected). Both
the type:"message" handler and relay_message now use it.
2. answer:yes/no callbacks only removed a status-bar item. Wire an
injectUserReply dep that pipes a synthesized reply ("✅ Yes — proceed."
/ "❌ No — do not proceed.") back into the active container via
queue.sendMessage, or enqueues a message check if no container is
running. Also clear the buttons after the click so the user sees the
answer was registered.
Tests: new cases in callback-router.test.ts cover answer:yes, answer:no,
and answer:defer. ipc-relay.test.ts updated to assert the new
sendAgentMessage call site. email-trigger-pipeline and ipc-auth stubs
extended for the new IpcDeps field.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…est matrix
Addresses three gaps uncovered while triaging Telegram bot UX:
1. Retry on Gmail failures. The generic error handler in callback-router used to replace the message with "⚠️ expand failed: ..." and strip all buttons, leaving the user stuck during a transient outage. For expand / archive / confirm_archive failures it now renders 🔄 Retry + ❌ Dismiss. The new retry:<action>:<entity>[:extra] and dismiss_failure:<id> cases re-dispatch the original call or clear the keyboard. confirm_archive's existing dedicated retry_archive path is unchanged.
2. Email context for agent-authored messages. The container's send_message tool now accepts email_id + email_account. The host IpcDeps.sendAgentMessage plumbs them through and attaches 📧 Expand / 🌐 Full Email / 🗄 Archive when provided (the same button set email triggers get), so ad-hoc agent messages about a specific email (e.g. follow-up questions) carry the same affordances as the original notification.
3. Exhaustive callback matrix test. New src/__tests__/telegram-callback-matrix.test.ts covers every top-level action (archive, confirm_archive, answer yes/no, dismiss, stop, unknown), the Gmail-outage retry flow, and guards against retry buttons leaking onto non-retryable failures (rsvp). 11 cases, all headless: no real bot token needed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
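To make the retry:<action>:<entity>[:extra] callback format above concrete, here is a minimal parsing and button-building sketch. The function names, the `Button` shape, and the use of the entity as the dismiss_failure id are assumptions; callback-router's real internals are not part of this page.

```typescript
interface RetryCallback {
  action: string;
  entity: string;
  extra?: string;
}

// Parses "retry:<action>:<entity>[:extra]"; returns null for anything else.
function parseRetryCallback(data: string): RetryCallback | null {
  const [kind, action, entity, ...rest] = data.split(":");
  if (kind !== "retry" || !action || !entity) return null;
  // Assumption: everything after the entity is one opaque extra field.
  return rest.length > 0
    ? { action, entity, extra: rest.join(":") }
    : { action, entity };
}

interface Button {
  text: string;
  data: string;
}

// Actions the commit message names as retryable.
const RETRYABLE = new Set(["expand", "archive", "confirm_archive"]);

// Buttons to show after a failure; empty for non-retryable actions such as rsvp.
function failureButtons(action: string, entity: string): Button[] {
  if (!RETRYABLE.has(action)) return [];
  return [
    { text: "🔄 Retry", data: `retry:${action}:${entity}` },
    { text: "❌ Dismiss", data: `dismiss_failure:${entity}` }, // id choice is hypothetical
  ];
}
```

Keeping the retryable set explicit is one way to get the guard the matrix test checks for: an action outside the set never receives a Retry button.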
…sage
The send_message IPC tool accepts email_id/email_account (added in the previous commit), but the agent only uses capabilities it's told about. Update the four instruction surfaces so this becomes habit:
1. groups/main/CLAUDE.md Communication section: add a rule that every email-specific message must carry both fields; omit for batch/general chat.
2. groups/main/CLAUDE.md Email Intelligence step 7: when reporting results via send_message, pass email_id and email_account so the user gets Expand / Full Email / Archive buttons.
3. container/skills/capabilities/SKILL.md send_message bullet: same guidance for non-email-intelligence flows.
4. src/ipc.ts email-trigger prompt: add the instruction to the system-injected trigger prompt so the agent sees it even if the per-group CLAUDE.md drifts.
New test in email-trigger-pipeline.test.ts verifies the generated prompt contains email_id, email_account, and the three button names.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Follow-ups to the previous round of Telegram UX fixes:
#3 Auto-infer email context. Even when the agent forgets to pass email_id/email_account, the host scrapes the outgoing text for (thread: <id>) and [account] markers and infers context. Instructions drift, code doesn't. New src/email-context-inference.ts, plugged into sendAgentMessage in index.ts. 9 unit tests cover single-thread, multi-thread (returns null), short-match false positives, blocklist words like [email]/[internal], and multi-account ambiguity.
#4 Unified retry dispatcher. confirm_archive's failure handler used to emit retry_archive:<id> and we had a dedicated case for it. The new retry:<action>:<entity>[:extra] dispatcher already covers this, so confirm_archive now emits retry:confirm_archive:<id>. retry_archive stays as a back-compat alias that re-dispatches through the unified path. Single code path for all retries.
#5 Person-name forward. FORWARD_PERSON_PATTERN matches "forward X to <2-3 capitalized words>?" (e.g. "Philip Ye") when the email-address pattern doesn't. Emits 📨 Forward to <Name> + ❌ No. On click, the host injects a reply telling the agent to resolve the name via search_contacts and forward; the container already has that tool. Single-word names stay ambiguous and fall through to generic Yes/No.
Also adds scripts/dev/smoke-telegram-callbacks.ts, a manual live-bot smoke test that posts real messages through the Bot API to verify server-side keyboard rendering (HTML escaping, web_app URLs, edit-in-place, long labels). Not wired into CI.
Host service was restarted to pick up these changes. Container image rebuild is blocked by a pre-existing broken state in container/agent-runner (missing @ai-sdk/mcp, ai, @ai-sdk/openai packages); the new email_id argument to the container send_message tool will activate when that build is fixed separately.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
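The host-side inference in item #3 above can be pictured with the sketch below. It is an assumption-laden illustration, not src/email-context-inference.ts itself: the marker regexes, the minimum id length, the blocklist contents beyond the two words quoted in the commit, and the rule that both markers must be present are all guesses.

```typescript
interface EmailContext {
  emailId: string;
  account: string;
}

const ACCOUNT_BLOCKLIST = new Set(["email", "internal"]); // the two words quoted in the commit
const MIN_THREAD_ID_LENGTH = 8; // hypothetical guard against short false positives

// Collects the distinct first-capture-group values of a global regex.
function uniqueMatches(text: string, pattern: RegExp): string[] {
  const seen = new Set<string>();
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    const value = m[1];
    if (value) seen.add(value.toLowerCase());
  }
  return Array.from(seen);
}

// Returns a context only when exactly one thread id and one account are found.
function inferEmailContext(text: string): EmailContext | null {
  const threads = uniqueMatches(text, /\(thread:\s*([A-Za-z0-9_-]+)\)/g)
    .filter((id) => id.length >= MIN_THREAD_ID_LENGTH);
  const accounts = uniqueMatches(text, /\[([A-Za-z][A-Za-z0-9_-]*)\]/g)
    .filter((name) => !ACCOUNT_BLOCKLIST.has(name));
  const [emailId] = threads;
  const [account] = accounts;
  if (threads.length !== 1 || accounts.length !== 1) return null;
  if (emailId === undefined || account === undefined) return null;
  return { emailId, account };
}
```

Returning null on any ambiguity means an inference miss simply leaves the message without the email buttons, rather than attaching them to the wrong email.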
container/agent-runner/src/mcp-bridge.ts and vercel-runner.ts have been importing @ai-sdk/mcp, @ai-sdk/mcp/mcp-stdio, ai, @ai-sdk/openai, and @ai-sdk/google since landed, but none were listed in package.json. The container build has been failing ever since, and src/llm/mcp-bridge.test.ts was failing with "Cannot find package '@ai-sdk/mcp/mcp-stdio'" on every CI run.
Add the four missing packages at current latest-major versions:
- @ai-sdk/mcp ^1.0.36
- ai ^6.0.168
- @ai-sdk/openai ^3.0.53
- @ai-sdk/google ^3.0.64
npm install brings in 114 packages; the container typecheck and build now succeed, and the host test suite goes from 1451/1455 to 1455/1455 passing. The 4 failing tests were all rooted in the same missing import.
This also activates the email_id / email_account args added to the send_message MCP tool last commit; the container now has the updated binary.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… telemetry
Three follow-ups to the Telegram callback UX work:
#3 Host-side contact lookup for forward_person. When the user taps 📨 Forward to <Name> and the macOS Contacts DB has exactly one email for that name, inject the resolved address directly, skipping the agent's search_contacts round-trip. On miss / ambiguity / DB not available, fall back to delegating the lookup to the agent (which has the container-side search_contacts MCP tool). Read-only: copy-to-tmp pattern avoids WAL lock contention. 6 unit tests cover missing dir, empty query, dedup, ambiguity, non-email rows, and sqlite errors.
#4 Yes/No emoji consistency. Every other button in the system uses an emoji prefix (📧 Expand, 🌐 Full Email, 🗄 Archive, 📨 Forward, 🔄 Retry, ❌ Cancel). The generic yes-no question pair was plain text "Yes" / "No" / "Let me think...". Now ✅ Yes / ❌ No / ⏳ Let me think…. Matches the visual vocabulary of the rest of the keyboard.
#5 Telemetry for email context resolution. sendAgentMessage now logs whether the Expand/Full Email/Archive buttons came from an agent-explicit email_id or from host-side inference. Lets us answer "are instructions sticking?" from the log without sampling message bodies.
Full test suite: 1462/1462 passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
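The "exactly one email for that name" rule in item #3 above can be sketched as a pure function over the values a contact lookup returns. The function name, the input shape, and the email regex are hypothetical; the database access (the copy-to-tmp SQLite read) is deliberately left out.

```typescript
// Given raw contact values for a name, returns the address only when it is unambiguous.
function resolveUniqueEmail(values: string[]): string | null {
  const emails = new Set<string>();
  for (const raw of values) {
    const value = raw.trim().toLowerCase(); // dedup is case-insensitive
    // Loose shape check to drop non-email rows such as phone numbers.
    if (/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value)) emails.add(value);
  }
  const [only] = Array.from(emails);
  // Zero matches (miss) or several (ambiguity) both fall back to the agent.
  return emails.size === 1 && only !== undefined ? only : null;
}
```

A null result is the signal to delegate the lookup to the agent's search_contacts tool, as the commit describes.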
Eliminates the main source of container-side Gmail flakiness: every agent turn was cold-starting the Gmail MCP via `npx -y` per account, which times out under load or flaky network and leaves the SDK running without Gmail tools.
- Dockerfile: bake @gongrzhe/server-gmail-autoauth-mcp and @notionhq/notion-mcp-server into the image so npx resolves from local cache instead of the npm registry.
- container-runner.ts: on exit-0 runs, grep stderr for [mcp-probe] and MCP/gmail lines and append them as "=== MCP Diagnostics ===" in the container log. Previously these were discarded unless the container exited non-zero, so "Gmail tools offline" complaints had no forensic trail.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
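The stderr filtering described for container-runner.ts could be approximated as below. This is a sketch under assumptions: the function name is invented, and the match pattern is taken from the three markers named in the PR summary (`[mcp-probe]`, `MCP server`, `gmail`) rather than from the actual source.

```typescript
// Assumed pattern, case-insensitive, no `g` flag so test() keeps no state.
const MCP_LINE = /\[mcp-probe\]|MCP server|gmail/i;

// Returns a diagnostics section to append to the container log, or "" if nothing matched.
function extractMcpDiagnostics(stderr: string): string {
  const hits = stderr
    .split(/\r?\n/)
    .filter((line) => MCP_LINE.test(line));
  if (hits.length === 0) return "";
  return ["=== MCP Diagnostics ===", ...hits].join("\n");
}
```

Filtering instead of dumping all of stderr keeps exit-0 logs short while still preserving the probe results needed to investigate a "Gmail tools offline" report.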
gavrielc pushed a commit that referenced this pull request on Apr 24, 2026
Adds /add-gmail-tool, a Utility skill that installs Gmail as an MCP tool in NanoClaw v2 using OneCLI for credential injection. No raw OAuth tokens ever reach the container; the gateway swaps the "onecli-managed" stub bearer for the real token at request time.
Scope (3 files):
- container/Dockerfile: pnpm global-install of @gongrzhe/server-gmail-autoauth-mcp@1.1.11, pinned behind GMAIL_MCP_VERSION. Also pins zod-to-json-schema@3.22.5 to avoid an ERR_PACKAGE_PATH_NOT_EXPORTED crash: the MCP server's loose zod range resolves zod@3.24.x while zod-to-json-schema@3.25.x imports the zod/v3 subpath that only exists in zod>=3.25.
- container/agent-runner/src/providers/claude.ts: adds 'mcp__gmail__*' to TOOL_ALLOWLIST so the agent can invoke the server's tools.
- .claude/skills/add-gmail-tool/SKILL.md: pre-flight checks (OneCLI Gmail app connected, stubs present, mount allowlist covers ~/.gmail-mcp, agent secret-mode), per-group wiring in container.json (mount + mcpServers), verification steps, troubleshooting, removal instructions.
Credits to gongrzhe for the MCP server and the add-atomic-chat-tool / add-vercel skill patterns.
Addresses #1500 (proxy Gmail OAuth through credential proxy) on the Gmail side. Overlaps in intent with #1810 but stays surgical: no bundled unrelated changes.
Tested end-to-end on Linux/Docker: CLI and WhatsApp self-chat agents can list labels, search/read/send mail via OneCLI-injected tokens.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Collaborator
Closing this one. It looks like a fair amount of personal install state (per group If there is a focused change you would still like to land, please open a new PR with just that diff and we will take a look. Thanks for the interest.
nv-slang-bot (bot) pushed a commit to slang-coworkers/nanoclaw that referenced this pull request on May 12, 2026
Adds /add-gmail-tool, a Utility skill that installs Gmail as an MCP tool in NanoClaw v2 using OneCLI for credential injection. No raw OAuth tokens ever reach the container; the gateway swaps the "onecli-managed" stub bearer for the real token at request time.
Scope (3 files):
- container/Dockerfile: pnpm global-install of @gongrzhe/server-gmail-autoauth-mcp@1.1.11, pinned behind GMAIL_MCP_VERSION. Also pins zod-to-json-schema@3.22.5 to avoid an ERR_PACKAGE_PATH_NOT_EXPORTED crash: the MCP server's loose zod range resolves zod@3.24.x while zod-to-json-schema@3.25.x imports the zod/v3 subpath that only exists in zod>=3.25.
- container/agent-runner/src/providers/claude.ts: adds 'mcp__gmail__*' to TOOL_ALLOWLIST so the agent can invoke the server's tools.
- .claude/skills/add-gmail-tool/SKILL.md: pre-flight checks (OneCLI Gmail app connected, stubs present, mount allowlist covers ~/.gmail-mcp, agent secret-mode), per-group wiring in container.json (mount + mcpServers), verification steps, troubleshooting, removal instructions.
Credits to gongrzhe for the MCP server and the add-atomic-chat-tool / add-vercel skill patterns.
Addresses nanocoai#1500 (proxy Gmail OAuth through credential proxy) on the Gmail side. Overlaps in intent with nanocoai#1810 but stays surgical: no bundled unrelated changes.
Tested end-to-end on Linux/Docker: CLI and WhatsApp self-chat agents can list labels, search/read/send mail via OneCLI-injected tokens.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- Pre-install `@gongrzhe/server-gmail-autoauth-mcp` and `@notionhq/notion-mcp-server` into the container image. This eliminates the `npx -y` cold start on every agent turn, which was timing out under load and leaving the SDK running without Gmail tools.
- Append MCP-related stderr lines (`[mcp-probe]`, `MCP server`, `gmail`) to the container log even on exit 0, so "Gmail tools offline" complaints have forensic evidence instead of a silent discard.
Root cause
Every container run spawned four Gmail MCP servers via `npx -y @gongrzhe/server-gmail-autoauth-mcp`. Each invocation hit the npm registry for dependency resolution; on a flaky network or under registry rate limits the spawn timed out, the SDK initialized without those tools, and the agent reported "Gmail tools offline." Meanwhile, the container logs only captured stdout/stderr on non-zero exits, so these failures left no trail.
Test plan
- `nanoclaw-agent:latest` rebuilt via `./container/build.sh` (14.9s), packages confirmed baked into the global npm prefix.
- `npm run build` clean; service restarted; new container-runner deployed.
- `[mcp-probe] gmail-personal: FAIL …` inside a new `=== MCP Diagnostics ===` section in `groups/*/logs/container-*.log`.
🤖 Generated with Claude Code