feat: Add Hetzner Cloud backend for ephemeral VM agents by Peyton-Spencer · Pull Request #12 · omniaura/nanoclaw

Peyton-Spencer · 2026-02-13T23:27:18Z

Summary

Add Hetzner Cloud backend to support ephemeral VM-based agent execution with S3-based I/O. This provides a cost-effective alternative to persistent cloud backends like Sprites/Railway by creating VMs on-demand and destroying them immediately after use.

New files

| File | Purpose |
| --- | --- |
| `src/backends/hetzner-api.ts` | Hetzner Cloud API wrapper with server and SSH key lifecycle management |
| `src/backends/hetzner-backend.ts` | Hetzner backend implementation following Railway's S3 inbox/outbox pattern |

Changes

Add 'hetzner' to BackendType union in src/backends/types.ts
Register HetznerBackend in src/backends/index.ts
Add Hetzner config vars to src/config.ts:
- HETZNER_API_TOKEN - API token from Hetzner Cloud Console
- HETZNER_LOCATION - Datacenter location (default: ash - Ashburn, US)
- HETZNER_SERVER_TYPE - VM size (default: cpx11 - 2 vCPU, 2GB RAM, €4.15/mo)
- HETZNER_IMAGE - OS image (default: ubuntu-22.04)
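
As a rough illustration of how these four variables and their defaults could be resolved, here is a minimal sketch. The `HetznerConfig` shape and the `loadHetznerConfig` helper are hypothetical names and are not taken from `src/config.ts`; only the variable names and default values come from the list above.

```typescript
// Hypothetical shape; the real exports in src/config.ts may look different.
interface HetznerConfig {
  apiToken: string | undefined;
  location: string;
  serverType: string;
  image: string;
}

type Env = Record<string, string | undefined>;

// Read the four variables from an env-like object, applying the defaults
// listed in the PR description when a variable is unset or empty.
function loadHetznerConfig(env: Env): HetznerConfig {
  return {
    apiToken: env.HETZNER_API_TOKEN || undefined,
    location: env.HETZNER_LOCATION || 'ash',
    serverType: env.HETZNER_SERVER_TYPE || 'cpx11',
    image: env.HETZNER_IMAGE || 'ubuntu-22.04',
  };
}

const cfg = loadHetznerConfig({ HETZNER_API_TOKEN: 'token', HETZNER_LOCATION: 'hil' });
console.log(cfg.location, cfg.serverType, cfg.image);
```

Passing the environment in as an argument (rather than reading a global) keeps the sketch easy to exercise with different inputs.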

Architecture

Ephemeral VM Lifecycle

Create VM: API call to Hetzner Cloud creates server with cloud-init
Bootstrap: Cloud-init installs Docker and starts nanoclaw container
Execute: Agent polls S3 inbox, executes task, writes to S3 outbox
Destroy: VM is deleted immediately after agent completes (ephemeral!)
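
The four lifecycle steps above can be sketched as a create/work/destroy wrapper. This is an illustration under stated assumptions, not the code in `hetzner-backend.ts`: `VmProvider` and `withEphemeralVm` are hypothetical names standing in for the API wrapper and backend.

```typescript
// Hypothetical provider interface standing in for the Hetzner API wrapper.
interface VmProvider {
  createServer(name: string, userData: string): Promise<{ id: number }>;
  deleteServer(id: number): Promise<void>;
}

// Create a VM, run the caller's work, and always attempt deletion afterwards.
async function withEphemeralVm<T>(
  provider: VmProvider,
  name: string,
  userData: string,
  work: (serverId: number) => Promise<T>,
): Promise<T> {
  const server = await provider.createServer(name, userData);
  try {
    return await work(server.id);
  } finally {
    // Best-effort teardown: a failed delete is reported but does not mask
    // the result (or error) of the work itself.
    await provider.deleteServer(server.id).catch((err: unknown) => {
      console.warn('failed to delete server', server.id, err);
    });
  }
}
```

Putting the delete in `finally` means the VM is torn down whether the agent run succeeds or throws, which is the property the "ephemeral" label depends on.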

S3-based I/O (same pattern as Railway)

Host writes prompt to S3 inbox → Agent polls inbox
Agent writes results to S3 outbox → Host polls outbox
File sync via S3 for workspace files
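
To make the inbox/outbox exchange concrete, here is a small sketch of the host side. The `ObjectStore` interface, the in-memory `MemoryStore`, and the key layout (`agents/<id>/inbox/...`) are all assumptions made for illustration; the PR's actual S3 client and key names are not shown in this page.

```typescript
// Hypothetical minimal object-store interface standing in for the S3 client.
interface ObjectStore {
  put(key: string, body: string): Promise<void>;
  get(key: string): Promise<string | undefined>;
}

// In-memory store so the sketch runs without any S3 credentials.
class MemoryStore implements ObjectStore {
  private data = new Map<string, string>();
  async put(key: string, body: string): Promise<void> { this.data.set(key, body); }
  async get(key: string): Promise<string | undefined> { return this.data.get(key); }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll a key until it appears or the deadline passes.
async function pollForKey(
  store: ObjectStore,
  key: string,
  timeoutMs: number,
  intervalMs: number,
): Promise<string | undefined> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await store.get(key);
    if (value !== undefined) return value;
    if (Date.now() >= deadline) return undefined;
    await sleep(intervalMs);
  }
}

// Host side: write the prompt to the inbox, then wait on the outbox.
async function runOnce(
  store: ObjectStore,
  agentId: string,
  prompt: string,
  timeoutMs = 1000,
): Promise<string | undefined> {
  await store.put(`agents/${agentId}/inbox/prompt.json`, JSON.stringify({ prompt }));
  return pollForKey(store, `agents/${agentId}/outbox/result.json`, timeoutMs, 10);
}
```

Returning `undefined` on timeout leaves it to the caller to decide how to report the failure.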

SSH Key Management

Generate RSA keypair on agent creation
Upload public key to Hetzner API
Include SSH key ID in server creation request
Delete SSH key when destroying server

Cloud-init Startup Script

#cloud-config
package_update: true
package_upgrade: true

packages:
  - docker.io
  - docker-compose

runcmd:
  - systemctl start docker
  - systemctl enable docker
  - docker pull <CONTAINER_IMAGE>
  - docker run -d --name nanoclaw-agent \
      -e NANOCLAW_S3_ENDPOINT=... \
      -e NANOCLAW_S3_BUCKET=... \
      -e NANOCLAW_AGENT_ID=... \
      <CONTAINER_IMAGE>
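
One detail worth checking when generating this script: in a plain (unquoted) YAML scalar, line breaks are folded into spaces and a trailing backslash is kept as a literal character, so the shell may not see a line continuation. A literal block scalar (`- |`) is one way to keep the multi-line `docker run` intact. The sketch below assumes a hypothetical `renderCloudInit` helper and does not shell-quote the values it interpolates.

```typescript
// Build a cloud-init document whose `docker run` command is a YAML literal
// block scalar, so backslash-newline continuations reach the shell intact.
// The function name is hypothetical; values are NOT shell-quoted here.
function renderCloudInit(image: string, env: Record<string, string>): string {
  const envFlags = Object.entries(env).map(([k, v]) => `-e ${k}=${v}`);
  const runLines = ['docker run -d --name nanoclaw-agent', ...envFlags, image];
  // Every line of the block is indented six spaces; all but the last end in " \".
  const block = runLines
    .map((line, i) => `      ${line}${i < runLines.length - 1 ? ' \\' : ''}`)
    .join('\n');
  return [
    '#cloud-config',
    'packages:',
    '  - docker.io',
    'runcmd:',
    '  - systemctl start docker',
    '  - systemctl enable docker',
    `  - docker pull ${image}`,
    '  - |',
    block,
    '',
  ].join('\n');
}

console.log(renderCloudInit('example/agent:latest', { NANOCLAW_AGENT_ID: 'a1' }));
```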

Benefits

Cost-Effective

Hourly billing: ~€0.006/hr for cpx11 (vs €4.15/mo for persistent)
Ephemeral: Only pay for actual usage time (minutes to hours)
Example: 2 hours of agent work = €0.012 total cost
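
The arithmetic behind that example can be written out as a small helper. The two rates are the ones quoted in this description; capping the total at the monthly price is an assumption about how hourly and monthly pricing interact, and `estimateCostEur` is a hypothetical name.

```typescript
// Rates quoted in the PR description for cpx11.
const HOURLY_EUR = 0.006;
const MONTHLY_EUR = 4.15;

// Estimated cost for a number of billed hours. Capping at the monthly price
// is an assumption, not something stated in the PR.
function estimateCostEur(hours: number): number {
  const raw = Math.min(hours * HOURLY_EUR, MONTHLY_EUR);
  return Math.round(raw * 1000) / 1000; // round to tenths of a cent
}

console.log(estimateCostEur(2));
```

At these rates the hourly total only reaches the monthly price after roughly 690 hours, so short agent runs stay far below it.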

Simple Infrastructure

Standard VMs (no SaaS abstraction layer)
Proven, reliable infrastructure
Easy to debug (SSH access if needed)

US Datacenter Support

Ashburn (ash) and Hillsboro (hil) locations
1TB bandwidth included per server

Scalable

Multiple server types available:
- cpx11: 2 vCPU, 2GB RAM, €4.15/mo
- cpx21: 3 vCPU, 4GB RAM, €8.05/mo
- cpx31: 4 vCPU, 8GB RAM, €14.75/mo
Configure via HETZNER_SERVER_TYPE env var

Usage

Set agent backend to 'hetzner' and configure required env vars:

const agent = {
  id: 'agent-id',
  name: 'Agent Name',
  folder: 'agent-folder',
  backend: 'hetzner',
  // ...
};

export HETZNER_API_TOKEN="your-hetzner-api-token"
export B2_ENDPOINT="s3.us-west-000.backblazeb2.com"
export B2_BUCKET="your-bucket"
export B2_ACCESS_KEY_ID="..."
export B2_SECRET_ACCESS_KEY="..."

Test plan

Verify bun run build compiles cleanly
Test Hetzner API wrapper (create/delete server, SSH keys)
Test cloud-init bootstrap with Docker installation
Test full agent run with S3 inbox/outbox
Verify VM destruction after completion
Test concurrent agent runs (multiple VMs)
Test error handling (VM creation failure, timeout, etc.)

Migration Path

As discussed with @future Trees:

Current: All agents on local (Apple Container / Docker)
Next: Local → Hetzner migration for cloud agents
Why Hetzner: Simpler and more cost-effective than Sprites/Daytona/Railway SaaS sandbox providers

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added Hetzner Cloud as a supported backend infrastructure option.
- Introduced Hetzner configuration settings for API token, location, server type, and image.
- Extended backend filtering to include Hetzner.
- Enables ephemeral Hetzner VM execution with S3-based inbox/outbox I/O for agent runs.

coderabbitai · 2026-02-13T23:27:37Z

Walkthrough

Adds Hetzner Cloud support: new Hetzner API client and HetznerBackend for ephemeral VM provisioning with cloud-init and S3-based IPC, updates backend types/factory and configuration exports, and extends the agent-runner filter to accept 'hetzner'.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Backend types & integration `src/backends/types.ts`, `src/backends/index.ts` | Adds `'hetzner'` to BackendType and integrates lazy-loaded Hetzner backend into the factory. |
| Configuration `src/config.ts` | Exports new Hetzner configuration env values: `HETZNER_API_TOKEN`, `HETZNER_LOCATION`, `HETZNER_SERVER_TYPE`, `HETZNER_IMAGE`. |
| Agent-runner CLI `container/agent-runner/src/ipc-mcp-stdio.ts` | Extends `list_agents` `filter_backend` enum to include `'hetzner'`. |
| Hetzner API client `src/backends/hetzner-api.ts` | New HTTP wrapper with token guard, types (server, ssh key, action), CRUD operations for SSH keys/servers, action polling and wait helpers, and robust error handling/logging. |
| Hetzner backend implementation `src/backends/hetzner-backend.ts` | New exported `HetznerBackend` implementing agent run flow: S3-based inbox/outbox IPC, cloud-init VM provisioning, ephemeral server lifecycle, polling for results, process wrapper abstraction, and cleanup/shutdown logic. |

Sequence Diagram

sequenceDiagram
    actor Client
    participant HB as HetznerBackend
    participant S3 as S3 Storage
    participant HAPI as Hetzner API
    participant HC as Hetzner Cloud / VM

    Client->>HB: runAgent(group, input)
    HB->>S3: upload workspace + push inbox message
    HB->>HAPI: createServer(cloud-init)
    HAPI->>HC: POST /servers
    HC-->>HAPI: server id + action id
    HAPI-->>HB: {server, action}
    HB->>HAPI: waitForServerRunning(server_id)
    loop polling
      HAPI->>HC: GET /servers/{id}
      HC-->>HAPI: status
    end
    HC->>S3: VM/agent reads inbox, runs container, writes outbox
    loop poll outbox
      HB->>S3: check outbox for results
      S3-->>HB: output files / signals
    end
    HB->>HAPI: deleteServer(server_id)
    HAPI->>HC: DELETE /servers/{id}
    HB-->>Client: return ContainerOutput

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through clouds of Hetzner light,

spun up a VM, kept IPC tight.
S3 crumbs left the path so neat,
ephemeral paws danced on bare-iron heat.
New backend blooms — a carrot-shaped byte.

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: Add Hetzner Cloud backend for ephemeral VM agents' clearly and accurately summarizes the main objective of the changeset, which introduces a complete Hetzner Cloud backend implementation with ephemeral VM support. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 90.00% which is sufficient. The required threshold is 80.00%. |
| Merge Conflict Detection | ✅ Passed | ✅ No merge conflicts detected when merging into `main` |

No actionable comments were generated in the recent review. 🎉

coderabbitai

Actionable comments posted: 6

🤖 Fix all issues with AI agents

In `@container/agent-runner/src/ipc-mcp-stdio.ts`:
- Around line 576-577: The filter_backend enum definition (the z.enum call for
filter_backend) is missing the new 'hetzner' option—add 'hetzner' to the enum
array so users can filter by the Hetzner backend; update the
z.enum(['apple-container','docker','sprites','daytona','railway']) to include
'hetzner' and run/typecheck any code that consumes filter_backend to ensure
compatibility (e.g., places that narrow or switch on filter_backend values).

In `@src/backends/hetzner-api.ts`:
- Around line 44-52: The code calls await resp.json() unconditionally which will
throw on empty bodies (e.g. 204 No Content); modify the block in
src/backends/hetzner-api.ts to first detect empty responses (check resp.status
=== 204 or resp.headers.get('content-length') === '0' or
resp.headers.get('content-type') missing) and only call await resp.json() when a
body exists; if empty, set json to null/undefined and handle error branching so
the existing resp.ok check still throws Hetzner errors when appropriate and
successful empty responses return a sensible value (e.g., undefined) castable to
T.
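
A variation on this suggestion is to read the body as text first and only parse when something is there, which avoids relying on response headers. This is a sketch under that assumption, with `readJsonIfPresent` as a hypothetical helper name; it is not the fix that landed in the PR.

```typescript
// Read a fetch Response body as JSON, returning undefined when there is no body.
async function readJsonIfPresent<T>(resp: Response): Promise<T | undefined> {
  if (resp.status === 204) return undefined; // No Content: nothing to parse
  const text = await resp.text();
  if (text.trim() === '') return undefined;  // empty body on any other status
  return JSON.parse(text) as T;
}
```

Callers can then check `resp.ok` separately and treat an `undefined` body on a successful DELETE as a normal outcome.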

In `@src/backends/hetzner-backend.ts`:
- Around line 223-244: The cloud-init generated by generateCloudInit embeds
sensitive B2/S3 credentials (B2_ACCESS_KEY_ID, B2_SECRET_ACCESS_KEY, B2_BUCKET,
B2_ENDPOINT, B2_REGION) into the user-data which can be exposed; update
generateCloudInit to stop writing secrets into the cloud-init payload (keep only
non-sensitive values like agentId and CONTAINER_IMAGE) and instead either (a)
document this limitation clearly and require callers to supply scoped/temporary
credentials, or (b) change the bootstrap flow so the container fetches
credentials at runtime from a secure mechanism (instance metadata service, a
secrets endpoint, or an ephemeral token service) rather than baking them into
runcmd; reference generateCloudInit, CONTAINER_IMAGE and the B2_* symbols when
implementing the change.
- Around line 312-320: The initialize() method currently returns early when
HETZNER_API_TOKEN or B2_ENDPOINT are missing, leaving this.s3 uninitialized and
causing runAgent() to crash when it calls syncFilesToS3(this.s3,...); either
make initialize() fail fast by throwing a descriptive Error when required env
vars are missing (so callers cannot proceed without a valid Hetzner backend) or
add a defensive guard at the start of runAgent() to check that this.s3 is
initialized (log/throw a clear error and return) before calling syncFilesToS3;
update the code paths around initialize(), runAgent(), this.s3, and
syncFilesToS3 to ensure one of these fixes is applied consistently.
- Around line 247-257: The pemToOpenSSH method is producing invalid OpenSSH keys
by naively slicing the PEM; replace its logic to parse the SPKI PEM properly
using the sshpk library: import sshpk, call sshpk.parseKey(pemPublicKey, "pem")
and then serialize with key.toString("ssh") in pemToOpenSSH so Hetzner receives
a valid OpenSSH-formatted public key (add sshpk as a dependency).
- Around line 57-331: HetznerBackend is missing the shutdown(): Promise<void>
required by AgentBackend — add an async shutdown method on the HetznerBackend
class that cleanly tears down resources: call destroyEphemeralServer for any
tracked servers (use this.servers.keys() or iterate this.servers to remove
them), await deletion of any remaining SSH keys/servers via
destroyEphemeralServer or HetznerAPI, and gracefully close or flush the
NanoClawS3 client (this.s3) if it exposes a close/cleanup method; ensure
shutdown returns a Promise<void> and does not throw on missing initialization.
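
A shutdown along these lines might look like the sketch below. `ServerTracker` is a hypothetical stand-in for the backend's `servers` map, and the delete function is injected so the example stays self-contained; the real method would call `destroyEphemeralServer` or the API wrapper instead.

```typescript
// Hypothetical tracker standing in for HetznerBackend's `servers` map.
class ServerTracker {
  private servers = new Map<string, { serverId: number }>();
  private deleteServer: (id: number) => Promise<void>;

  constructor(deleteServer: (id: number) => Promise<void>) {
    this.deleteServer = deleteServer;
  }

  track(agentId: string, serverId: number): void {
    this.servers.set(agentId, { serverId });
  }

  get size(): number {
    return this.servers.size;
  }

  // Delete every tracked server in parallel; failures are logged, never thrown.
  async shutdown(): Promise<void> {
    const entries = Array.from(this.servers.entries());
    this.servers.clear();
    await Promise.all(
      entries.map(async ([agentId, ctx]) => {
        try {
          await this.deleteServer(ctx.serverId);
        } catch (err) {
          console.warn('failed to delete server for agent', agentId, err);
        }
      }),
    );
  }
}
```

Because an empty map simply produces no work, this version is also safe to call when the backend was never initialized.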

🧹 Nitpick comments (4)

src/backends/hetzner-backend.ts (3)
10-11: Minor: Redundant crypto imports.

crypto is imported as default and generateKeyPairSync is imported separately. Consider consolidating.
♻️ Suggested consolidation
-import crypto from 'crypto';
-import { generateKeyPairSync } from 'crypto';
+import crypto, { generateKeyPairSync } from 'crypto';
164-169: Unused privateKey variable.

The privateKey is generated but never used. If SSH access isn't needed for ephemeral VMs, consider documenting this explicitly or removing the unnecessary generation.
♻️ If private key isn't needed
-    const { publicKey, privateKey } = generateKeyPairSync('rsa', {
+    const { publicKey } = generateKeyPairSync('rsa', {
       modulusLength: 2048,
       publicKeyEncoding: { type: 'spki', format: 'pem' },
-      privateKeyEncoding: { type: 'pkcs8', format: 'pem' },
     });
149-154: Consider returning error status on timeout.

When the agent times out, it returns lastOutput which may still have status: 'success' with result: null. This could be misleading to callers. Consider explicitly returning an error status for timeouts.
♻️ Explicit timeout error
       logger.warn({ group: groupName, timeout: configTimeout }, 'Hetzner agent timed out waiting for S3 outbox');
-      return lastOutput;
+      return { status: 'error', result: lastOutput.result, error: `Agent timed out after ${configTimeout}ms` };
src/backends/hetzner-api.ts (1)

11-13: Generic response type is confusing and imprecise.

HetznerResponse<T> with [key: string]: T doesn't accurately model Hetzner's actual response structure. Different endpoints return different shapes:

/ssh_keys → { ssh_key: {...} }

/servers → { server: {...}, action: {...} }

/actions → { action: {...} }

Consider using specific response types or a union to make the code more self-documenting.
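
As an illustration of that suggestion, the three shapes listed above could be modelled as separate interfaces joined in a union. The field lists here are deliberately trimmed and should be read as assumptions; the real payloads carry more properties, and `describeResponse` is a hypothetical helper that only demonstrates narrowing.

```typescript
// Trimmed, illustrative field lists; real Hetzner payloads have more properties.
interface HetznerAction { id: number; status: string }
interface HetznerServer { id: number; name: string; status: string }
interface HetznerSshKey { id: number; name: string }

interface SshKeyResponse { ssh_key: HetznerSshKey }
interface CreateServerResponse { server: HetznerServer; action: HetznerAction }
interface ActionResponse { action: HetznerAction }

type HetznerResponse = SshKeyResponse | CreateServerResponse | ActionResponse;

// The `in` checks narrow the union, so each branch sees only its own fields.
function describeResponse(resp: HetznerResponse): string {
  if ('ssh_key' in resp) return `ssh_key ${resp.ssh_key.id}`;
  if ('server' in resp) return `server ${resp.server.id} (action ${resp.action.id})`;
  return `action ${resp.action.id}`;
}
```

Compared with an index signature, each endpoint's return type now documents exactly which top-level key to expect.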

coderabbitai · 2026-02-13T23:32:00Z

+  private generateCloudInit(agentId: string): string {
+    return `#cloud-config
+package_update: true
+package_upgrade: true
+
+packages:
+  - docker.io
+  - docker-compose
+
+runcmd:
+  - systemctl start docker
+  - systemctl enable docker
+  - docker pull ${CONTAINER_IMAGE}
+  - docker run -d --name nanoclaw-agent \\
+      -e NANOCLAW_S3_ENDPOINT=${B2_ENDPOINT} \\
+      -e NANOCLAW_S3_REGION=${B2_REGION} \\
+      -e NANOCLAW_S3_ACCESS_KEY_ID=${B2_ACCESS_KEY_ID} \\
+      -e NANOCLAW_S3_SECRET_ACCESS_KEY=${B2_SECRET_ACCESS_KEY} \\
+      -e NANOCLAW_S3_BUCKET=${B2_BUCKET} \\
+      -e NANOCLAW_AGENT_ID=${agentId} \\
+      ${CONTAINER_IMAGE}
+`;


⚠️ Potential issue | 🟡 Minor

Security consideration: Credentials in cloud-init user-data.

S3/B2 credentials (access key ID and secret) are embedded in the cloud-init script which may be visible in:

Hetzner Cloud console (server details)

VM logs (/var/log/cloud-init.log)

Instance metadata endpoints

For ephemeral short-lived VMs this risk may be acceptable, but consider documenting this limitation or exploring alternatives like:

Using scoped/temporary credentials with limited permissions

Hetzner Cloud's metadata service with encrypted user-data

🤖 Prompt for AI Agents

In `@src/backends/hetzner-backend.ts` around lines 223 - 244, The cloud-init generated by generateCloudInit embeds sensitive B2/S3 credentials (B2_ACCESS_KEY_ID, B2_SECRET_ACCESS_KEY, B2_BUCKET, B2_ENDPOINT, B2_REGION) into the user-data which can be exposed; update generateCloudInit to stop writing secrets into the cloud-init payload (keep only non-sensitive values like agentId and CONTAINER_IMAGE) and instead either (a) document this limitation clearly and require callers to supply scoped/temporary credentials, or (b) change the bootstrap flow so the container fetches credentials at runtime from a secure mechanism (instance metadata service, a secrets endpoint, or an ephemeral token service) rather than baking them into runcmd; reference generateCloudInit, CONTAINER_IMAGE and the B2_* symbols when implementing the change.

Peyton-Spencer · 2026-02-14T00:23:21Z

Merge Conflicts Resolved ✅

Successfully merged main into feat/hetzner-backend. All conflicts are resolved.

The Hetzner backend implementation looks solid! Key strengths:

Architecture ✨

Ephemeral VMs: Cost-effective hourly billing (~€0.006/hr for cpx11)
S3-based I/O: Same proven pattern as Railway backend
Cloud-init bootstrap: Clean Docker container deployment
Full lifecycle management: SSH keys + server creation/deletion

Implementation Quality

Comprehensive error handling
Proper resource cleanup (SSH keys deleted after VM destruction)
Polling with timeouts for async operations
Extensive logging for debuggability

Next Steps

Test full agent run with real Hetzner account
Verify concurrent agent execution (multiple VMs)
Confirm cleanup on error paths
Document cost comparison vs. Sprites/Railway

This provides a great middle ground between local execution and full SaaS platforms. Ready for testing! 🚀

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@src/backends/hetzner-backend.ts`:
- Around line 163-189: createEphemeralServer can orphan a VM if
HetznerAPI.createServer succeeds but HetznerAPI.waitForAction or
HetznerAPI.waitForServerRunning throws; wrap the post-create waits in a
try/catch/finally and on any failure perform a best‑effort cleanup by calling
the Hetzner delete method (e.g., HetznerAPI.deleteServer(server.id)) and log the
cleanup attempt with logger.error/info including serverId and error details;
ensure the function rethrows the original error after cleanup so callers still
see the failure.
- Around line 164-245: The code hard-codes the project name "nanoclaw" in
server/container identifiers (serverName in the ephemeral server creation and
string literals inside generateCloudInit); make this configurable by introducing
an APP_NAME (or similar) config/env var with a neutral default (e.g., "app") and
replace occurrences of the literal "nanoclaw" used in serverName creation, the
ssh-key comment, container name, and any mounted paths or labels inside
generateCloudInit and related logging; ensure serverName =
`${appName}-${agentId}-${Date.now()}` and all template strings inside
generateCloudInit interpolate the appName variable instead of the hard-coded
value, preserving existing behavior when the env/config value is absent.
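
The orphan-prevention pattern from the first comment above can be sketched as follows. `ServerApi` and `createOrCleanUp` are hypothetical names with simplified signatures; they mirror the calls named in the review (`createServer`, `waitForAction`, `waitForServerRunning`, `deleteServer`) but are not the PR's code.

```typescript
// Hypothetical, simplified view of the API wrapper used during startup.
interface ServerApi {
  createServer(name: string): Promise<{ serverId: number; actionId: number }>;
  waitForAction(actionId: number): Promise<void>;
  waitForServerRunning(serverId: number): Promise<void>;
  deleteServer(serverId: number): Promise<void>;
}

async function createOrCleanUp(api: ServerApi, name: string): Promise<number> {
  const { serverId, actionId } = await api.createServer(name);
  try {
    await api.waitForAction(actionId);
    await api.waitForServerRunning(serverId);
    return serverId;
  } catch (err) {
    // The server exists but never became ready: delete it so it is not orphaned.
    try {
      await api.deleteServer(serverId);
    } catch (cleanupErr) {
      console.error('cleanup failed for server', serverId, cleanupErr);
    }
    throw err; // callers still see the original failure
  }
}
```

The cleanup error is logged rather than thrown so that it cannot replace the error that explains why startup failed.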

coderabbitai · 2026-02-14T01:20:16Z

+    const serverName = `nanoclaw-${agentId}-${Date.now()}`;
+
+    // No host-side SSH key needed — VMs are fully managed via cloud-init + S3.
+    // If the agent needs git SSH keys, cloud-init generates them on the VM
+    // and the agent can share the pubkey back via S3 outbox.
+    const userData = this.generateCloudInit(agentId);
+
+    const { server, action } = await HetznerAPI.createServer(
+      serverName,
+      HETZNER_SERVER_TYPE,
+      HETZNER_IMAGE,
+      HETZNER_LOCATION,
+      [], // No SSH keys — ephemeral VM, no SSH access needed
+      userData,
+    );
+
+    await HetznerAPI.waitForAction(action.id);
+    await HetznerAPI.waitForServerRunning(server.id);
+
+    logger.info(
+      { serverId: server.id, serverName, ip: server.public_net.ipv4.ip },
+      'Hetzner ephemeral server ready',
+    );
+
+    return { serverId: server.id };
+  }
+
+  private async destroyEphemeralServer(agentId: string): Promise<void> {
+    const serverCtx = this.servers.get(agentId);
+    if (!serverCtx) {
+      logger.warn({ agentId }, 'No Hetzner server context found to destroy');
+      return;
+    }
+
+    try {
+      await HetznerAPI.deleteServer(serverCtx.serverId);
+      logger.info({ serverId: serverCtx.serverId }, 'Destroyed Hetzner ephemeral server');
+    } catch (err) {
+      logger.warn({ serverId: serverCtx.serverId, error: err }, 'Failed to destroy Hetzner server');
+    } finally {
+      this.servers.delete(agentId);
+    }
+  }
+
+  /**
+   * Generate cloud-init user-data for ephemeral Hetzner VMs.
+   *
+   * The VM generates its own SSH key for git operations via ssh-keygen.
+   * The agent can share its pubkey back to the user via S3 outbox.
+   *
+   * NOTE: B2/S3 credentials are embedded in the cloud-init script. This is acceptable
+   * for ephemeral VMs that are destroyed after each agent run, but be aware that:
+   * - Credentials may be visible in Hetzner Cloud console (server details)
+   * - Credentials persist in VM logs (/var/log/cloud-init.log) until VM destruction
+   * For higher-security deployments, consider using scoped/temporary B2 application keys
+   * with limited bucket permissions and short TTLs.
+   */
+  private generateCloudInit(agentId: string): string {
+    return `#cloud-config
+package_update: true
+package_upgrade: true
+
+packages:
+  - docker.io
+  - docker-compose
+
+runcmd:
+  - systemctl start docker
+  - systemctl enable docker
+  - ssh-keygen -t ed25519 -f /root/.ssh/id_ed25519 -N "" -C "nanoclaw-${agentId}"
+  - ssh-keyscan github.com >> /root/.ssh/known_hosts 2>/dev/null
+  - docker pull ${CONTAINER_IMAGE}
+  - docker run -d --name nanoclaw-agent \\
+      -v /root/.ssh:/home/bun/.ssh:ro \\
+      -e NANOCLAW_S3_ENDPOINT=${B2_ENDPOINT} \\
+      -e NANOCLAW_S3_REGION=${B2_REGION} \\
+      -e NANOCLAW_S3_ACCESS_KEY_ID=${B2_ACCESS_KEY_ID} \\
+      -e NANOCLAW_S3_SECRET_ACCESS_KEY=${B2_SECRET_ACCESS_KEY} \\
+      -e NANOCLAW_S3_BUCKET=${B2_BUCKET} \\
+      -e NANOCLAW_AGENT_ID=${agentId} \\
+      ${CONTAINER_IMAGE}
+`;


⚠️ Potential issue | 🟡 Minor

Remove hard-coded project name from server/container identifiers.

Project-specific identifiers (nanoclaw-*) should be configurable rather than embedded in source. Consider using a generic env var (e.g., APP_NAME) with a neutral default.

🛠️ Suggested change (make app name configurable)

-    const serverName = `nanoclaw-${agentId}-${Date.now()}`;
+    const appName = process.env.APP_NAME || 'agent';
+    const serverName = `${appName}-${agentId}-${Date.now()}`;

-  - ssh-keygen -t ed25519 -f /root/.ssh/id_ed25519 -N "" -C "nanoclaw-${agentId}"
+  - ssh-keygen -t ed25519 -f /root/.ssh/id_ed25519 -N "" -C "${appName}-${agentId}"
...
-  - docker run -d --name nanoclaw-agent \\
+  - docker run -d --name ${appName}-agent \\

As per coding guidelines, "NEVER edit checked-in templates or source files with user-specific content like project names, directory paths, personal preferences, or org-specific details".

🤖 Prompt for AI Agents

In `@src/backends/hetzner-backend.ts` around lines 164 - 245, The code hard-codes the project name "nanoclaw" in server/container identifiers (serverName in the ephemeral server creation and string literals inside generateCloudInit); make this configurable by introducing an APP_NAME (or similar) config/env var with a neutral default (e.g., "app") and replace occurrences of the literal "nanoclaw" used in serverName creation, the ssh-key comment, container name, and any mounted paths or labels inside generateCloudInit and related logging; ensure serverName = `${appName}-${agentId}-${Date.now()}` and all template strings inside generateCloudInit interpolate the appName variable instead of the hard-coded value, preserving existing behavior when the env/config value is absent.

Add Hetzner Cloud backend to support ephemeral VM-based agent execution with S3-based I/O.

**New files:**
- `src/backends/hetzner-api.ts` - Hetzner Cloud API wrapper with server and SSH key lifecycle management
- `src/backends/hetzner-backend.ts` - Hetzner backend implementation with S3 inbox/outbox pattern

**Changes:**
- Add 'hetzner' to BackendType union in `src/backends/types.ts`
- Register HetznerBackend in `src/backends/index.ts`
- Add Hetzner config vars to `src/config.ts`:
  - HETZNER_API_TOKEN
  - HETZNER_LOCATION (default: ash - Ashburn, US)
  - HETZNER_SERVER_TYPE (default: cpx11 - 2 vCPU, 2GB RAM)
  - HETZNER_IMAGE (default: ubuntu-22.04)

**Architecture:**
- Ephemeral VMs: Create on-demand, destroy after each agent run
- S3-based I/O: Host writes to inbox, agent writes to outbox (same pattern as Railway)
- Cloud-init bootstrap: Installs Docker and runs nanoclaw container on VM startup
- SSH key management: Generates keypair, uploads to Hetzner, includes in server creation
- Cost-effective: Hourly billing (~€0.006/hr for cpx11), VMs destroyed immediately after use

**Benefits:**
- Lower cost than persistent cloud backends (pay per hour used, not per month)
- Simple infrastructure (standard VMs, no SaaS abstraction)
- US datacenter support (Ashburn, Hillsboro)
- Scales easily with different server types (cpx11, cpx21, cpx31, etc.)

**Usage:** Set agent.backend = 'hetzner' and configure HETZNER_API_TOKEN + B2 credentials.

**Test plan:**
- [ ] Verify TypeScript compilation
- [ ] Test server creation/deletion API calls
- [ ] Test SSH key lifecycle
- [ ] Test full agent run with S3 I/O
- [ ] Verify VM destruction after completion
- [ ] Test with multiple concurrent agents

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Remove host-side SSH key generation entirely; VMs are ephemeral, no SSH access needed. Cloud-init runs ssh-keygen on the VM for git operations, mounts keys into the container. Agent can share pubkey via S3 outbox.
- Handle HTTP 204 empty responses in Hetzner API (DELETE operations)
- Add runAgent() guard for uninitialized S3 client
- Add shutdown() method to clean up ephemeral servers
- Add 'hetzner' to filter_backend enum in agent MCP tools
- Return explicit timeout error instead of misleading success
- Replace generic HetznerResponse<T> with specific response types
- Document cloud-init credential exposure for ephemeral VMs
- Fix pre-existing async translateJid in WhatsApp reaction handler

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Wrap waitForAction/waitForServerRunning in try/catch with best-effort server deletion to prevent orphaned VMs on startup failure
- Replace hard-coded "nanoclaw" in server/container identifiers with configurable ASSISTANT_NAME

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 4

🤖 Fix all issues with AI agents

In `@src/backends/hetzner-api.ts`:
- Around line 1-4: The file-level JSDoc header includes the project-specific
name "NanoClaw"; remove that name and make the header generic by changing the
comment that currently reads "Hetzner Cloud API wrapper for NanoClaw. Provides
lifecycle management for Hetzner Cloud servers (VMs)." to a neutral form such as
"Hetzner Cloud API wrapper. Provides lifecycle management for Hetzner Cloud
servers (VMs)." — update the top-of-file header/JSDoc comment so it no longer
contains any project-specific identifiers.
- Around line 25-54: The hetznerApi function currently calls fetch without a
timeout; wrap the request in an AbortController with a configurable timeout
(e.g., constant or env var) so requests can be aborted on network hangs: create
an AbortController, set a timer to call controller.abort() after the timeout,
pass controller.signal into fetch (options.signal), and clear the timer after
fetch completes; handle the abort by catching the thrown error (e.g., checking
for DOMException/AbortError) and rethrowing or translating to a clear timeout
error. Ensure changes are applied inside hetznerApi (use the existing options
RequestInit, add signal) and clean up the timeout to avoid leaks.

In `@src/backends/hetzner-backend.ts`:
- Around line 314-330: The initialize() method currently only checks B2_ENDPOINT
before instantiating NanoClawS3, so missing B2 credentials cause later errors;
add explicit checks for B2_ACCESS_KEY_ID, B2_SECRET_ACCESS_KEY, and B2_BUCKET
before creating the NanoClawS3 client, log a clear warning (similar to the
existing B2_ENDPOINT warning) and return early if any are missing so NanoClawS3
is not constructed with undefined credentials.
- Around line 164-167: The server name built in createEphemeralServer uses
ASSISTANT_NAME.toLowerCase() directly which can include invalid characters for
Hetzner; sanitize ASSISTANT_NAME before composing serverName by normalizing to
lowercase, replacing any characters not allowed by RFC1123 (allow only a-z, 0-9,
hyphen, dot), collapsing multiple invalid chars to single hyphens, trimming
leading/trailing non-alphanumeric/hyphen/dot, and ensuring length limits
(truncate if needed) before creating serverName =
`${sanitizedAppName}-${agentId}-${Date.now()}`; update createEphemeralServer to
use the sanitizedAppName and add a small unit/validation helper (e.g.,
sanitizeServerName or normalizeAppName) to centralize the logic and reuse where
needed.
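
A helper of the kind described in the last comment might look like this. It is a sketch, not the PR's implementation: `sanitizeServerName` is the name floated in the review, the `'agent'` fallback is an assumption, and dots are dropped here (a stricter choice than the review's allowed set) so the result is always a single DNS-style label.

```typescript
// Normalise an arbitrary display name into an RFC 1123 style label:
// lowercase letters, digits and hyphens, alphanumeric at both ends, max 63 chars.
function sanitizeServerName(raw: string, maxLength = 63): string {
  const cleaned = raw
    .toLowerCase()
    .replace(/[^a-z0-9-]+/g, '-') // runs of invalid characters become one hyphen
    .replace(/-{2,}/g, '-')       // collapse repeated hyphens
    .slice(0, maxLength)          // enforce the length limit
    .replace(/^-+|-+$/g, '');     // trim hyphens left at either end
  return cleaned || 'agent';      // hypothetical fallback when nothing survives
}

console.log(sanitizeServerName('My Assistant!'));
```

Truncating before trimming matters: a cut at the length limit can leave a trailing hyphen, which the final `replace` then removes.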

coderabbitai · 2026-02-14T03:24:53Z

+async function hetznerApi<T>(
+  method: string,
+  endpoint: string,
+  body?: unknown,
+): Promise<T> {
+  if (!HETZNER_API_TOKEN) {
+    throw new Error('HETZNER_API_TOKEN not set');
+  }
+
+  const url = `${HETZNER_API_URL}${endpoint}`;
+  const options: RequestInit = {
+    method,
+    headers: {
+      'Authorization': `Bearer ${HETZNER_API_TOKEN}`,
+      'Content-Type': 'application/json',
+    },
+  };
+
+  if (body) {
+    options.body = JSON.stringify(body);
+  }
+
+  const resp = await fetch(url, options);
+
+  // Handle empty responses (204 No Content, e.g. DELETE operations)
+  let json: unknown = {};
+  const contentLength = resp.headers.get('content-length');
+  if (resp.status !== 204 && contentLength !== '0') {
+    json = await resp.json();
+  }


⚠️ Potential issue | 🔴 Critical

Add request timeouts to avoid hanging Hetzner API calls.
fetch without a timeout can stall indefinitely on network issues and block backend workflows.

🛠️ Suggested fix (AbortController timeout)

-  const resp = await fetch(url, options);
+  const controller = new AbortController();
+  const timeoutId = setTimeout(() => controller.abort(), 30000);
+  let resp: Response;
+  try {
+    resp = await fetch(url, { ...options, signal: controller.signal });
+  } finally {
+    clearTimeout(timeoutId);
+  }

🤖 Prompt for AI Agents

In `@src/backends/hetzner-api.ts` around lines 25 - 54, The hetznerApi function currently calls fetch without a timeout; wrap the request in an AbortController with a configurable timeout (e.g., constant or env var) so requests can be aborted on network hangs: create an AbortController, set a timer to call controller.abort() after the timeout, pass controller.signal into fetch (options.signal), and clear the timer after fetch completes; handle the abort by catching the thrown error (e.g., checking for DOMException/AbortError) and rethrowing or translating to a clear timeout error. Ensure changes are applied inside hetznerApi (use the existing options RequestInit, add signal) and clean up the timeout to avoid leaks.
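
The prompt above also asks for the abort to be translated into a clear timeout error. One way to do that, sketched here with a hypothetical `fetchWithTimeout` helper, is to record that the timer fired instead of inspecting the error's type. The injectable `fetchImpl` parameter is an addition for illustration so the helper can be exercised without network access.

```typescript
// Hypothetical helper: run a fetch-like call with a deadline and turn an
// abort caused by that deadline into a descriptive timeout error.
type FetchLike = (url: string, init?: RequestInit) => Promise<Response>;

async function fetchWithTimeout(
  url: string,
  init: RequestInit,
  timeoutMs: number,
  fetchImpl: FetchLike = fetch,
): Promise<Response> {
  const controller = new AbortController();
  let timedOut = false;
  const timer = setTimeout(() => {
    timedOut = true;
    controller.abort();
  }, timeoutMs);
  try {
    return await fetchImpl(url, { ...init, signal: controller.signal });
  } catch (err) {
    if (timedOut) {
      throw new Error(`Request to ${url} timed out after ${timeoutMs}ms`);
    }
    throw err; // unrelated failures pass through unchanged
  } finally {
    clearTimeout(timer); // avoid leaking the timer on success or failure
  }
}
```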

… validation

- Remove project-specific name from hetzner-api.ts header
- Add 30s AbortController timeout to all Hetzner API fetch calls
- Sanitize server names to RFC 1123 (strip invalid chars, truncate to 63)
- Validate all B2 credentials (not just endpoint) before creating S3 client
- Remove non-null assertions now that credentials are validated

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…omment

Database improvements from stability audit:

1. **Transaction Support for deleteTask (MEDIUM)**
   - Wrap DELETE operations in explicit transaction
   - Ensures both child and parent deletions succeed atomically
   - Prevents partial deletion leaving orphaned task_run_logs
2. **SQL Injection Safety Documentation (HIGH)**
   - Add security comment to updateTask explaining safety assumptions
   - Document that field names are hardcoded (not user-controlled)
   - Warn future maintainers about SQL injection risks if logic changes

Impact:
- Prevents database corruption from partial task deletions
- Documents security assumptions for future code reviewers
- Hardens codebase against accidental SQL injection introduction

Related:
- Audit report: nanoclaw-stability-audit-2026-02-14.md
- Issues #3, #12 from stability audit

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: stability quick wins from 2026-02-14 audit

Implements three critical stability fixes identified in the audit:

1. **Unhandled Promise Rejection Handler (CRITICAL)**
   - Add process.on('unhandledRejection') to prevent crashes
   - Logs rejections instead of exiting to maintain service uptime
   - Prevents complete service outage from uncaught promise errors

2. **WhatsApp Event Listener Memory Leak (CRITICAL)**
   - Store event handlers and remove them before reconnection
   - Prevents exponential handler accumulation on reconnects
   - Fixes memory leak leading to eventual OOM crashes

3. **Group Folder Path Traversal (MEDIUM)**
   - Validate folder names with regex (alphanumeric + _ -)
   - Verify resolved paths stay within groups directory
   - Prevents malicious group registration from writing to arbitrary paths

Impact:
- Prevents process crashes from unhandled rejections
- Fixes production memory leak in WhatsApp channel
- Hardens security against path traversal attacks

Related:
- Audit report: nanoclaw-stability-audit-2026-02-14.md
- Issues #1, #4, #16 from stability audit

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: add transaction support to deleteTask and SQL injection safety comment

Database improvements from stability audit:

1. **Transaction Support for deleteTask (MEDIUM)**
   - Wrap DELETE operations in explicit transaction
   - Ensures both child and parent deletions succeed atomically
   - Prevents partial deletion leaving orphaned task_run_logs

2. **SQL Injection Safety Documentation (HIGH)**
   - Add security comment to updateTask explaining safety assumptions
   - Document that field names are hardcoded (not user-controlled)
   - Warn future maintainers about SQL injection risks if logic changes

Impact:
- Prevents database corruption from partial task deletions
- Documents security assumptions for future code reviewers
- Hardens codebase against accidental SQL injection introduction

Related:
- Audit report: nanoclaw-stability-audit-2026-02-14.md
- Issues #3, #12 from stability audit

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: NanoClaw Agent <nanoclaw@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
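The group folder path traversal fix listed in the squashed commit above combines two checks: a name pattern and a containment check on the resolved path. A minimal sketch of that idea follows, assuming a `groups` base directory. The names `resolveGroupFolder`, `GROUPS_DIR`, and `FOLDER_NAME_PATTERN` are hypothetical and not taken from the repository.

```typescript
import * as path from "node:path";

// Hypothetical base directory for registered groups.
const GROUPS_DIR = path.resolve("groups");

// Alphanumeric characters plus underscore and hyphen, as the commit describes.
const FOLDER_NAME_PATTERN = /^[A-Za-z0-9_-]+$/;

// Returns the absolute folder path for a group, or throws if the name is
// malformed or the resolved path would land outside the base directory.
function resolveGroupFolder(folder: string, baseDir: string = GROUPS_DIR): string {
  if (!FOLDER_NAME_PATTERN.test(folder)) {
    throw new Error(`Invalid group folder name: ${JSON.stringify(folder)}`);
  }
  const base = path.resolve(baseDir);
  const resolved = path.resolve(base, folder);
  // Defense in depth: even with the pattern check, confirm containment.
  if (!resolved.startsWith(base + path.sep)) {
    throw new Error(`Group folder escapes base directory: ${resolved}`);
  }
  return resolved;
}
```

The pattern alone already rejects `..` and path separators; the containment check is kept as a second layer so that loosening the pattern later does not silently reopen the traversal.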

coderabbitai bot reviewed Feb 13, 2026

Peyton-Spencer force-pushed the feat/hetzner-backend branch from 677dae9 to 41dec34 Compare February 14, 2026 01:13

coderabbitai bot reviewed Feb 14, 2026

nanoclaw and others added 3 commits February 13, 2026 22:16

Peyton-Spencer force-pushed the feat/hetzner-backend branch from 41dec34 to b7c2d1b Compare February 14, 2026 03:17

coderabbitai bot reviewed Feb 14, 2026

Peyton-Spencer mentioned this pull request Feb 14, 2026

fix: Critical stability improvements from 2026-02-14 audit #19

Merged

3 tasks

Peyton-Spencer merged commit 0bb9124 into main Feb 14, 2026
1 of 3 checks passed

Peyton-Spencer deleted the feat/hetzner-backend branch February 14, 2026 04:13

This was referenced Feb 14, 2026

feat: dual-lane queue for concurrent message and task execution #43

Merged

fix: address CodeRabbit review feedback for PR #115 #117

Closed

Peyton-Spencer mentioned this pull request Feb 18, 2026

refactor: extract schedule calculation utility #128

Merged

Conversation

Peyton-Spencer commented Feb 13, 2026 • edited by coderabbitai bot

coderabbitai bot commented Feb 13, 2026 • edited

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

Peyton-Spencer commented Feb 14, 2026

Merge Conflicts Resolved ✅

Architecture ✨

Implementation Quality

Next Steps

2 participants
