feat: Add Hetzner Cloud backend for ephemeral VM agents #12
Peyton-Spencer merged 4 commits into main
Conversation
📝 Walkthrough
Adds Hetzner Cloud support: new Hetzner API client and HetznerBackend for ephemeral VM provisioning with cloud-init and S3-based IPC, updates backend types/factory and configuration exports, and extends the agent-runner filter to accept `hetzner`.
Sequence Diagram
sequenceDiagram
actor Client
participant HB as HetznerBackend
participant S3 as S3 Storage
participant HAPI as Hetzner API
participant HC as Hetzner Cloud / VM
Client->>HB: runAgent(group, input)
HB->>S3: upload workspace + push inbox message
HB->>HAPI: createServer(cloud-init)
HAPI->>HC: POST /servers
HC-->>HAPI: server id + action id
HAPI-->>HB: {server, action}
HB->>HAPI: waitForServerRunning(server_id)
loop polling
HAPI->>HC: GET /servers/{id}
HC-->>HAPI: status
end
HC->>S3: VM/agent reads inbox, runs container, writes outbox
loop poll outbox
HB->>S3: check outbox for results
S3-->>HB: output files / signals
end
HB->>HAPI: deleteServer(server_id)
HAPI->>HC: DELETE /servers/{id}
HB-->>Client: return ContainerOutput
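The flow in the diagram can be sketched as one function with a `finally` block, so the VM is deleted whether the run succeeds, times out, or throws. This is a minimal sketch and not the PR's implementation: the `VmApi` and `Storage` interfaces, their method names, and `runEphemeralAgent` are hypothetical stand-ins for the Hetzner client and the S3 client.

```typescript
// Hypothetical interfaces standing in for the Hetzner API client and the S3 client.
interface VmApi {
  createServer(userData: string): Promise<{ id: number }>;
  waitForServerRunning(id: number): Promise<void>;
  deleteServer(id: number): Promise<void>;
}

interface Storage {
  pushInbox(agentId: string, message: string): Promise<void>;
  // Resolves with the agent's result, or null if nothing arrived in time.
  pollOutbox(agentId: string, timeoutMs: number): Promise<string | null>;
}

type RunResult =
  | { status: 'success'; result: string }
  | { status: 'error'; error: string };

async function runEphemeralAgent(
  api: VmApi,
  storage: Storage,
  agentId: string,
  input: string,
  timeoutMs: number,
): Promise<RunResult> {
  await storage.pushInbox(agentId, input);
  const server = await api.createServer(`#cloud-config\n# agent ${agentId}`);
  try {
    await api.waitForServerRunning(server.id);
    const result = await storage.pollOutbox(agentId, timeoutMs);
    if (result === null) {
      // Report the timeout explicitly instead of returning a stale success.
      return { status: 'error', error: `Agent timed out after ${timeoutMs}ms` };
    }
    return { status: 'success', result };
  } finally {
    // Runs on success, timeout, and thrown errors alike.
    await api.deleteServer(server.id);
  }
}
```

In this sketch a failing `deleteServer` would replace the run's result with its own error; the PR's `destroyEphemeralServer` instead logs a warning, which is probably the better choice in production.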
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ Passed checks (4 passed)
No actionable comments were generated in the recent review. 🎉
Actionable comments posted: 6
🤖 Fix all issues with AI agents
In `@container/agent-runner/src/ipc-mcp-stdio.ts`:
- Around line 576-577: The filter_backend enum definition (the z.enum call for
filter_backend) is missing the new 'hetzner' option—add 'hetzner' to the enum
array so users can filter by the Hetzner backend; update the
z.enum(['apple-container','docker','sprites','daytona','railway']) to include
'hetzner' and run/typecheck any code that consumes filter_backend to ensure
compatibility (e.g., places that narrow or switch on filter_backend values).
In `@src/backends/hetzner-api.ts`:
- Around line 44-52: The code calls await resp.json() unconditionally which will
throw on empty bodies (e.g. 204 No Content); modify the block in
src/backends/hetzner-api.ts to first detect empty responses (check resp.status
=== 204 or resp.headers.get('content-length') === '0' or
resp.headers.get('content-type') missing) and only call await resp.json() when a
body exists; if empty, set json to null/undefined and handle error branching so
the existing resp.ok check still throws Hetzner errors when appropriate and
successful empty responses return a sensible value (e.g., undefined) castable to
T.
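The empty-body check described above can be isolated into a small predicate. This is a sketch under assumptions: `hasJsonBody` and `parseBody` are hypothetical helper names, and the predicate treats only status 204 and `Content-Length: 0` as empty, which is narrower than the suggestion to also look for a missing `Content-Type`.

```typescript
// Hypothetical helper: decide whether it is safe to parse a JSON body.
// `contentLength` is the raw Content-Length header value, or null if absent.
function hasJsonBody(status: number, contentLength: string | null): boolean {
  if (status === 204) return false; // No Content, e.g. DELETE operations
  if (contentLength === '0') return false; // explicitly empty body
  return true;
}

// Parse a body that has already been read as text; empty input yields undefined.
function parseBody<T>(
  status: number,
  contentLength: string | null,
  text: string,
): T | undefined {
  if (!hasJsonBody(status, contentLength) || text.trim() === '') {
    return undefined;
  }
  return JSON.parse(text) as T;
}
```

Reading the body as text first means a response that omits `Content-Length` but is still empty does not throw in `JSON.parse`.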
In `@src/backends/hetzner-backend.ts`:
- Around line 223-244: The cloud-init generated by generateCloudInit embeds
sensitive B2/S3 credentials (B2_ACCESS_KEY_ID, B2_SECRET_ACCESS_KEY, B2_BUCKET,
B2_ENDPOINT, B2_REGION) into the user-data which can be exposed; update
generateCloudInit to stop writing secrets into the cloud-init payload (keep only
non-sensitive values like agentId and CONTAINER_IMAGE) and instead either (a)
document this limitation clearly and require callers to supply scoped/temporary
credentials, or (b) change the bootstrap flow so the container fetches
credentials at runtime from a secure mechanism (instance metadata service, a
secrets endpoint, or an ephemeral token service) rather than baking them into
runcmd; reference generateCloudInit, CONTAINER_IMAGE and the B2_* symbols when
implementing the change.
- Around line 312-320: The initialize() method currently returns early when
HETZNER_API_TOKEN or B2_ENDPOINT are missing, leaving this.s3 uninitialized and
causing runAgent() to crash when it calls syncFilesToS3(this.s3,...); either
make initialize() fail fast by throwing a descriptive Error when required env
vars are missing (so callers cannot proceed without a valid Hetzner backend) or
add a defensive guard at the start of runAgent() to check that this.s3 is
initialized (log/throw a clear error and return) before calling syncFilesToS3;
update the code paths around initialize(), runAgent(), this.s3, and
syncFilesToS3 to ensure one of these fixes is applied consistently.
- Around line 247-257: The pemToOpenSSH method is producing invalid OpenSSH keys
by naively slicing the PEM; replace its logic to parse the SPKI PEM properly
using the sshpk library: import sshpk, call sshpk.parseKey(pemPublicKey, "pem")
and then serialize with key.toString("ssh") in pemToOpenSSH so Hetzner receives
a valid OpenSSH-formatted public key (add sshpk as a dependency).
- Around line 57-331: HetznerBackend is missing the shutdown(): Promise<void>
required by AgentBackend — add an async shutdown method on the HetznerBackend
class that cleanly tears down resources: call destroyEphemeralServer for any
tracked servers (use this.servers.keys() or iterate this.servers to remove
them), await deletion of any remaining SSH keys/servers via
destroyEphemeralServer or HetznerAPI, and gracefully close or flush the
NanoClawS3 client (this.s3) if it exposes a close/cleanup method; ensure
shutdown returns a Promise<void> and does not throw on missing initialization.
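The two options raised for the uninitialized `this.s3` problem (fail fast in `initialize()`, or guard at the top of `runAgent()`) might look like this. It is a simplified sketch, not the PR's code: `GuardedBackend`, `S3Like`, and `requireS3` are hypothetical names, and the environment is passed in as a plain object so the example stays self-contained.

```typescript
// Hypothetical stand-in for the S3 client type used by the backend.
interface S3Like {
  endpoint: string;
}

class GuardedBackend {
  private s3: S3Like | null = null;

  // Option 1: fail fast. Throw instead of returning early with s3 unset.
  initialize(env: Record<string, string | undefined>): void {
    const token = env.HETZNER_API_TOKEN;
    const endpoint = env.B2_ENDPOINT;
    if (!token || !endpoint) {
      throw new Error('Hetzner backend requires HETZNER_API_TOKEN and B2_ENDPOINT');
    }
    this.s3 = { endpoint };
  }

  // Option 2: defensive guard, called at the top of runAgent().
  requireS3(): S3Like {
    if (!this.s3) {
      throw new Error('Hetzner backend not initialized: S3 client is missing');
    }
    return this.s3;
  }
}
```

Returning the narrowed client from `requireS3()` also removes the need for non-null assertions at the call sites.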
🧹 Nitpick comments (4)
src/backends/hetzner-backend.ts (3)
10-11: Minor: Redundant crypto imports.
`crypto` is imported as default and `generateKeyPairSync` is imported separately. Consider consolidating.
♻️ Suggested consolidation
-import crypto from 'crypto';
-import { generateKeyPairSync } from 'crypto';
+import crypto, { generateKeyPairSync } from 'crypto';
164-169: Unused `privateKey` variable.
The `privateKey` is generated but never used. If SSH access isn't needed for ephemeral VMs, consider documenting this explicitly or removing the unnecessary generation.
♻️ If private key isn't needed
-    const { publicKey, privateKey } = generateKeyPairSync('rsa', {
+    const { publicKey } = generateKeyPairSync('rsa', {
       modulusLength: 2048,
       publicKeyEncoding: { type: 'spki', format: 'pem' },
-      privateKeyEncoding: { type: 'pkcs8', format: 'pem' },
     });
149-154: Consider returning error status on timeout.
When the agent times out, it returns `lastOutput`, which may still have `status: 'success'` with `result: null`. This could be misleading to callers. Consider explicitly returning an error status for timeouts.
♻️ Explicit timeout error
     logger.warn({ group: groupName, timeout: configTimeout }, 'Hetzner agent timed out waiting for S3 outbox');
-    return lastOutput;
+    return { status: 'error', result: lastOutput.result, error: `Agent timed out after ${configTimeout}ms` };
src/backends/hetzner-api.ts (1)
11-13: Generic response type is confusing and imprecise.
`HetznerResponse<T>` with `[key: string]: T` doesn't accurately model Hetzner's actual response structure. Different endpoints return different shapes:
- `/ssh_keys` → `{ ssh_key: {...} }`
- `/servers` → `{ server: {...}, action: {...} }`
- `/actions` → `{ action: {...} }`
Consider using specific response types or a union to make the code more self-documenting.
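One way to follow the last suggestion is a separate interface per endpoint plus a runtime guard for the shape the backend depends on most. The sketch below is illustrative only: the interface names and `isCreateServerResponse` are hypothetical, and the field lists are deliberately minimal rather than a full model of the Hetzner API.

```typescript
// Minimal field sets; real Hetzner responses carry many more fields.
interface HetznerAction { id: number; status: string }
interface HetznerServer { id: number; name: string; status: string }
interface HetznerSshKey { id: number; name: string }

// One response type per endpoint shape listed in the review comment.
interface CreateServerResponse { server: HetznerServer; action: HetznerAction }
interface ActionResponse { action: HetznerAction }
interface SshKeyResponse { ssh_key: HetznerSshKey }

// Narrow an unknown JSON payload before trusting its shape.
function isCreateServerResponse(json: unknown): json is CreateServerResponse {
  if (typeof json !== 'object' || json === null) return false;
  const obj = json as Record<string, unknown>;
  const server = obj.server as Record<string, unknown> | undefined;
  const action = obj.action as Record<string, unknown> | undefined;
  return (
    typeof server === 'object' && server !== null && typeof server.id === 'number' &&
    typeof action === 'object' && action !== null && typeof action.id === 'number'
  );
}
```

A guard like this turns an unexpected API payload into an explicit error at the boundary instead of an `undefined` property access later in the run.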
  private generateCloudInit(agentId: string): string {
    return `#cloud-config
package_update: true
package_upgrade: true

packages:
  - docker.io
  - docker-compose

runcmd:
  - systemctl start docker
  - systemctl enable docker
  - docker pull ${CONTAINER_IMAGE}
  - docker run -d --name nanoclaw-agent \\
      -e NANOCLAW_S3_ENDPOINT=${B2_ENDPOINT} \\
      -e NANOCLAW_S3_REGION=${B2_REGION} \\
      -e NANOCLAW_S3_ACCESS_KEY_ID=${B2_ACCESS_KEY_ID} \\
      -e NANOCLAW_S3_SECRET_ACCESS_KEY=${B2_SECRET_ACCESS_KEY} \\
      -e NANOCLAW_S3_BUCKET=${B2_BUCKET} \\
      -e NANOCLAW_AGENT_ID=${agentId} \\
      ${CONTAINER_IMAGE}
`;
Security consideration: Credentials in cloud-init user-data.
S3/B2 credentials (access key ID and secret) are embedded in the cloud-init script which may be visible in:
- Hetzner Cloud console (server details)
- VM logs (`/var/log/cloud-init.log`)
- Instance metadata endpoints
For ephemeral short-lived VMs this risk may be acceptable, but consider documenting this limitation or exploring alternatives like:
- Using scoped/temporary credentials with limited permissions
- Hetzner Cloud's metadata service with encrypted user-data
Merge Conflicts Resolved ✅
Successfully merged
The Hetzner backend implementation looks solid! Key strengths:
Architecture ✨
Implementation Quality
Next Steps
This provides a great middle ground between local execution and full SaaS platforms. Ready for testing! 🚀
677dae9 to 41dec34
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/backends/hetzner-backend.ts`:
- Around line 163-189: createEphemeralServer can orphan a VM if
HetznerAPI.createServer succeeds but HetznerAPI.waitForAction or
HetznerAPI.waitForServerRunning throws; wrap the post-create waits in a
try/catch/finally and on any failure perform a best‑effort cleanup by calling
the Hetzner delete method (e.g., HetznerAPI.deleteServer(server.id)) and log the
cleanup attempt with logger.error/info including serverId and error details;
ensure the function rethrows the original error after cleanup so callers still
see the failure.
- Around line 164-245: The code hard-codes the project name "nanoclaw" in
server/container identifiers (serverName in the ephemeral server creation and
string literals inside generateCloudInit); make this configurable by introducing
an APP_NAME (or similar) config/env var with a neutral default (e.g., "app") and
replace occurrences of the literal "nanoclaw" used in serverName creation, the
ssh-key comment, container name, and any mounted paths or labels inside
generateCloudInit and related logging; ensure serverName =
`${appName}-${agentId}-${Date.now()}` and all template strings inside
generateCloudInit interpolate the appName variable instead of the hard-coded
value, preserving existing behavior when the env/config value is absent.
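The orphaned-VM fix from the first item can be sketched as a wrapper that deletes the server when either wait fails and then rethrows the original error. This is an assumption-laden sketch rather than the PR's code: `ServerApi` and `createServerOrCleanUp` are hypothetical, with method names that only mirror the calls named in the review comment.

```typescript
// Hypothetical minimal API surface; names mirror the calls in the review comment.
interface ServerApi {
  createServer(name: string): Promise<{ serverId: number; actionId: number }>;
  waitForAction(actionId: number): Promise<void>;
  waitForServerRunning(serverId: number): Promise<void>;
  deleteServer(serverId: number): Promise<void>;
}

async function createServerOrCleanUp(api: ServerApi, name: string): Promise<number> {
  const { serverId, actionId } = await api.createServer(name);
  try {
    await api.waitForAction(actionId);
    await api.waitForServerRunning(serverId);
    return serverId;
  } catch (err) {
    // Best-effort cleanup: a failed delete must not mask the original error.
    try {
      await api.deleteServer(serverId);
    } catch {
      // A real implementation would log the cleanup failure here.
    }
    throw err;
  }
}
```

The nested `try` matters: without it, a deletion failure would replace the startup error the caller actually needs to see.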
    const serverName = `nanoclaw-${agentId}-${Date.now()}`;

    // No host-side SSH key needed — VMs are fully managed via cloud-init + S3.
    // If the agent needs git SSH keys, cloud-init generates them on the VM
    // and the agent can share the pubkey back via S3 outbox.
    const userData = this.generateCloudInit(agentId);

    const { server, action } = await HetznerAPI.createServer(
      serverName,
      HETZNER_SERVER_TYPE,
      HETZNER_IMAGE,
      HETZNER_LOCATION,
      [], // No SSH keys — ephemeral VM, no SSH access needed
      userData,
    );

    await HetznerAPI.waitForAction(action.id);
    await HetznerAPI.waitForServerRunning(server.id);

    logger.info(
      { serverId: server.id, serverName, ip: server.public_net.ipv4.ip },
      'Hetzner ephemeral server ready',
    );

    return { serverId: server.id };
  }

  private async destroyEphemeralServer(agentId: string): Promise<void> {
    const serverCtx = this.servers.get(agentId);
    if (!serverCtx) {
      logger.warn({ agentId }, 'No Hetzner server context found to destroy');
      return;
    }

    try {
      await HetznerAPI.deleteServer(serverCtx.serverId);
      logger.info({ serverId: serverCtx.serverId }, 'Destroyed Hetzner ephemeral server');
    } catch (err) {
      logger.warn({ serverId: serverCtx.serverId, error: err }, 'Failed to destroy Hetzner server');
    } finally {
      this.servers.delete(agentId);
    }
  }

  /**
   * Generate cloud-init user-data for ephemeral Hetzner VMs.
   *
   * The VM generates its own SSH key for git operations via ssh-keygen.
   * The agent can share its pubkey back to the user via S3 outbox.
   *
   * NOTE: B2/S3 credentials are embedded in the cloud-init script. This is acceptable
   * for ephemeral VMs that are destroyed after each agent run, but be aware that:
   * - Credentials may be visible in Hetzner Cloud console (server details)
   * - Credentials persist in VM logs (/var/log/cloud-init.log) until VM destruction
   * For higher-security deployments, consider using scoped/temporary B2 application keys
   * with limited bucket permissions and short TTLs.
   */
  private generateCloudInit(agentId: string): string {
    return `#cloud-config
package_update: true
package_upgrade: true

packages:
  - docker.io
  - docker-compose

runcmd:
  - systemctl start docker
  - systemctl enable docker
  - ssh-keygen -t ed25519 -f /root/.ssh/id_ed25519 -N "" -C "nanoclaw-${agentId}"
  - ssh-keyscan github.com >> /root/.ssh/known_hosts 2>/dev/null
  - docker pull ${CONTAINER_IMAGE}
  - docker run -d --name nanoclaw-agent \\
      -v /root/.ssh:/home/bun/.ssh:ro \\
      -e NANOCLAW_S3_ENDPOINT=${B2_ENDPOINT} \\
      -e NANOCLAW_S3_REGION=${B2_REGION} \\
      -e NANOCLAW_S3_ACCESS_KEY_ID=${B2_ACCESS_KEY_ID} \\
      -e NANOCLAW_S3_SECRET_ACCESS_KEY=${B2_SECRET_ACCESS_KEY} \\
      -e NANOCLAW_S3_BUCKET=${B2_BUCKET} \\
      -e NANOCLAW_AGENT_ID=${agentId} \\
      ${CONTAINER_IMAGE}
`;
Remove hard-coded project name from server/container identifiers.
Project-specific identifiers (nanoclaw-*) should be configurable rather than embedded in source. Consider using a generic env var (e.g., APP_NAME) with a neutral default.
🛠️ Suggested change (make app name configurable)
- const serverName = `nanoclaw-${agentId}-${Date.now()}`;
+ const appName = process.env.APP_NAME || 'agent';
+ const serverName = `${appName}-${agentId}-${Date.now()}`;
- - ssh-keygen -t ed25519 -f /root/.ssh/id_ed25519 -N "" -C "nanoclaw-${agentId}"
+ - ssh-keygen -t ed25519 -f /root/.ssh/id_ed25519 -N "" -C "${appName}-${agentId}"
...
- - docker run -d --name nanoclaw-agent \\
+ - docker run -d --name ${appName}-agent \\
As per coding guidelines, "NEVER edit checked-in templates or source files with user-specific content like project names, directory paths, personal preferences, or org-specific details".
Add Hetzner Cloud backend to support ephemeral VM-based agent execution with S3-based I/O.
**New files:**
- `src/backends/hetzner-api.ts` - Hetzner Cloud API wrapper with server and SSH key lifecycle management
- `src/backends/hetzner-backend.ts` - Hetzner backend implementation with S3 inbox/outbox pattern
**Changes:**
- Add 'hetzner' to BackendType union in `src/backends/types.ts`
- Register HetznerBackend in `src/backends/index.ts`
- Add Hetzner config vars to `src/config.ts`:
  - HETZNER_API_TOKEN
  - HETZNER_LOCATION (default: ash - Ashburn, US)
  - HETZNER_SERVER_TYPE (default: cpx11 - 2 vCPU, 2GB RAM)
  - HETZNER_IMAGE (default: ubuntu-22.04)
**Architecture:**
- Ephemeral VMs: Create on-demand, destroy after each agent run
- S3-based I/O: Host writes to inbox, agent writes to outbox (same pattern as Railway)
- Cloud-init bootstrap: Installs Docker and runs nanoclaw container on VM startup
- SSH key management: Generates keypair, uploads to Hetzner, includes in server creation
- Cost-effective: Hourly billing (~€0.006/hr for cpx11), VMs destroyed immediately after use
**Benefits:**
- Lower cost than persistent cloud backends (pay per hour used, not per month)
- Simple infrastructure (standard VMs, no SaaS abstraction)
- US datacenter support (Ashburn, Hillsboro)
- Scales easily with different server types (cpx11, cpx21, cpx31, etc.)
**Usage:**
Set agent.backend = 'hetzner' and configure HETZNER_API_TOKEN + B2 credentials.
**Test plan:**
- [ ] Verify TypeScript compilation
- [ ] Test server creation/deletion API calls
- [ ] Test SSH key lifecycle
- [ ] Test full agent run with S3 I/O
- [ ] Verify VM destruction after completion
- [ ] Test with multiple concurrent agents
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Remove host-side SSH key generation entirely — VMs are ephemeral, no SSH access needed. Cloud-init runs ssh-keygen on the VM for git operations, mounts keys into the container. Agent can share pubkey via S3 outbox.
- Handle HTTP 204 empty responses in Hetzner API (DELETE operations)
- Add runAgent() guard for uninitialized S3 client
- Add shutdown() method to clean up ephemeral servers
- Add 'hetzner' to filter_backend enum in agent MCP tools
- Return explicit timeout error instead of misleading success
- Replace generic HetznerResponse<T> with specific response types
- Document cloud-init credential exposure for ephemeral VMs
- Fix pre-existing async translateJid in WhatsApp reaction handler
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Wrap waitForAction/waitForServerRunning in try/catch with best-effort server deletion to prevent orphaned VMs on startup failure
- Replace hard-coded "nanoclaw" in server/container identifiers with configurable ASSISTANT_NAME
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
41dec34 to b7c2d1b
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@src/backends/hetzner-api.ts`:
- Around line 1-4: The file-level JSDoc header includes the project-specific
name "NanoClaw"; remove that name and make the header generic by changing the
comment that currently reads "Hetzner Cloud API wrapper for NanoClaw. Provides
lifecycle management for Hetzner Cloud servers (VMs)." to a neutral form such as
"Hetzner Cloud API wrapper. Provides lifecycle management for Hetzner Cloud
servers (VMs)." — update the top-of-file header/JSDoc comment so it no longer
contains any project-specific identifiers.
- Around line 25-54: The hetznerApi function currently calls fetch without a
timeout; wrap the request in an AbortController with a configurable timeout
(e.g., constant or env var) so requests can be aborted on network hangs: create
an AbortController, set a timer to call controller.abort() after the timeout,
pass controller.signal into fetch (options.signal), and clear the timer after
fetch completes; handle the abort by catching the thrown error (e.g., checking
for DOMException/AbortError) and rethrowing or translating to a clear timeout
error. Ensure changes are applied inside hetznerApi (use the existing options
RequestInit, add signal) and clean up the timeout to avoid leaks.
In `@src/backends/hetzner-backend.ts`:
- Around line 314-330: The initialize() method currently only checks B2_ENDPOINT
before instantiating NanoClawS3, so missing B2 credentials cause later errors;
add explicit checks for B2_ACCESS_KEY_ID, B2_SECRET_ACCESS_KEY, and B2_BUCKET
before creating the NanoClawS3 client, log a clear warning (similar to the
existing B2_ENDPOINT warning) and return early if any are missing so NanoClawS3
is not constructed with undefined credentials.
- Around line 164-167: The server name built in createEphemeralServer uses
ASSISTANT_NAME.toLowerCase() directly which can include invalid characters for
Hetzner; sanitize ASSISTANT_NAME before composing serverName by normalizing to
lowercase, replacing any characters not allowed by RFC1123 (allow only a-z, 0-9,
hyphen, dot), collapsing multiple invalid chars to single hyphens, trimming
leading/trailing non-alphanumeric/hyphen/dot, and ensuring length limits
(truncate if needed) before creating serverName =
`${sanitizedAppName}-${agentId}-${Date.now()}`; update createEphemeralServer to
use the sanitizedAppName and add a small unit/validation helper (e.g.,
sanitizeServerName or normalizeAppName) to centralize the logic and reuse where
needed.
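The sanitization steps in the last item (lowercase, replace invalid characters, collapse hyphens, truncate, trim the ends) could be centralized in a helper like the one below. It is a sketch, not the merged code: `sanitizeServerName` and the `'agent'` fallback are hypothetical, and the 63-character default is taken from the limit mentioned in this PR's commit message.

```typescript
// Hypothetical helper; the 63-character default follows this PR's commit message.
function sanitizeServerName(raw: string, maxLength = 63): string {
  const cleaned = raw
    .toLowerCase()
    .replace(/[^a-z0-9.-]+/g, '-') // each run of invalid characters becomes one hyphen
    .replace(/-{2,}/g, '-') // collapse repeated hyphens
    .slice(0, maxLength) // truncate before trimming so the end stays valid
    .replace(/^[^a-z0-9]+|[^a-z0-9]+$/g, ''); // must start and end alphanumeric
  return cleaned || 'agent'; // assumed fallback when nothing valid remains
}
```

Because `createEphemeralServer` appends the agent ID and a timestamp, the length limit only holds if the helper is applied to the fully composed name rather than to the `ASSISTANT_NAME` prefix alone.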
async function hetznerApi<T>(
  method: string,
  endpoint: string,
  body?: unknown,
): Promise<T> {
  if (!HETZNER_API_TOKEN) {
    throw new Error('HETZNER_API_TOKEN not set');
  }

  const url = `${HETZNER_API_URL}${endpoint}`;
  const options: RequestInit = {
    method,
    headers: {
      'Authorization': `Bearer ${HETZNER_API_TOKEN}`,
      'Content-Type': 'application/json',
    },
  };

  if (body) {
    options.body = JSON.stringify(body);
  }

  const resp = await fetch(url, options);

  // Handle empty responses (204 No Content, e.g. DELETE operations)
  let json: unknown = {};
  const contentLength = resp.headers.get('content-length');
  if (resp.status !== 204 && contentLength !== '0') {
    json = await resp.json();
  }
Add request timeouts to avoid hanging Hetzner API calls.
fetch without a timeout can stall indefinitely on network issues and block backend workflows.
🛠️ Suggested fix (AbortController timeout)
- const resp = await fetch(url, options);
+ const controller = new AbortController();
+ const timeoutId = setTimeout(() => controller.abort(), 30000);
+ let resp: Response;
+ try {
+ resp = await fetch(url, { ...options, signal: controller.signal });
+ } finally {
+ clearTimeout(timeoutId);
+ }
… validation
- Remove project-specific name from hetzner-api.ts header
- Add 30s AbortController timeout to all Hetzner API fetch calls
- Sanitize server names to RFC 1123 (strip invalid chars, truncate to 63)
- Validate all B2 credentials (not just endpoint) before creating S3 client
- Remove non-null assertions now that credentials are validated
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
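The B2 credential validation from this commit can be illustrated with a helper that reports which required variables are missing, so the warning names them before any S3 client is constructed. This is a hedged sketch: `missingB2Vars` and `REQUIRED_B2_VARS` are hypothetical names, and the variable list is taken from the review comment rather than from the merged code.

```typescript
// Variable names follow the B2_* settings referenced in this PR's review comments.
const REQUIRED_B2_VARS = [
  'B2_ENDPOINT',
  'B2_ACCESS_KEY_ID',
  'B2_SECRET_ACCESS_KEY',
  'B2_BUCKET',
] as const;

type B2Var = (typeof REQUIRED_B2_VARS)[number];

// Hypothetical helper: names of required variables that are unset or empty.
function missingB2Vars(env: Record<string, string | undefined>): B2Var[] {
  return REQUIRED_B2_VARS.filter((name) => !env[name]);
}
```

An `initialize()` method could log the returned names and return early when the list is non-empty, which matches the existing warning style for a missing `B2_ENDPOINT`.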
…omment
Database improvements from stability audit:
1. **Transaction Support for deleteTask (MEDIUM)**
- Wrap DELETE operations in explicit transaction
- Ensures both child and parent deletions succeed atomically
- Prevents partial deletion leaving orphaned task_run_logs
2. **SQL Injection Safety Documentation (HIGH)**
- Add security comment to updateTask explaining safety assumptions
- Document that field names are hardcoded (not user-controlled)
- Warn future maintainers about SQL injection risks if logic changes
Impact:
- Prevents database corruption from partial task deletions
- Documents security assumptions for future code reviewers
- Hardens codebase against accidental SQL injection introduction
Related:
- Audit report: nanoclaw-stability-audit-2026-02-14.md
- Issues #3, #12 from stability audit
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix: stability quick wins from 2026-02-14 audit
Implements three critical stability fixes identified in the audit:
1. **Unhandled Promise Rejection Handler (CRITICAL)**
- Add process.on('unhandledRejection') to prevent crashes
- Logs rejections instead of exiting to maintain service uptime
- Prevents complete service outage from uncaught promise errors
2. **WhatsApp Event Listener Memory Leak (CRITICAL)**
- Store event handlers and remove them before reconnection
- Prevents exponential handler accumulation on reconnects
- Fixes memory leak leading to eventual OOM crashes
3. **Group Folder Path Traversal (MEDIUM)**
- Validate folder names with regex (alphanumeric + _ -)
- Verify resolved paths stay within groups directory
- Prevents malicious group registration from writing to arbitrary paths
Impact:
- Prevents process crashes from unhandled rejections
- Fixes production memory leak in WhatsApp channel
- Hardens security against path traversal attacks
Related:
- Audit report: nanoclaw-stability-audit-2026-02-14.md
- Issues #1, #4, #16 from stability audit
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix: add transaction support to deleteTask and SQL injection safety comment
Database improvements from stability audit:
1. **Transaction Support for deleteTask (MEDIUM)**
- Wrap DELETE operations in explicit transaction
- Ensures both child and parent deletions succeed atomically
- Prevents partial deletion leaving orphaned task_run_logs
2. **SQL Injection Safety Documentation (HIGH)**
- Add security comment to updateTask explaining safety assumptions
- Document that field names are hardcoded (not user-controlled)
- Warn future maintainers about SQL injection risks if logic changes
Impact:
- Prevents database corruption from partial task deletions
- Documents security assumptions for future code reviewers
- Hardens codebase against accidental SQL injection introduction
Related:
- Audit report: nanoclaw-stability-audit-2026-02-14.md
- Issues #3, #12 from stability audit
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
---------
Co-authored-by: NanoClaw Agent <nanoclaw@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Summary
Add Hetzner Cloud backend to support ephemeral VM-based agent execution with S3-based I/O. This provides a cost-effective alternative to persistent cloud backends like Sprites/Railway by creating VMs on-demand and destroying them immediately after use.
New files
- `src/backends/hetzner-api.ts`
- `src/backends/hetzner-backend.ts`
Changes
- Add `'hetzner'` to `BackendType` union in `src/backends/types.ts`
- Register `HetznerBackend` in `src/backends/index.ts`
- Add Hetzner config vars to `src/config.ts`:
  - `HETZNER_API_TOKEN` - API token from Hetzner Cloud Console
  - `HETZNER_LOCATION` - Datacenter location (default: `ash` - Ashburn, US)
  - `HETZNER_SERVER_TYPE` - VM size (default: `cpx11` - 2 vCPU, 2GB RAM, €4.15/mo)
  - `HETZNER_IMAGE` - OS image (default: `ubuntu-22.04`)
Architecture
Ephemeral VM Lifecycle
S3-based I/O (same pattern as Railway)
SSH Key Management
Cloud-init Startup Script
Benefits
Cost-Effective
Simple Infrastructure
US Datacenter Support
Scalable
`HETZNER_SERVER_TYPE` env var
Usage
Set agent backend to 'hetzner' and configure required env vars:
Test plan
`bun run build` compiles cleanly
Migration Path
As discussed with @future Trees:
🤖 Generated with Claude Code