Import OpenClaw memory, history and settings #58

@ilblackdragon

Import OpenClaw Memory, History and Settings

Allow IronClaw to take over an existing OpenClaw deployment's data — memory documents, conversation history, and settings. Should integrate into the onboarding wizard as a detection + import step.

OpenClaw Data Layout

OpenClaw stores data at:

  • Config: `~/.openclaw/openclaw.json` (JSON5 format with comments)
  • Memory/RAG: `~/.openclaw/memory/{agentId}.sqlite` (SQLite + sqlite-vec + FTS5)
  • State: `~/.openclaw-state/` (runtime state)
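The layout above suggests a simple detection pass. The following is a minimal sketch, not IronClaw's actual code: `OpenClawInstall`, `agent_id_from_file_name`, and `detect_openclaw` are hypothetical names, and the caller is assumed to have already expanded `~` into an absolute root path.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// What a detection pass found under an OpenClaw root directory.
/// (Hypothetical type, introduced for illustration only.)
#[derive(Debug, PartialEq)]
pub struct OpenClawInstall {
    pub config_path: PathBuf,
    /// Agent IDs derived from `memory/{agentId}.sqlite` file names, sorted.
    pub agent_ids: Vec<String>,
}

/// Extracts the agent ID from a file name such as `main.sqlite`.
/// Returns `None` for other extensions or an empty ID.
pub fn agent_id_from_file_name(name: &str) -> Option<&str> {
    name.strip_suffix(".sqlite").filter(|id| !id.is_empty())
}

/// Returns `Ok(None)` when `root` contains no `openclaw.json`.
pub fn detect_openclaw(root: &Path) -> io::Result<Option<OpenClawInstall>> {
    let config_path = root.join("openclaw.json");
    if !config_path.is_file() {
        return Ok(None);
    }

    let mut agent_ids = Vec::new();
    let memory_dir = root.join("memory");
    if memory_dir.is_dir() {
        for entry in fs::read_dir(&memory_dir)? {
            let entry = entry?;
            let name = entry.file_name();
            // Non-UTF-8 names and non-.sqlite files are ignored.
            if let Some(id) = name.to_str().and_then(agent_id_from_file_name) {
                agent_ids.push(id.to_string());
            }
        }
    }
    agent_ids.sort();

    Ok(Some(OpenClawInstall { config_path, agent_ids }))
}
```

Detection here only reads the directory listing; it opens neither the config nor the databases, so it is cheap enough to run on every wizard start.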

OpenClaw's memory system uses a local-first RAG architecture:

  • Canonical source: Markdown files in workspace directories
  • Derived index: SQLite database with `vec0` virtual table (vector embeddings) + FTS5 (keyword search)
  • Hybrid search via vector + keyword scoring

IronClaw Target Schema

| OpenClaw | IronClaw | Mapping |
| --- | --- | --- |
| `openclaw.json` (JSON5) | `settings` table (dotted keys) | Parse JSON5, flatten to key-value pairs |
| Memory SQLite (`vec0` embeddings) | `memory_chunks` (pgvector / F32_BLOB) | Convert binary vec0 → float array |
| Memory SQLite (FTS5) | `memory_chunks` FTS index | Re-index via triggers on insert |
| Session tokens in JSON | `secrets` table (AES-256-GCM) | Encrypt on import |
| Workspace Markdown files | `memory_documents` table | Direct path + content insert |
| Agent identity files | Well-known workspace paths | Map AGENTS.md, IDENTITY.md, etc. |
| Conversation history | `conversations` + `conversation_messages` | Reconstruct from SQLite records |
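The first mapping row (nested config → dotted keys) can be sketched as follows. This is an illustration under stated assumptions: the `Value` enum is a hypothetical stand-in for whatever value type the chosen JSON5 parser produces, and the policy of keeping arrays as single leaf values is a guess, not a confirmed IronClaw convention.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a parsed JSON5 value (hypothetical; a real importer
/// would use the value type of whichever JSON5 parser it adopts).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Number(f64),
    Text(String),
    List(Vec<Value>),
    Object(Vec<(String, Value)>),
}

/// Flattens nested objects into dotted keys, e.g. `agent.model.temperature`.
/// Arrays and scalars are kept as leaf values (an assumed policy).
pub fn flatten(value: &Value) -> BTreeMap<String, Value> {
    let mut out = BTreeMap::new();
    flatten_into("", value, &mut out);
    out
}

fn flatten_into(prefix: &str, value: &Value, out: &mut BTreeMap<String, Value>) {
    match value {
        Value::Object(fields) => {
            for (key, child) in fields {
                let path = if prefix.is_empty() {
                    key.clone()
                } else {
                    format!("{}.{}", prefix, key)
                };
                flatten_into(&path, child, out);
            }
        }
        leaf => {
            // A non-object at the top level has no key, so it is dropped.
            if !prefix.is_empty() {
                out.insert(prefix.to_string(), leaf.clone());
            }
        }
    }
}
```

One caveat with this scheme: a source key that itself contains a `.` would collide with a nested path, so a real importer would need an escaping rule or a rejection policy for such keys.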

Design Considerations

  1. Detection in onboarding: Add a step to the setup wizard (`src/setup/wizard.rs`) that checks for `~/.openclaw/openclaw.json`. If found, offer to import. Do not auto-import; the user must opt in.

  2. JSON5 parsing: OpenClaw uses JSON5 (comments, trailing commas). Use the `json5` crate to parse, then map to IronClaw's dotted-key settings format.

  3. Embedding format conversion: OpenClaw uses sqlite-vec's `vec0` virtual table, which stores embeddings as binary blobs. IronClaw uses `pgvector` (PostgreSQL) or `F32_BLOB` (libSQL). The import needs to:

    • Read the binary embedding from `vec0`
    • Convert it to `Vec<f32>`
    • Insert it in the target format
    • If the embedding dimensions differ (OpenClaw may use a different model), flag the chunk for re-embedding

  4. Re-embedding option: If the user switches embedding providers (e.g., OpenClaw used OpenAI `text-embedding-ada-002`, IronClaw uses `text-embedding-3-small`), offer to re-embed all documents. Use the existing backfill mechanism (runs on startup when the embeddings provider is enabled).

  5. Both backends: Import must work into both PostgreSQL and libSQL targets. Use the `Database` trait methods, not raw SQL.

  6. Incremental import: Support re-running import without duplicating data. Use document paths as natural deduplication keys (`memory_documents.path` is unique per user).

  7. Credential safety: OpenClaw may store API keys in plaintext JSON. During import, encrypt them into IronClaw's `secrets` table. Never log credential values.

  8. Multi-agent: OpenClaw uses per-agent SQLite files (`{agentId}.sqlite`). IronClaw uses a single database with an `agent_id` column. Map each OpenClaw agent file to a distinct `agent_id` in IronClaw.
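The embedding conversion in item 3 can be sketched as below. It assumes the blob read from `vec0` is a packed array of little-endian 32-bit floats; that should be verified against the sqlite-vec version OpenClaw ships. `EmbeddingImport`, `DecodeError`, `decode_f32_le`, and `classify` are hypothetical names.

```rust
/// Outcome of inspecting one stored embedding (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum EmbeddingImport {
    /// Dimensions match the target; the vector can be inserted as-is.
    Ready(Vec<f32>),
    /// Dimensions differ; the chunk should be queued for re-embedding.
    NeedsReembedding { found_dims: usize, expected_dims: usize },
}

#[derive(Debug, PartialEq)]
pub enum DecodeError {
    /// Blob length is not a multiple of 4 bytes.
    TruncatedBlob { len: usize },
}

/// Decodes a packed little-endian f32 blob (assumed storage format).
pub fn decode_f32_le(blob: &[u8]) -> Result<Vec<f32>, DecodeError> {
    if blob.len() % 4 != 0 {
        return Err(DecodeError::TruncatedBlob { len: blob.len() });
    }
    Ok(blob
        .chunks_exact(4)
        .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
        .collect())
}

/// Decodes the blob and compares its dimension count to the target's.
pub fn classify(blob: &[u8], expected_dims: usize) -> Result<EmbeddingImport, DecodeError> {
    let vector = decode_f32_le(blob)?;
    if vector.len() == expected_dims {
        Ok(EmbeddingImport::Ready(vector))
    } else {
        Ok(EmbeddingImport::NeedsReembedding {
            found_dims: vector.len(),
            expected_dims,
        })
    }
}
```

A dimension check alone does not catch every model change: `text-embedding-ada-002` and `text-embedding-3-small` both produce 1536-dimensional vectors by default, so the importer would likely also need to compare the model name recorded in the source config before reusing vectors.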

Requirements

  • CLI command: `ironclaw import openclaw [--path ~/.openclaw] [--dry-run]`
  • Onboarding wizard step: detect OpenClaw installation, prompt for import
  • Parse `openclaw.json` (JSON5) → IronClaw settings
  • Import memory documents (Markdown files → `memory_documents` table)
  • Import memory chunks with embeddings (sqlite-vec → pgvector/F32_BLOB)
  • Import conversation history → `conversations` + `conversation_messages`
  • Encrypt and import credentials → `secrets` table
  • Import identity files (AGENTS.md, IDENTITY.md, SOUL.md, USER.md, etc.)
  • Handle embedding dimension mismatch (flag for re-embedding)
  • Dry-run mode showing what would be imported without writing
  • Progress reporting (X documents, Y conversations, Z secrets imported)
  • Idempotent: re-running import skips already-imported data
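The dry-run, progress-reporting, and idempotency requirements interact, so a small sketch may help. It uses an in-memory map as a stand-in for the `memory_documents` table; `DocumentStore`, `Document`, `ImportReport`, and `import_documents` are hypothetical and do not reflect the real `Database` trait.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Counts reported to the user after (or instead of) an import.
#[derive(Debug, Default, PartialEq)]
pub struct ImportReport {
    pub imported: usize,
    pub skipped: usize,
}

pub struct Document {
    pub path: String,
    pub content: String,
}

/// In-memory stand-in for the `memory_documents` table, keyed by path.
#[derive(Default)]
pub struct DocumentStore {
    by_path: BTreeMap<String, String>,
}

impl DocumentStore {
    pub fn len(&self) -> usize {
        self.by_path.len()
    }
}

/// Imports documents, skipping any path that already exists.
/// With `dry_run` set, nothing is written but the counts are the same
/// as a real run would produce.
pub fn import_documents(
    store: &mut DocumentStore,
    docs: &[Document],
    dry_run: bool,
) -> ImportReport {
    let mut report = ImportReport::default();
    // Tracks paths seen in this run so duplicates within `docs` are
    // counted identically in dry-run and real mode.
    let mut seen: BTreeSet<&str> = BTreeSet::new();
    for doc in docs {
        let already_present =
            store.by_path.contains_key(&doc.path) || !seen.insert(doc.path.as_str());
        if already_present {
            report.skipped += 1;
        } else {
            report.imported += 1;
            if !dry_run {
                store.by_path.insert(doc.path.clone(), doc.content.clone());
            }
        }
    }
    report
}
```

Running the same function for both modes, with only the write guarded by `dry_run`, is one way to keep the dry-run report from drifting away from what the real import does.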

Success Criteria

  • User with an existing OpenClaw deployment can run `ironclaw import openclaw` and have all memory, settings, and conversations available in IronClaw
  • Hybrid search (FTS + vector) works on imported documents without manual re-indexing
  • Identity files appear at correct well-known paths and are injected into LLM system prompt
  • No plaintext credentials in logs or intermediate state during import
  • Import works into both PostgreSQL and libSQL backends
  • Dry-run accurately reports what would be imported
  • Import of a ~1000-document OpenClaw workspace completes in under 60 seconds (excluding re-embedding)
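One way to approach the no-plaintext-in-logs criterion is a wrapper type whose `Debug` and `Display` output is always redacted, so an accidental `{:?}` in a log line cannot leak a key. This is a sketch under assumptions: `Secret` and `ImportedCredential` are hypothetical types, not part of IronClaw, and the wrapper does not address zeroing memory or intermediate files.

```rust
use std::fmt;

/// Holds a credential read from OpenClaw's config until it is encrypted.
/// (Hypothetical wrapper, for illustration.)
pub struct Secret(String);

impl Secret {
    pub fn new(value: impl Into<String>) -> Self {
        Secret(value.into())
    }

    /// Explicit accessor for the one place that needs the plaintext,
    /// i.e. the encryption step before writing to the `secrets` table.
    pub fn expose(&self) -> &str {
        &self.0
    }
}

impl fmt::Debug for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Secret(<redacted>)")
    }
}

impl fmt::Display for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("<redacted>")
    }
}

/// A credential discovered during import; safe to pass to logging macros
/// because the derived `Debug` delegates to `Secret`'s redacted output.
#[derive(Debug)]
pub struct ImportedCredential {
    pub name: String,
    pub value: Secret,
}
```

Because the plaintext is only reachable through `expose`, a code review can find every use of the raw value by searching for that one method name.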
