Description
Motivation
The exo command starts a node — it's a long-running daemon. There's currently no CLI tool for managing a running cluster: checking status, loading/unloading models, monitoring downloads, etc.
This is the same pattern as kubectl (manages k8s) vs kubelet (runs a node), or obsidian-cli vs the Obsidian desktop app. The daemon and the management tool are fundamentally different entrypoints.
Proposal
Add exo-cli as a separate entrypoint that talks to a running exo cluster over HTTP:
exo-cli status # Cluster overview (nodes, models, memory)
exo-cli health # Quick liveness check
exo-cli nodes # List all nodes
exo-cli nodes <id> # Single node detail
exo-cli models # Loaded models + downloads
exo-cli models status <name> # Poll model readiness
exo-cli models load <name> # Load model (auto-placement)
exo-cli models load --wait <name> # Load + block until ready
exo-cli models unload <name> # Unload by name
exo-cli models swap <old> <new> # Atomic unload-then-load
exo-cli models swap --wait <old> <new> # Swap + block until new model ready
Key features
- --wait flag: blocks until async operations complete (model loaded, swap finished). Eliminates polling loops in scripts.
- --json flag: machine-readable output for piping into jq or other tools.
- --host / --port: connect to any node in the cluster (defaults to localhost:52415).
- Human-friendly table output by default, JSON when scripting.
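The blocking behaviour of --wait could be sketched roughly as below. This is only an illustration: the helper name `wait_until_ready`, the status strings "ready" and "failed", and the timeout and interval defaults are assumptions, not part of the proposal or of #1727.

```python
import time
from typing import Callable


def wait_until_ready(
    get_status: Callable[[], str],
    timeout: float = 600.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_status() until it reports "ready".

    Returns True once the model is ready and False if the timeout expires.
    The status strings are hypothetical; the real API may use other values.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == "ready":
            return True
        if status == "failed":
            # Stop early instead of polling a load that can never succeed.
            raise RuntimeError("model failed to load")
        if clock() >= deadline:
            return False
        sleep(interval)
```

In a real client, `get_status` would wrap the HTTP call behind `exo-cli models status <name>`. Passing it in as a callable keeps the polling loop testable without a running cluster.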
Example: day/night model rotation cron
#!/bin/bash
# 11pm: swap to large model for overnight batch work
exo-cli models swap --wait "Qwen3-30B-A3B-4bit" "mlx-community/MiniMax-M1-80B-A45B-4bit"
# Run batch inference...
curl -X POST http://localhost:52415/v1/chat/completions -d '{...}'
# 6am: swap back to fast model
exo-cli models swap --wait "MiniMax-M1-80B-A45B-4bit" "mlx-community/Qwen3-30B-A3B-4bit"
Implementation
The CLI would be a thin HTTP client against the /v1/cluster/* endpoints proposed in #1727. Separate entrypoint in pyproject.toml:
[project.scripts]
exo = "exo.main:main"
exo-cli = "exo.cli.main:main"
No new dependencies beyond what exo already has (httpx or urllib for HTTP, argparse for CLI parsing).
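To give a feel for how thin the client could be, here is a minimal sketch using only argparse and urllib from the standard library. It covers just the read-only status, health, and nodes commands. The URL layout (`/v1/cluster/<command>` with an optional node id appended) is an assumption made for illustration; the actual paths are whatever #1727 defines. The function names `build_parser` and `endpoint_url` are likewise hypothetical.

```python
import argparse
import json
import urllib.request


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="exo-cli")
    parser.add_argument("--host", default="localhost")
    parser.add_argument("--port", type=int, default=52415)
    parser.add_argument("--json", action="store_true",
                        help="machine-readable output")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("status", help="cluster overview")
    sub.add_parser("health", help="quick liveness check")
    nodes = sub.add_parser("nodes", help="list nodes or show one node")
    nodes.add_argument("node_id", nargs="?")
    return parser


def endpoint_url(args: argparse.Namespace) -> str:
    # Assumed path scheme; only the /v1/cluster/ prefix comes from the proposal.
    path = f"/v1/cluster/{args.command}"
    if getattr(args, "node_id", None):
        path += f"/{args.node_id}"
    return f"http://{args.host}:{args.port}{path}"


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    with urllib.request.urlopen(endpoint_url(args), timeout=10) as resp:
        payload = json.load(resp)
    # Compact JSON for scripts; indented JSON stands in for table output here.
    print(json.dumps(payload, indent=None if args.json else 2))
    return 0
```

In this sketch the global flags go before the subcommand, e.g. `exo-cli --port 52415 nodes <id>`. Accepting them after the subcommand as well would need the flags repeated on each subparser.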
Relationship to #1727
This depends on the cluster management API endpoints in #1727. The CLI is purely a client — it doesn't touch any server-side code. The two PRs can be reviewed independently.