CLI utilities for inspecting and plotting Trackio training data.
Supports both parquet and SQLite (.db) files, including files hosted on HuggingFace, Modal volumes, or SSH servers.
No install needed except uv. The script is a uv inline script and manages its own dependencies.
Print a summary of all runs in a project: row counts, step ranges, timestamps, tracked metrics, and media file counts.
# Local parquet file
./trackio-tool.py analyze my-project.parquet
# Local SQLite file
./trackio-tool.py analyze my-project.db
# HuggingFace dataset
./trackio-tool.py analyze hf://my-org/my-dataset/my-project.parquet
# Modal volume
uv run --with modal ./trackio-tool.py analyze modal://my-volume/trackio/my-project.db
# Modal volume with explicit environment
uv run --with modal ./trackio-tool.py analyze modal://my-volume@dev/trackio/my-project.parquet
# SSH server (absolute path)
uv run --with paramiko ./trackio-tool.py analyze ssh://my-server/data/trackio/my-project.db
# SSH server (path relative to home directory)
uv run --with paramiko ./trackio-tool.py analyze ssh://user@my-server/~/trackio/my-project.dbExample output:
Project: my-project
Total rows: 673
Metrics: loss/mymodel, eval/mymodel_psnr, eval/mymodel_lpips
dainty-meadow-878c
Rows: 587 Steps: 1 – 33187
First: 2026-02-05 12:46 Last: 2026-02-05 16:01
Columns: loss/mymodel, eval/mymodel_psnr, eval/mymodel_lpips
Media: 12 file(s)
Use --no-media to skip counting media files (useful for remote sources where listing is slow).
Plot numeric metrics for one or more data files. Produces a stacked column of charts, one per metric.
# Single file, all runs
./trackio-tool.py plot my-project.db
# Filter specific runs
./trackio-tool.py plot my-project.db --run dainty-meadow-878c,autumn-willow-d122
# Custom output path (default: plot.webp)
./trackio-tool.py plot my-project.db -o training.pngPass multiple data files as a comma-separated list to plot runs from different projects together:
./trackio-tool.py plot project-a.db,project-b.dbWhen run names are unique across projects, they appear as-is in the legend. When the same run name exists in multiple projects, it is automatically qualified as project:run-name.
You can filter runs with --run using either syntax:
# Bare run name (must be unambiguous across all loaded projects)
./trackio-tool.py plot a.db,b.db --run calm-river-a3f2
# Qualified name (required when a run name exists in multiple projects)
./trackio-tool.py plot a.db,b.db --run a:calm-river-a3f2,b:bright-dawn-b1c7If a bare run name is ambiguous, the tool errors with a message listing the matching projects.
Combine runs from one project file into an existing .db file. Supports .db-to-.db, .parquet-to-.db, HuggingFace, and Modal sources.
# Merge from another local DB
./trackio-tool.py merge --from project-b.db --into project-a.db
# Merge from a parquet file (imports companion _system.parquet and _configs.parquet too)
./trackio-tool.py merge --from project-b.parquet --into project-a.db
# Merge from a HuggingFace dataset
./trackio-tool.py merge --from hf://my-org/my-dataset/project-b.parquet --into project-a.db
# Merge from a Modal volume
uv run --with modal ./trackio-tool.py merge --from modal://my-volume/trackio/project-b.db --into project-a.db
# Merge from an SSH server
uv run --with paramiko ./trackio-tool.py merge --from ssh://my-server/data/trackio/project-b.db --into project-a.db
# Import only specific runs
./trackio-tool.py merge --from project-b.db --into project-a.db --run calm-river-a3f2,bright-dawn-b1c7
# Skip media file copying
./trackio-tool.py merge --no-media --from project-b.db --into project-a.dbThe command:
- Errors if any run names overlap between source and target.
- Copies all three tables (
metrics,configs,system_metrics), creating missing tables as needed. - Use
--runto import only specific runs (comma-separated). By default all runs are imported. - By default (
--media) copies media directories frommedia/<project>/<run>/next to the source into the corresponding location next to the target. Use--no-mediato skip this. The--runfilter applies to media as well. - For HF sources, downloads companion parquet files and media from the same dataset repo.
- For Modal sources, downloads files from the volume. Requires
modal(uv run --with modal). - For SSH sources, downloads files via SFTP. Requires
paramiko(uv run --with paramiko).
To store Trackio data on a Modal volume, set the TRACKIO_DIR environment variable in your Modal app to a path on the volume (e.g. /vol/trackio). Trackio will write its data files there, and you can then access them with:
uv run --with modal ./trackio-tool.py analyze modal://my-volume/trackio/my-project.dbAccess Trackio data on any SSH-accessible machine using ssh://[user@]host/path URLs. The path after the host is absolute on the remote machine. Use /~/ to specify a path relative to the user's home directory. Authentication uses your SSH agent and default keys (same as regular ssh usage).
# Absolute path
uv run --with paramiko ./trackio-tool.py analyze ssh://my-server/data/trackio/my-project.db
# Relative to home directory, e.g. default trackio location
uv run --with paramiko ./trackio-tool.py analyze ssh://user@my-server/~/.cache/huggingface/trackio/my-project.dbRemove specific runs from a local .db file.
# Drop a single run
./trackio-tool.py drop my-project.db --run calm-river-a3f2
# Drop multiple runs
./trackio-tool.py drop my-project.db --run calm-river-a3f2,bright-dawn-b1c7
# Drop runs but keep their media directories
./trackio-tool.py drop my-project.db --run calm-river-a3f2 --no-mediaThe command deletes matching rows from all three tables (metrics, configs, system_metrics) and by default (--media) also removes the corresponding media/<project>/<run>/ directories and reports the number of files deleted. Use --no-media to keep media on disk.
Coding agents are used to assist with the development of this tool (in other words it is mostly vibe-coded).
Linting, formatting and type checking:
uv sync --all-extras
uv run ruff check
uv run ruff format
uv run pyright