
metaflow-serverless


Serverless Metaflow metadata service — free-tier Postgres, zero setup.

The problem

Running Metaflow beyond your laptop means standing up a metadata service, a database, and object storage — typically on AWS, with always-on costs and hours of infra setup. For indie developers and small teams, that overhead kills momentum before the first experiment runs.

Existing alternatives either require a paid Outerbounds subscription or leave you managing servers that cost money even when idle.

Quick start

pip install metaflow-serverless
mf-setup
python flow.py run

mf-setup provisions everything on free-tier serverless infrastructure and writes Metaflow config to ~/.metaflowconfig/config.json. Authenticate first with supabase login.

Install

pip install metaflow-serverless

Requires Python 3.10+. Metaflow must be installed separately:

pip install metaflow

Usage

Provision a new service

mf-setup

Walks you through choosing a provider stack, installs any needed CLIs, and provisions the database, storage, and compute layer. Writes credentials to ~/.metaflowconfig/config.json.
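For orientation, the written config maps Metaflow's standard settings to the provisioned endpoints. The exact contents depend on the chosen stack; the fragment below is an illustrative sketch using Metaflow's standard configuration keys with placeholder values, not the literal output of mf-setup:

```json
{
  "METAFLOW_DEFAULT_METADATA": "service",
  "METAFLOW_SERVICE_URL": "https://<project-ref>.supabase.co/functions/v1/metadata-router",
  "METAFLOW_DEFAULT_DATASTORE": "s3",
  "METAFLOW_DATASTORE_SYSROOT_S3": "s3://<bucket>/metaflow",
  "METAFLOW_S3_ENDPOINT_URL": "https://<project-ref>.supabase.co/storage/v1/s3"
}
```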

Supabase auth and config (important)

Authenticate the Supabase CLI before running setup:

supabase login
supabase projects list --output json

Then run:

mf-setup

The wizard writes Metaflow global config to:

~/.metaflowconfig/config.json

Supabase quota guardrails

For compute=supabase, the deployed metadata-router edge function enforces a monthly quota budget before forwarding requests. By default it uses:

  • MF_MONTHLY_REQUEST_LIMIT=500000
  • MF_MONTHLY_EGRESS_LIMIT_BYTES=5368709120 (5 GiB)

These defaults are intended to keep a single project within the current Supabase Free plan quotas for Edge Function invocations and uncached egress. Override them at deploy time by exporting:

export MF_MONTHLY_REQUEST_LIMIT=250000
export MF_MONTHLY_EGRESS_LIMIT_BYTES=$((2 * 1024 * 1024 * 1024))
export MF_QUOTA_SCOPE=my-project
mf-setup

When the quota is exhausted, the edge function returns HTTP 429 instead of continuing to forward requests.

Note: Supabase applies usage quotas at the organization level. This guardrail is enforced per deployed metadata service, so it is a conservative approximation rather than a replacement for checking the Supabase usage page.
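The guardrail itself runs in the deployed TypeScript edge function; the Python sketch below only mirrors its decision logic to make the defaults concrete. The QuotaState shape and function name are illustrative, not part of the package:

```python
import os
from dataclasses import dataclass


@dataclass
class QuotaState:
    """Illustrative per-scope usage counters for the current month."""
    requests: int
    egress_bytes: int


def within_quota(state: QuotaState) -> bool:
    """Mirror of the edge function's check: deny (HTTP 429) once either budget is spent."""
    request_limit = int(os.environ.get("MF_MONTHLY_REQUEST_LIMIT", "500000"))
    egress_limit = int(os.environ.get("MF_MONTHLY_EGRESS_LIMIT_BYTES", str(5 * 1024**3)))
    return state.requests < request_limit and state.egress_bytes < egress_limit
```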

Supabase CLI v2 notes

  • The setup flow is compatible with recent Supabase CLI versions where:
    • supabase db url --project-ref ... is removed.
    • supabase storage create ... --project-ref ... is removed.
  • Migrations are applied directly via asyncpg and trigger a PostgREST schema reload.
  • Edge function deploy uses a temporary supabase/functions/<name>/ staging layout required by current CLI packaging.

Supabase S3 credentials

mf-setup automatically provisions HMAC S3 credentials for your project and registers them in the database. No manual key management is needed. Credentials are written to ~/.metaflowconfig/config.json and used for all artifact reads/writes.

Run a flow

# flow.py
from metaflow import FlowSpec, step

class MyFlow(FlowSpec):
    @step
    def start(self):
        self.data = [1, 2, 3]
        self.next(self.end)

    @step
    def end(self):
        print(self.data)

if __name__ == "__main__":
    MyFlow()

Then run it:

python flow.py run

Metadata and artifacts are recorded in your provisioned service automatically.

Open the UI

mf-ui

Starts a local proxy on localhost:8083 that downloads and serves the Metaflow UI, backed by your remote service. It supports the flows list, run detail, DAG and timeline views, task detail (attempt history, duration, status), and log streaming.
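When scripting around the proxy, a quick liveness probe can be useful. ui_proxy_is_up is a hypothetical helper written for this sketch, not part of the package; only the default port 8083 comes from the tool:

```python
import socket


def ui_proxy_is_up(port: int = 8083) -> bool:
    """Return True if something is listening on the mf-ui proxy port."""
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=1):
            return True
    except OSError:
        return False
```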

How it works

metaflow-serverless replaces the Metaflow metadata service with a serverless stack — no persistent server required:

Metaflow client
  → Supabase Edge Function (URL router, ~40 lines of TypeScript)
  → PostgREST (managed by Supabase)
  → PL/pgSQL stored procedures (heartbeat, tag mutation, artifact queries)
  → Postgres tables

All business logic lives in the database as stored procedures. The compute layer scales to zero and wakes in milliseconds. Artifacts are stored in S3-compatible object storage.

Compute mode details

  • With compute=supabase, mf-setup deploys a Supabase Edge Function named metadata-router (not the netflixoss/metaflow-metadata-service container image).
  • The Edge Function handles HTTP routing and request shaping; metadata behavior lives in SQL procedures in Postgres.
  • With compute=cloud-run or compute=render, mf-setup deploys the netflixoss/metaflow-metadata-service container image.

Comparison: metaflow-serverless vs metaflow-local-service

| Dimension | metaflow-serverless | metaflow-local-service |
|---|---|---|
| Backend | Remote service (Supabase / Cloud Run / Render + Postgres) | Local daemon on 127.0.0.1 backed by .metaflow/ |
| Cross-machine visibility | Yes | No (single machine by default) |
| Heartbeat tracking | Persisted in remote DB | In-memory daemon liveness only |
| Best use case | Shared, prod-like metadata backend | Fast local iteration and CI |
| Remote Daytona/E2B observability | Recommended for end-to-end validation | Not equivalent to a shared remote backend |

Provider stacks

| Stack | Accounts needed | Cold start | Storage |
|---|---|---|---|
| Supabase (default) | 1 (email only, no CC) | ~0 ms | 1 GB free |
| Neon + Cloudflare R2 | 2 (CC for R2) | ~1-4 s | 10 GB free |
| CockroachDB + Backblaze B2 | 2 (phone for B2) | ~1-4 s | 10 GB free |

Development

git clone https://github.com/npow/metaflow-serverless
cd metaflow-serverless
pip install -e ".[dev]"
pre-commit install

Run tests with coverage:

pytest --cov --cov-report=term-missing

Lint and format:

ruff check .
ruff format .

Type-check:

mypy src/

License

Apache 2.0
