source-shopify-native: support multiple Shopify stores#3884

Open
JustinASmith wants to merge 1 commit into main from js/source-shopify-native_multi-store-support

@JustinASmith JustinASmith commented Feb 9, 2026

Description:

Summary

Adds multi-store support to source-shopify-native, allowing a single capture to replicate data from multiple Shopify stores. Each store authenticates independently via Access Token and gets its own rate limiter and token source, while sharing the parent HTTP session for connection pooling.

Key changes

  • Multi-store config: New StoreConfig model wrapping the per-store store name and credentials. Legacy single-store configs ({"store": "...", "credentials": {...}}) are auto-migrated to the new stores list format via a Pydantic mode="wrap" validator, with the original config persisted on next open via a config_update event.
  • Composite collection keys: All collections now use ["/_meta/store", "/id"] to ensure cross-store document uniqueness. Discover always proposes composite keys for forward compatibility. Legacy captures may have mixed keys — old bindings retain ["/id"] while newly discovered bindings get ["/_meta/store", "/id"] — reducing the number of bindings requiring backfill if a store is added later.
  • Backfill validation: Adding multiple stores to an existing capture requires backfill acknowledgment for legacy ["/id"] bindings so the key structure can change to include the store identifier.
  • Dict-based state: Resource state is always keyed by store ID ({"inc": {"store-id": {"cursor": "..."}}}). Legacy flat state is automatically migrated during open and checkpointed.
  • Per-store isolation: StoreHTTP wrapper shares the parent aiohttp.ClientSession for connection pooling while providing independent RateLimiter and TokenSource per store. Store initialization and credential validation run in parallel via asyncio.gather.
  • Store context injection: StoreValidationContext dataclass is passed through Pydantic's validation context to inject the _meta.store field on every document. A mode="before" model validator combined with validate_assignment=True ensures the field survives model_dump(exclude_unset=True).
  • OAuth temporarily removed: Will revisit once the OAuth app is approved and UI/runtime supports OAuth inside array items with per-store credentials.
  • Circular import fix: Extracted dt_to_str/str_to_dt into utils.py and removed the eager bulk_job_manager import from graphql/__init__.py to break the cycle: models.py -> graphql/__init__.py -> bulk_job_manager -> models.py.
  • StoreInitError: fail-fast behavior documented with TODO for graceful degradation.
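
As a rough illustration of the legacy config migration described above, here is a minimal sketch of the dict-level transformation only. The PR performs this inside a Pydantic mode="wrap" validator; this sketch deliberately avoids Pydantic. The function names `migrate_legacy_config` and `reject_duplicate_stores`, the per-store field names, and the `access_token` credential key are assumptions for illustration, not the connector's actual code.

```python
from typing import Any


def migrate_legacy_config(raw: dict[str, Any]) -> dict[str, Any]:
    """Return a config in the multi-store shape (hypothetical helper)."""
    if "stores" in raw:
        # Already in the new format: return it untouched.
        return raw
    if "store" in raw and "credentials" in raw:
        # Legacy single-store shape: wrap it in a one-element stores list
        # and carry every other top-level field over unchanged.
        rest = {k: v for k, v in raw.items() if k not in ("store", "credentials")}
        return {
            **rest,
            "stores": [{"store": raw["store"], "credentials": raw["credentials"]}],
        }
    raise ValueError("config has neither 'stores' nor legacy 'store'/'credentials'")


def reject_duplicate_stores(config: dict[str, Any]) -> None:
    """Raise if the same store name appears more than once (hypothetical helper)."""
    names = [s["store"] for s in config["stores"]]
    dupes = sorted({n for n in names if names.count(n) > 1})
    if dupes:
        raise ValueError(f"duplicate store names: {dupes}")


# Usage: a legacy config becomes a one-store list; a second call is a no-op.
legacy = {"store": "shop-a", "credentials": {"access_token": "placeholder"}}
migrated = migrate_legacy_config(legacy)
reject_duplicate_stores(migrated)
```

Returning the input unchanged when `stores` is already present keeps the migration idempotent, which matters if it runs on every open.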

Other improvements

  • BulkJobManager: public check_connectivity() method (replacing private _get_running_jobs() usage for validation), cancel timeout using time.monotonic(), store-prefixed log messages, typo fix.
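
For context on the cancel timeout, the following is a minimal sketch of a polling loop bounded by a time.monotonic() deadline. The function name `wait_until_cancelled`, its parameters, and the polling callback are hypothetical; the actual BulkJobManager logic is not shown in this PR description.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def wait_until_cancelled(
    is_cancelled: Callable[[], Awaitable[bool]],
    timeout_s: float,
    poll_interval_s: float = 0.01,
) -> bool:
    """Poll until is_cancelled() returns True or timeout_s elapses."""
    # time.monotonic() never goes backwards, so the deadline is unaffected
    # by wall-clock adjustments (for example NTP corrections), unlike time.time().
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if await is_cancelled():
            return True
        await asyncio.sleep(poll_interval_s)
    # The deadline passed without the job reporting cancellation.
    return False
```

The caller decides what a `False` result means (retry, raise, or log and continue); the sketch only reports whether cancellation was observed in time.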

Closes #3508

Workflow steps:

(How does one use this feature, and how has it changed)

Documentation links affected:

(list any documentation links that you created, or existing ones that you've identified as needing updates, along with a brief description)

TODO: Needs updated documentation PR.

Notes for reviewers:

Test plan

  • poetry run pytest tests/test_state_migration.py — 41 tests covering:
    • State migration (flat-to-dict, backfill, idempotency, multi-store addition, removed store preservation, error cases)
    • Binding key validation (multi-store transition, backfill acknowledgment, new binding not flagged)
    • _should_use_store_in_key (all branches including key regression prevention)
    • EndpointConfig migration (legacy format, multi-store, duplicate rejection)
    • StoreValidationContext (serialization, exclude_unset, JSON parsing, existing meta preserved)
  • poetry run pytest tests/test_snapshots.py --insta=update — spec, discover, capture snapshots pass
  • Manual test with multi-store config against live Shopify stores
  • Verify legacy single-store capture continues working without backfill
  • Verify adding a second store triggers backfill validation error

@JustinASmith JustinASmith force-pushed the js/source-shopify-native_multi-store-support branch 2 times, most recently from 2017075 to 75f2634 on February 10, 2026 14:44
@JustinASmith JustinASmith marked this pull request as ready for review February 10, 2026 14:53

@nicolaslazo nicolaslazo left a comment

I haven't had to deal with config migrations yet so my experience is limited here, but the logic looks solid 👍

Adds multi-store support allowing a single capture to replicate data
from multiple Shopify stores. Each store uses Access Token
authentication with independent rate limiting and HTTP sessions.

Config & models:
- StoreConfig model with per-store credentials (Access Token only)
- Legacy single-store configs auto-migrated via wrap model_validator
- Duplicate store name validation
- OAuth temporarily removed; will revisit when Flow/UI supports
  per-store OAuth (see git history for previous OAUTH2_SPEC)

Document keys & state:
- Collection keys use ["/_meta/store", "/id"] for cross-store uniqueness
- Discover always proposes composite keys for forward compatibility;
  legacy captures may have mixed keys (old bindings retain ["/id"],
  new bindings get composite) to reduce backfill scope on store addition
- Validation requires backfill when transitioning to multi-store keys
- Dict-based state format keyed by store ID with automatic migration
  from legacy flat format during open
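
A minimal sketch of the flat-to-dict state migration, assuming the legacy flat shape is {"inc": {"cursor": "..."}} and that a top-level "cursor" key is enough to tell the two shapes apart. Both are assumptions for illustration; the helper name `migrate_resource_state` is hypothetical and the connector's real state models are not reproduced here.

```python
from typing import Any


def migrate_resource_state(state: dict[str, Any], store_id: str) -> dict[str, Any]:
    """Return state keyed by store ID, wrapping any legacy flat sub-state."""
    migrated: dict[str, Any] = {}
    for kind, sub in state.items():  # e.g. "inc"
        if isinstance(sub, dict) and "cursor" in sub:
            # Legacy flat shape: the cursor sits directly under the kind,
            # so nest it under the single store this capture used to have.
            migrated[kind] = {store_id: sub}
        else:
            # Already keyed by store ID (or empty): keep as-is.
            migrated[kind] = sub
    return migrated


legacy = {"inc": {"cursor": "2026-01-01T00:00:00Z"}}
once = migrate_resource_state(legacy, "shop-a")
# once == {"inc": {"shop-a": {"cursor": "2026-01-01T00:00:00Z"}}}
twice = migrate_resource_state(once, "shop-a")
# twice == once, so running the migration again changes nothing
```

The heuristic would misfire if a store were ever named "cursor"; a real implementation would more likely discriminate on the parsed state model type.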

Store context:
- StoreValidationContext dataclass injected via Pydantic validation
  context; mode="before" validator + validate_assignment=True ensures
  store field survives model_dump(exclude_unset=True)
- Per-store HTTP wrapper (StoreHTTP) shares parent session for
  connection pooling with independent rate limiters
- Parallel store initialization and credential validation via
  asyncio.gather
- Fail-fast on any store init failure (TODO: graceful degradation)
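
A minimal sketch of parallel, fail-fast store initialization with asyncio.gather. `StoreClient` is a hypothetical stand-in for the PR's StoreHTTP wrapper, the semaphore stands in for a per-store rate limiter, and the `StoreInitError` class below only borrows the name from the PR; none of this is the connector's actual code.

```python
import asyncio
from dataclasses import dataclass, field


class StoreInitError(Exception):
    """Stand-in for the PR's StoreInitError (semantics assumed)."""


@dataclass
class StoreClient:
    store: str
    token: str
    # One semaphore per store stands in for an independent rate limiter.
    limiter: asyncio.Semaphore = field(default_factory=lambda: asyncio.Semaphore(2))

    async def validate(self) -> str:
        async with self.limiter:
            await asyncio.sleep(0)  # placeholder for a real credential check
            if not self.token:
                raise StoreInitError(f"{self.store}: missing access token")
            return self.store


async def init_stores(clients: list[StoreClient]) -> list[str]:
    # Without return_exceptions=True, gather() propagates the first
    # exception to the caller, which gives fail-fast behavior.
    return await asyncio.gather(*(c.validate() for c in clients))


async def main() -> list[str]:
    clients = [StoreClient("shop-a", "token-a"), StoreClient("shop-b", "token-b")]
    return await init_stores(clients)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Graceful degradation (the TODO above) could be approached by passing return_exceptions=True and filtering out failed stores, at the cost of having to decide how to report partial failures.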

Infrastructure:
- Extracted dt_to_str/str_to_dt into utils.py to break circular import
  chain: models.py -> graphql/__init__.py -> bulk_job_manager -> models
- Removed eager bulk_job_manager import from graphql/__init__.py;
  BulkJobManager now imported directly where needed
- BulkJobManager: public check_connectivity() method, cancel timeout
  using time.monotonic(), store-prefixed log messages
- GraphQL client accepts typed StoreValidationContext for document
  parsing

Tests:
- State migration: flat-to-dict, idempotency, multi-store addition,
  removed store preservation, error cases
- Binding key validation: multi-store transition, backfill
  acknowledgment, new binding handling
- _should_use_store_in_key: all branches including key regression
- EndpointConfig migration: legacy format, multi-store, duplicate
  rejection
- StoreValidationContext: serialization, exclude_unset, JSON parsing

Closes: #3508
@JustinASmith JustinASmith force-pushed the js/source-shopify-native_multi-store-support branch from 6bc377f to 88c5222 on February 24, 2026 15:49
Development

Successfully merging this pull request may close these issues.

source-shopify-native: support multiple shopify stores for one capture