Architecture: Refactor to Manager-Worker model for Multi-Identity & Scalability #86

@josephaw1022

Description

Currently, the KubeSQLServer-Operator uses a single controller that handles the reconciliation logic for all SQLServer and ExternalSQLServer resources. While efficient for basic setups, this architecture becomes a limitation with Workload Identity or IAM Roles: a single controller instance runs under one identity and cannot easily assume multiple distinct identities to connect to SQL Servers across different security boundaries.

We propose refactoring the operator to a Manager-Worker architecture.

Proposed Architecture: Manager-Worker Model

The main operator controller will act as a Manager. For every SQL Server instance (in-cluster or external) that needs management, the Manager will deploy and maintain a dedicated Worker Pod.
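As a rough illustration of the "one Worker Pod per instance" rule, the Manager's reconcile step could compare the declared resources with the Worker Pods that already exist. This is only a sketch: `worker_name`, `plan_workers`, and the `sqlworker-` naming scheme are hypothetical and not part of the operator today.

```python
def worker_name(kind: str, name: str) -> str:
    """Derive a deterministic Worker Pod name (hypothetical naming scheme)."""
    return f"sqlworker-{kind.lower()}-{name}"


def plan_workers(resources, existing_pods):
    """Compare declared SQL Server resources with existing Worker Pods.

    resources:     iterable of (kind, name) tuples, e.g. ("SQLServer", "orders")
    existing_pods: iterable of Worker Pod names currently in the cluster
    Returns (to_create, to_delete) as sorted lists of pod names.
    """
    desired = {worker_name(kind, name) for kind, name in resources}
    existing = set(existing_pods)
    # Pods that should exist but do not, and pods whose resource is gone.
    return sorted(desired - existing), sorted(existing - desired)


if __name__ == "__main__":
    declared = [("SQLServer", "orders"), ("ExternalSQLServer", "billing")]
    running = ["sqlworker-sqlserver-orders", "sqlworker-sqlserver-legacy"]
    print(plan_workers(declared, running))
```

Deterministic names keep the step idempotent: reconciling the same resources twice yields nothing to create or delete the second time.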

Key Benefits:

  • Identity Isolation: Each Worker Pod can run under its own ServiceAccount with specific Workload Identity or IAM Role bindings.
  • Scalability: Management tasks are distributed across pods rather than taxing a single controller's reconcile loop.
  • Security: Authentication credentials (or tokens) are localized to the pod managing that specific instance.

Workflow:

  1. User creates a SQLServer or ExternalSQLServer custom resource.
  2. The Main Controller (Manager) reconciles the resource and creates a Worker Pod.
  3. Authentication details are passed to the Worker Pod via environment variables:
    • Basic Auth: SQL_USER, SQL_PASSWORD (from secrets).
    • Identity-based Auth: AUTH_METHOD=WorkloadIdentity (Worker pod uses its own assigned identity).
  4. The Worker Pod connects to its assigned SQL Server and performs all DDL/DCL operations (creating databases, schemas, logins, users).
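The environment-variable contract in step 3 could look roughly like the sketch below, which shows both sides: the Manager building a Pod manifest and the Worker deciding how to authenticate. `SQL_USER`, `SQL_PASSWORD`, and `AUTH_METHOD=WorkloadIdentity` come from this proposal; everything else is an assumption, including the function names, the `AUTH_METHOD=Basic` value, and the Secret keys `username` and `password`.

```python
def build_worker_pod(name, image, auth_method, secret_name=None, service_account=None):
    """Manager side: return a Pod manifest (as a dict) for one Worker."""
    env = [{"name": "AUTH_METHOD", "value": auth_method}]
    spec = {"containers": [{"name": "worker", "image": image, "env": env}]}

    if auth_method == "Basic":
        if not secret_name:
            raise ValueError("Basic auth requires a Secret name")
        # Credentials are referenced from a Secret, never inlined in the manifest.
        # The Secret keys "username" and "password" are assumed here.
        for var, key in (("SQL_USER", "username"), ("SQL_PASSWORD", "password")):
            env.append({
                "name": var,
                "valueFrom": {"secretKeyRef": {"name": secret_name, "key": key}},
            })
    elif auth_method == "WorkloadIdentity":
        if not service_account:
            raise ValueError("WorkloadIdentity requires a ServiceAccount")
        # The identity comes from the pod's own ServiceAccount binding.
        spec["serviceAccountName"] = service_account
    else:
        raise ValueError(f"unsupported auth method: {auth_method}")

    return {"apiVersion": "v1", "kind": "Pod", "metadata": {"name": name}, "spec": spec}


def resolve_auth(environ):
    """Worker side: decide how to authenticate from environment variables."""
    method = environ.get("AUTH_METHOD", "Basic")  # default is an assumption
    if method == "WorkloadIdentity":
        return {"method": method}
    if method == "Basic":
        missing = [v for v in ("SQL_USER", "SQL_PASSWORD") if v not in environ]
        if missing:
            raise ValueError(f"missing environment variables: {missing}")
        return {"method": method, "user": environ["SQL_USER"],
                "password": environ["SQL_PASSWORD"]}
    raise ValueError(f"unsupported auth method: {method}")
```

In the identity-based case no secret material appears in the pod spec at all, which is the isolation property this proposal is after.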

Visual Representation

graph TD
    subgraph "Kubernetes Cluster"
        M[Main Controller - Manager] -- manages --> W1[Worker Pod - Instance A]
        M -- manages --> W2[Worker Pod - Instance B]
    end

    subgraph "Target SQL Servers"
        W1 -- "Auth: Basic/Workload Identity" --> S1[(SQL Instance A)]
        W2 -- "Auth: Basic/Workload Identity" --> S2[(SQL Instance B)]
    end

Why is this needed?

This refactoring is essential for supporting modern cloud-native authentication methods and ensuring that the operator can scale securely in complex enterprise environments.

Implementation Notes:

  • Consider using a Sidecar or a Job-based approach if long-running workers aren't required, though a long-running pod might be better for continuous reconciliation.
  • Define a standard "Worker" image that contains the minimal logic for SQL management.
  • Ensure the Manager can track Worker Pod health and restart them if they fail.
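For the health-tracking note, one possible shape is a function that maps each Worker Pod's phase to an action for the Manager. The phase names are the standard Kubernetes Pod phases; `health_actions` and the action labels are hypothetical, and the policy itself (recreating on `Succeeded`, for example) is an assumption that fits long-running workers only.

```python
def health_actions(pod_phases):
    """Map each Worker Pod's phase to an action for the Manager.

    pod_phases: dict of pod name -> Kubernetes Pod phase
                ("Pending", "Running", "Succeeded", "Failed", "Unknown")
    Returns a dict of pod name -> "none", "wait", or "recreate".
    """
    actions = {}
    for pod, phase in pod_phases.items():
        if phase == "Running":
            actions[pod] = "none"
        elif phase in ("Failed", "Succeeded"):
            # A long-running worker that has exited, cleanly or not, is replaced.
            actions[pod] = "recreate"
        else:
            # Pending or Unknown: give the cluster time before acting.
            actions[pod] = "wait"
    return actions


if __name__ == "__main__":
    print(health_actions({"worker-a": "Running", "worker-b": "Failed"}))
```

If each Worker were wrapped in a single-replica Deployment instead of a bare Pod, Kubernetes would replace failed pods itself and the Manager would only need to watch the Deployment's status.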
