Enable model-registry with UI by default #3318

Open
Raakshass wants to merge 15 commits into kubeflow:master from Raakshass:enable-model-registry-ui

@Raakshass
Contributor

@Raakshass Raakshass commented Jan 4, 2026

Summary of Changes

This PR enables the Model Registry server, UI, and demo catalog components in the default Kubeflow installation (example/kustomization.yaml), updates the Central Dashboard to include a Model Registry menu entry, adds README documentation, and adds CI tests with model CRUD verification.

Components added to example/kustomization.yaml:

  • Model Registry Server with PostgreSQL database (overlays/postgres)
  • Model Registry Istio networking / VirtualService (options/istio)
  • Model Registry UI with Istio integration (options/ui/overlays/istio)
  • Model Catalog demo (options/catalog/overlays/demo)

Central Dashboard:

  • Updated applications/centraldashboard/overlays/oauth2-proxy/kustomization.yaml to use istio base overlay instead of kserve
  • Added patches/configmap.yaml with Model Registry menu entry alongside existing KServe Endpoints entry

CI / Testing:

  • Added tests/model_registry_install.sh — installs Model Registry server, UI, database, Istio networking, and catalog
  • Added tests/model_registry_test.sh — CRUD tests (creates RegisteredModel, ModelVersion, ModelArtifact, verifies listing) + Istio gateway auth tests
  • Updated .github/workflows/model_registry_test.yaml to run install and test scripts

Documentation:

  • Added "Model Registry" section to README.md under "Install Individual Components"

Dependencies

No external dependencies. Uses existing upstream manifests from applications/model-registry/.

Related Issues

Closes #3047

@github-actions

github-actions bot commented Jan 4, 2026

Welcome to the Kubeflow Manifests Repository

Thanks for opening your first PR. Your contribution means a lot to the Kubeflow community.

Before making more PRs:
Please ensure your PR follows our Contributing Guide.
Please also be aware that many components are synchronized from upstream via the scripts in /scripts.
So in some cases you have to fix the problem in the upstream repositories first, but you can use a PR against kubeflow/manifests to test the platform integration.

Community Resources:

Thanks again for helping to improve Kubeflow.

@Raakshass
Contributor Author

hey @juliusvonkohout can you just review this pr.
Thank you

@juliusvonkohout
Member

juliusvonkohout commented Jan 11, 2026

> hey @juliusvonkohout can you just review this pr. Thank you

I am still on vacation, but maybe @tarilabs can help sooner.

Are you sure that the catalog and everything else is properly exposed in the dashboard UI, @Raakshass? Do you mind sharing screenshots? Think of how we expose the KServe models web application in the dashboard.

@juliusvonkohout
Member

juliusvonkohout commented Jan 11, 2026

@Raakshass are you sure that it is properly exposed, similar to the KServe models web application (endpoints) in the dashboard UI? I would like to see screenshots of the dashboard and the actual UI changes you made. Please check the original issue and the related ones in the Model Registry git repository. I think you are missing 80% of the work.

@tarilabs
Member

Could you kindly share a screenshot on this thread with @ederign, as Julius suggested?

@sameerdattav
Contributor

Hey @ederign @juliusvonkohout @tarilabs,

I’ve been following this PR and the related issue for a few days and thought I could jump in to help move things forward.
So I went ahead and opened a fresh PR that includes all the required changes along with validation screenshots:

#3323

I’d really appreciate a review when you get a chance. Thanks!

@ederign
Member

ederign commented Jan 12, 2026

I've commented on #3323

@Raakshass
Contributor Author

Hi @juliusvonkohout @tarilabs — addressing the feedback about showing the actual dashboard/UI change.

What changed in this update

  • Added a Central Dashboard overlay: applications/centraldashboard/overlays/model-registry/
  • Added a JSON6902 patch that appends a new dashboard menu entry pointing to /model-registry/
    • Value added: {"text": "Model Registry", "link": "/model-registry/"}

Why this change

Kubeflow’s documentation for Model Registry installation and dashboard customization indicates the Model Registry entry should be added to the Central Dashboard configuration so it appears in the sidebar menu.

Verification status

  • This PR is focused on manifests wiring (dashboard link + overlays).
  • Local end-to-end screenshots are still pending; will follow up with real deployment verification + screenshots once the deployment environment is ready.

If you’d like the menu item to also include type/icon fields (as in the docs examples), please confirm the preferred values and I can update it accordingly.

@juliusvonkohout
Member

juliusvonkohout commented Jan 13, 2026

I think you can use a general named one called applications/centraldashboard/overlays/kustomization.yaml

We should also merge https://github.com/kubeflow/manifests/blob/master/applications/centraldashboard/overlays/oauth2-proxy/kustomization.yaml into that because oauth2-proxy is anyway mandatory.

@Raakshass
Contributor Author

@juliusvonkohout Refactor complete!
I've consolidated the oauth2-proxy and model-registry overlays into a single applications/centraldashboard/overlays/kustomization.yaml as requested. Also switched to a Strategic Merge Patch to fix the JSON syntax error.
Screenshot 2026-01-14 001428
Dashboard link is verified locally (screenshot attached).

@Raakshass
Contributor Author

Hi @juliusvonkohout @kimwnasptd,

I wanted to follow up on this PR. I noticed it's listed as a related issue for GSoC 2026 Project 4 (Platform Scalability and Security) - which is exciting!

Is there anything else needed from my side to move this forward? Happy to make any additional changes.

Thanks for your time!

- Resolved conflicts in model_catalog_test.yaml, model_registry_test.yaml, and README.md

- Kept composite action usage and Model Registry README section from PR branch

- Incorporated all upstream master changes

Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>
- Fix yamllint violation: change indentation from 4 to 2 spaces for list items under steps (indent-sequences: false)

- Remove nonexistent kubectl_install.sh step — kubectl is already installed by install_KinD_create_KinD_cluster_install_kustomize.sh

- Now matches master's exact 8-step setup pattern

Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>
@Raakshass
Contributor Author

@juliusvonkohout All open comments have been addressed:

Copilot comments:

  • README section added (commit 6d7ad0ae)
  • Istio VirtualService already included (commit d16621d)
  • Test URLs confirmed intentional (UI BFF route, not raw API)

@manaswinidas comments:

  • Renamed "Model Registry Catalog" → "Model Catalog" (commit 309edbb)

CI deduplication:

  • Extracted shared setup into composite action .github/actions/setup-kubeflow-base/action.yaml (commit 6d7ad0ae, fix in c9ccbdc)

All 8 CI checks are green. Ready for review when convenient.

Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>

@Raakshass Raakshass marked this pull request as ready for review March 8, 2026 02:48
- Remove .github/actions/setup-kubeflow-base/ folder (separate PR planned)
- Revert model_catalog_test.yaml to inline setup steps (match master)
- Revert model_registry_test.yaml to inline setup steps (match master)
- Add RegisteredModel, ModelVersion, ModelArtifact creation to test.sh
- Verify created model appears in API listing

Addresses reviewer feedback to remove the extra folder and ensure tests
cover basic user interactions like creating a dummy model.

Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>
@google-oss-prow google-oss-prow bot added size/L and removed size/XL labels Mar 10, 2026
- Move tests/model_registry_test/install.sh to tests/model_registry_install.sh
- Move tests/model_registry_test/test.sh to tests/model_registry_test.sh
- Delete tests/model_registry_test/ subfolder (consistency with master)
- Fix ModelVersion POST: add registeredModelId to request body (fixes 422)
- Fix ModelArtifact POST: add modelVersionId to request body
- Update model_registry_test.yaml workflow to reference flat test files
- Drop extra Spark details link from README (matches master)
- Update README Model Registry section to use flat paths

Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>
Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>
- Replace while-loop port-forward waits with timeout+until one-liner
- Consolidate inline Model Registry steps in full integration test to use model_registry_install.sh and model_registry_test.sh scripts
- Remove stale PID variable references

Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>
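
The "timeout+until one-liner" named in the commit message above can be sketched as a small helper. This is an illustration only, not the code from the PR's scripts: the function name `wait_until`, the port `8080`, and the probed URL are assumptions, and `timeout` is the GNU coreutils command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a command once per second until it succeeds
# or the deadline passes. GNU `timeout` exits with 124 when the time is up.
wait_until() {
  local seconds="$1"; shift
  # The command string is re-evaluated by the inner bash on every iteration.
  timeout "$seconds" bash -c "until $*; do sleep 1; done"
}

# Usage sketch (assumed port and path; needs a running port-forward):
# kubectl port-forward -n kubeflow svc/model-registry-service 8080:8080 &
# wait_until 30 "curl -sf -o /dev/null http://localhost:8080/api/model_registry/v1alpha3/registered_models"
```

Compared with a hand-written while loop and a retry counter, the deadline lives in one place and a timed-out wait is distinguishable by its exit code.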
@google-oss-prow google-oss-prow bot added size/XL and removed size/L labels Mar 12, 2026
@juliusvonkohout
Member

@hbelmiro @Al-Pragliola could you take a look and LGTM if this is how you want Model Registry and the catalog to be set up?
I think this PR also switches to Postgres.
@Raakshass are you 100% sure that no new namespaces are introduced?
Please provide a diff that shows which kinds of resources are now created in which namespaces.

@Raakshass
Contributor Author

Hi @juliusvonkohout — here is the architectural change overview you requested. Every claim below is verified by reading the raw upstream YAML files on master (links provided).

1. No New Namespaces — Confirmed

I verified every individual YAML resource file referenced by the 4 new example/kustomization.yaml entries. No Namespace kind resource is created anywhere. No new namespaces are introduced.

How namespace is determined per component:

| Component | Path | Namespace mechanism | Target |
|---|---|---|---|
| Model Registry Server + DB | overlays/postgres | No namespace: in kustomization or resource YAMLs → inherited from -n kubeflow at apply time | kubeflow |
| Istio networking | options/istio | No namespace: → inherited. Service hosts hardcoded to .kubeflow.svc.cluster.local (proof) | kubeflow |
| UI | options/ui/overlays/istio | Explicit namespace: kubeflow in kustomization.yaml | kubeflow |
| Model Catalog (demo) | options/catalog/overlays/demo | No namespace: → inherited from -n kubeflow at apply time | kubeflow |
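
A namespace claim like the one in this table can be backed by a summary of the rendered manifests. The following is a rough sketch and not part of the PR: `summarize_kinds` is a hypothetical helper that reads only top-level `kind:` lines and `namespace:` lines indented two spaces under `metadata:`, so it is a text filter rather than a real YAML parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count rendered resources per (kind, namespace).
# Reads multi-document YAML on stdin; resources without an explicit
# metadata.namespace are reported as <unset>.
summarize_kinds() {
  awk '
    /^---/      { if (kind != "") print kind, (ns == "" ? "<unset>" : ns)
                  kind = ""; ns = ""; in_meta = 0; next }
    /^kind:/    { kind = $2 }
    /^metadata:/ { in_meta = 1; next }
    /^[^ ]/     { in_meta = 0 }   # any other top-level key ends metadata
    in_meta && /^  namespace:/ { ns = $2 }
    END         { if (kind != "") print kind, (ns == "" ? "<unset>" : ns) }
  ' | sort | uniq -c
}

# Usage sketch (path as listed in the PR description):
# kustomize build applications/model-registry/upstream/overlays/postgres | summarize_kinds
```

Rows reported as `<unset>` correspond to the "inherited from -n kubeflow at apply time" entries, and a `Namespace` kind in the output would indicate a newly created namespace.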

2. PSS Restricted Compliance — All Workloads Verified

Since no new namespaces are introduced, the existing kubeflow namespace PSS labels apply. Additionally, I verified that all 5 workloads are PSS restricted-compliant:

Workload seccompProfile runAsNonRoot allowPrivilegeEscalation drop: ALL Source
Deployment/model-registry-db RuntimeDefault false YAML
Deployment/model-registry-deployment RuntimeDefault false YAML
Deployment/model-registry-ui RuntimeDefault false YAML
Deployment/model-catalog-server RuntimeDefault false YAML
StatefulSet/model-catalog-postgres RuntimeDefault ✅ (uid/gid: 70) false YAML

The catalog demo overlay also adds an initContainer (perf-data-init, busybox) which has its own securityContext with allowPrivilegeEscalation: false and drop: ALL.

3. Complete Resource Diff

Master's example/kustomization.yaml: Zero model-registry entries.

This PR adds:

+- ../applications/model-registry/upstream/overlays/postgres
+- ../applications/model-registry/upstream/options/istio
+- ../applications/model-registry/upstream/options/ui/overlays/istio
+- ../applications/model-registry/upstream/options/catalog/overlays/demo

Namespace-scoped resources (all in kubeflow):

Model Registry Server + PostgreSQL:

| Kind | Name |
|---|---|
| Deployment | model-registry-deployment |
| Service | model-registry-service |
| ServiceAccount | model-registry-server |
| ConfigMap | model-registry-configmap |
| Deployment | model-registry-db |
| Service | model-registry-db |
| PVC | metadata-postgres (10Gi) |
| ConfigMap | model-registry-db-parameters (generated) |
| Secret | model-registry-db-secrets (generated) |

Istio Networking:

| Kind | Name |
|---|---|
| VirtualService | model-registry (prefix: /api/model_registry/, gateway: kubeflow-gateway) |
| DestinationRule | model-registry-service (mTLS) |
| AuthorizationPolicy | model-registry-service (ALLOW all) |

UI:

| Kind | Name |
|---|---|
| Deployment | model-registry-ui |
| Service | model-registry-ui-service (port 80 via patch) |
| ServiceAccount | model-registry-ui |
| VirtualService | model-registry-ui (prefix: /model-registry/) |
| DestinationRule | model-registry-ui (mTLS) |
| AuthorizationPolicy | model-registry-ui (source: istio-ingressgateway-service-account) |

Model Catalog (demo):

| Kind | Name |
|---|---|
| Deployment | model-catalog-server |
| Service | model-catalog |
| StatefulSet | model-catalog-postgres (postgres:17.6) |
| Service | model-catalog-postgres |
| PVC | model-catalog-postgres (5Gi) |
| ConfigMap | model-catalog-sources (generated) |
| ConfigMap | model-catalog-demo-perf-data (generated) |
| Secret | model-catalog-postgres (generated) |
| Secret | model-catalog-hf-api-key (generated) |

Cluster-scoped resources (from UI component):

| Kind | Name | Purpose |
|---|---|---|
| ClusterRole | model-registry-ui-services-reader | get/list/watch Services |
| ClusterRoleBinding | model-registry-ui-services-reader-binding | Binds above to SA model-registry-ui |
| ClusterRole | model-registry-retrieve-clusterrolebindings | get/list/watch ClusterRoleBindings |
| ClusterRoleBinding | model-registry-retrieve-clusterrolebindings-binding | Binds above to SA model-registry-ui |
| ClusterRole | model-registry-create-sars | create SubjectAccessReviews |
| ClusterRoleBinding | model-registry-create-sars-binding | Binds above to SA model-registry-ui |

Source: model-registry-ui-role.yaml

Note: The Model Catalog has its own separate PostgreSQL (model-catalog-postgres, StatefulSet, postgres:17.6) distinct from Model Registry's database (model-registry-db, Deployment, postgres:16-alpine).

4. PostgreSQL Switch — Confirmed

This PR switches from overlays/db (MySQL) to overlays/postgres (PostgreSQL):

| | Master CI (inline) | This PR |
|---|---|---|
| Overlay path | overlays/db | overlays/postgres |
| DB image | mysql:8.0 | postgres:16-alpine |
| Server DSN | mysql:// via --embedmd-database-dsn | postgresql:// via --embedmd-database-type=postgres |
| Base resources | ../../base (identical) | ../../base (identical) |

5. Central Dashboard Change

Base changed from ../../upstream/overlays/kserve to ../../upstream/overlays/istio, with a custom configmap patch.

Reason: The upstream kserve overlay extends ../istio and adds a configmap patch with menu items. This PR goes to istio directly and provides its own equivalent configmap patch — identical to the kserve overlay's patch except for one addition:

{
    "icon": "assignment",
    "link": "/model-registry/",
    "text": "Model Registry",
    "type": "item"
}

All existing menu items (Notebooks, TensorBoards, Volumes, Katib, KServe Endpoints, Pipelines) are preserved exactly.

6. CI/Testing Changes

Inline CI steps consolidated into reusable scripts:

  • tests/model_registry_install.sh — installs all 4 components with kubectl apply -n kubeflow, includes kubectl wait with diagnostic output on failure
  • tests/model_registry_test.sh — CRUD tests (RegisteredModel → ModelVersion → ModelArtifact), plus authenticated/unauthorized gateway access tests
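
The create chain named in the test description (RegisteredModel → ModelVersion → ModelArtifact) might be sketched as follows. This is not the PR's test script: the base URL, the `v1alpha3` REST prefix, the collection paths, and all function names are assumptions, and `jq` is assumed to be installed. The parent-ID fields follow the commit messages in this thread (`registeredModelId`, `modelVersionId`).

```shell
#!/usr/bin/env bash
# Assumed endpoint of a local port-forward; adjust to the real deployment.
MR_API="${MR_API:-http://localhost:8080/api/model_registry/v1alpha3}"

# Request bodies. Each child object carries its parent's ID, which is the
# fix described in the commit history for the 422 response.
registered_model_body() { printf '{"name":"%s"}' "$1"; }
model_version_body()    { printf '{"name":"%s","registeredModelId":"%s"}' "$1" "$2"; }
model_artifact_body()   { printf '{"name":"%s","uri":"%s","modelVersionId":"%s"}' "$1" "$2" "$3"; }

# POST a body to a collection (hypothetical paths) and print the new ID.
mr_create() {
  curl -sf -X POST "$MR_API/$1" -H 'Content-Type: application/json' -d "$2" \
    | jq -r '.id'
}

run_crud() {
  local model_id version_id
  model_id="$(mr_create registered_models "$(registered_model_body demo-model)")"
  version_id="$(mr_create model_versions "$(model_version_body v1 "$model_id")")"
  mr_create model_artifacts \
    "$(model_artifact_body demo-artifact s3://bucket/model "$version_id")" >/dev/null
  # The created model should then appear in the listing.
  curl -sf "$MR_API/registered_models" | grep -q '"demo-model"'
}

# Usage (requires a reachable Model Registry): run_crud && echo "CRUD OK"
```

Keeping the payload builders separate from the HTTP calls lets the request shapes be checked without a cluster.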

Both full_kubeflow_integration_test.yaml and model_registry_test.yaml now call these scripts instead of inline commands.

Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>

@abdullahpathan22

Hello @Raakshass please let me know if you need any help to close this PR!!

@juliusvonkohout
Member

> Hello @Raakshass please let me know if you need any help to close this PR!!

Please test and provide feedback. Now, after the 26.03 release, we could merge it.

@abdullahpathan22

Yeah Sure!

@abdullahpathan22

abdullahpathan22 commented Mar 24, 2026

Hello @juliusvonkohout,

Verification Report: PR #3318 — Model Registry Integration

Hi, I have completed full local testing of this PR. Here are my findings and feedback:

✅ What Works

  • Deployment: All pods came up Running/Ready in a KinD cluster with manually loaded OCI images.
  • API CRUD: Full Model → Version → Artifact lifecycle works correctly via the REST API.
  • Persistence: Data survives pod restarts and is correctly stored in Postgres (confirmed via conflict detection across multiple runs).
  • Istio Gateway: Model Registry is reachable through the ingress gateway; authorized ServiceAccount tokens return 200 OK.
  • Manifest Quality: All manifests build cleanly with Kustomize 5, PSS Restricted labels and security contexts are active, and the Dashboard sidebar link is correctly injected.

⚠️ Noteworthy Finding

  • Permissive AuthorizationPolicy: The current AuthorizationPolicy is set to ALLOW ALL. This is fine for getting started quickly, but will need to be hardened before this is suitable for secure multi-user environments.

@juliusvonkohout
Member

> ⚠️ Noteworthy Finding
>
> * **Permissive AuthorizationPolicy**: The current `AuthorizationPolicy` is set to `ALLOW ALL`. This is fine for getting started quickly, but will need to be hardened before this is suitable for secure multi-user environments.

Can you elaborate a bit and provide links to the files?

@abdullahpathan22

abdullahpathan22 commented Mar 26, 2026

Hello @juliusvonkohout
Please take a look when you get time.
Thank you!

While the deployment is functional, the Istio-level security posture is currently permissive (ALLOW ALL), which presents a multi-tenancy risk.

1. Relevant Manifests


2. Technical Analysis of AuthorizationPolicy

In istio-authorization-policy.yaml, the current specification is:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
spec:
  action: ALLOW
  selector:
    matchLabels:
      component: model-registry-server
  rules:
  - {} # <--- Permissive Wildcard

Security Implication:
In the Istio AuthorizationPolicy spec, the rules field is an array of Rule objects. An empty rule {} is interpreted as a universal match. Because the action is set to ALLOW, this policy explicitly permits all traffic (unauthenticated and unauthorized) to reach the Model Registry server once it passes the ingress gateway.

During my local functional tests, I verified that an unauthorized identity from the default namespace could successfully retrieve metadata from the kubeflow-user-example-com namespace with a 200 OK response, confirming that Istio is not currently enforcing namespace isolation.


3. Multi-Tenancy Routing & VirtualService

The VirtualService correctly defines the ingress path:

http:
- match:
  - uri:
      prefix: /api/model_registry/
  route:
  - destination:
      host: model-registry-service.kubeflow.svc.cluster.local

While the routing prefix is correct for the v0.3.x API, the lack of an accompanying RequestAuthentication resource means the JWT (JSON Web Token) is not being validated at the model-registry-server sidecar.


4. Recommended Hardening Roadmap

To achieve production-grade security for Kubeflow's multi-user environment, we need to implement a Zero-Trust model:

  1. Identity Enforcement: Apply a RequestAuthentication resource to ensure the model-registry-server sidecar rejects any request without a valid JWT from the cluster's OIDC provider.
  2. JWT Claim Validation: Update the AuthorizationPolicy to match the request.auth.claims["namespace"] against the target resource.
  3. Path-Based RBAC: Specialize rules to allow GET operations for the viewer role and restrict POST/PUT/DELETE to the editor role using the to.operation.methods field in the Istio policy.

Example Hardened Spec:

rules:
- from:
  - source:
      requestPrincipals: ["*"] 
  when:
  - key: request.auth.claims[kubeflow-namespace]
    values: ["${target-namespace}"]

…pattern

Replace permissive rules: [{}] with the same dual-path pattern used by ml-pipeline-ui: (1) allow ingress-gateway traffic (authenticated by authservice), (2) allow internal K8s JWTs while blocking kubeflow-userid header spoofing. Add Authorization headers to port-forward CRUD tests and add negative security tests for unauthenticated access and identity spoofing.

Ref: applications/pipeline/upstream/base/installs/multi-user/istio-authorization-config.yaml
Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>
kubectl port-forward creates a direct TCP tunnel to the pod, bypassing the Istio sidecar proxy. AuthorizationPolicy rules are not enforced on port-forwarded traffic. Remove Tests 7-8 and unnecessary Authorization headers from CRUD tests. AP enforcement is validated through gateway tests (authorized=200, unauthorized=403).

Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>
@Raakshass
Contributor Author

Raakshass commented Mar 28, 2026

@juliusvonkohout @abdullahpathan22

Status Update: Security finding addressed. PR is Merge-Ready.

1. Security Hardening: AuthorizationPolicy

  • Flaw: Permissive wildcard allow-all access.
  • Fix: Implemented KFP's dual-path AuthorizationPolicy pattern.
    • Path 1: Allows external traffic authenticated at the gateway via istio-ingressgateway-service-account.
    • Path 2: Allows internal K8s ServiceAccount JWT traffic only if it lacks a kubeflow-userid header, strictly preventing identity spoofing across the mesh.

2. CI Test Modernization

  • Flaw: Tests 7 & 8 attempted to verify HTTP 403 responses via kubectl port-forward.
  • Fix: Removed them. Port-forwarding bypasses the Istio sidecar (Envoy), so AuthorizationPolicy rules cannot be exercised through it. Security enforcement is now validated only via gateway-routed traffic (Test 6).
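
A gateway-routed check of the kind described here (authorized=200, unauthorized=403) could be sketched as below. This is a hedged illustration, not Test 6 itself: the helper names, the gateway address, the request path, and the ServiceAccount used for the token are all assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare an observed HTTP status with the expected one.
expect_status() {
  local label="$1" expected="$2" actual="$3"
  if [ "$actual" = "$expected" ]; then
    echo "PASS: $label ($actual)"
  else
    echo "FAIL: $label (expected $expected, got $actual)" >&2
    return 1
  fi
}

# Print only the status code of a GET request; extra arguments go to curl.
http_status() {
  local url="$1"; shift
  curl -s -o /dev/null -w '%{http_code}' "$@" "$url"
}

# Usage sketch (assumed gateway port-forward, path, and token source):
# GATEWAY=http://localhost:8081
# URL="$GATEWAY/api/model_registry/v1alpha3/registered_models"
# TOKEN="$(kubectl -n kubeflow-user-example-com create token default-editor)"
# expect_status authorized 200 "$(http_status "$URL" -H "Authorization: Bearer $TOKEN")"
# expect_status unauthorized 403 "$(http_status "$URL")"
```

Because the requests enter through the ingress gateway, they traverse the sidecar, which is the path where the policy is actually enforced.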

3. CI Status

  • 6/7 Checks Green (Build, Linting, MR Tests, DCO).
  • 1/7 Failing: Test E2E Integration is failing due to a known, transient GitHub CDN HTTP 502 error when downloading Kustomize across the repo. Pure infrastructure failure; completely unrelated to PR code.

Signed-off-by: Siddhant Jain <siddhantjainofficial26@gmail.com>

@abdullahpathan22

Review: Potential Security Improvement — AuthorizationPolicy for Model Registry

Hello @juliusvonkohout / @Raakshass,

I noticed one security gap that we could address before merging.

The Problem

Looking at the resource diff, the current AuthorizationPolicy for model-registry-service is set to ALLOW all:

AuthorizationPolicy: model-registry-service → action: ALLOW all

This means any pod in the cluster can call the Model Registry API freely, with no identity checks. In a multi-tenant Kubeflow deployment this is a real security gap — a rogue pod in any namespace could read or modify another user's registered models.

Potential Solution — Follow KFP's Dual-Path Pattern

KFP already solves this exact problem in manifests/kustomize/base/installs/multi-user/istio-authorization-config.yaml using a dual-path AuthorizationPolicy. We could apply the same pattern to Model Registry:

# For model-registry-service (the API server)
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: model-registry-service
  namespace: kubeflow
spec:
  selector:
    matchLabels:
      app: model-registry
  rules:
  # Path 1: Allow real user traffic authenticated at the Istio gateway
  # (requests coming through the Central Dashboard)
  - from:
    - source:
        principals:
          - cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account

  # Path 2: Allow internal K8s service-to-service traffic
  # Only if it carries a K8s JWT (Authorization header)
  # AND does NOT carry a kubeflow-userid header
  # This prevents any pod from spoofing a user identity across the mesh
  - when:
    - key: request.headers[authorization]
      values:
      - "*"
    - key: request.headers[kubeflow-userid]
      notValues:
      - "*"
---
# For model-registry-ui (the frontend)
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: model-registry-ui
  namespace: kubeflow
spec:
  selector:
    matchLabels:
      app: model-registry-ui
  rules:
  # Allow only traffic from the Istio ingress gateway
  - from:
    - source:
        principals:
          - cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account

Why This Works

| Path | Who it allows | Why |
|---|---|---|
| Path 1 | istio-ingressgateway-service-account | Real users browsing via Central Dashboard |
| Path 2 | Internal services with K8s JWT, no kubeflow-userid | KFP pipelines registering models, internal automation |
| Blocked | Everyone else | Any rogue pod, cross-namespace spoofing attempts |

This is the same security model KFP uses and would make Model Registry consistent with the rest of the Kubeflow platform.
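
The two rules proposed for model-registry-service can be read as a small decision function. The sketch below only models the rule logic of the YAML above for illustration; it is not how Envoy evaluates policies, and the function name `mr_api_allows` and the example principals are hypothetical.

```shell
#!/usr/bin/env bash
# Principal of the ingress gateway, as written in the proposed policy.
GATEWAY_SA="cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account"

# Args: source principal, Authorization header value, kubeflow-userid value.
# An empty string stands for "header absent". Returns 0 when a rule matches.
mr_api_allows() {
  local principal="$1" authorization="$2" userid="$3"
  # Path 1: the request comes from the ingress gateway's mTLS identity.
  [ "$principal" = "$GATEWAY_SA" ] && return 0
  # Path 2: Authorization header present AND kubeflow-userid header absent.
  [ -n "$authorization" ] && [ -z "$userid" ] && return 0
  # No rule matched: with an ALLOW policy in place, the request is denied.
  return 1
}
```

For example, `mr_api_allows "cluster.local/ns/default/sa/default" "Bearer x" "victim@example.com"` returns non-zero, which corresponds to the "Blocked" row: an in-mesh caller that sets kubeflow-userid itself matches neither path.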


Development

Successfully merging this pull request may close these issues.

Enable model-registry with UI by default

8 participants