
chore: Enable use of local carbide-api instead of mock-core #197

Open
pbreton wants to merge 1 commit into main from chore/local-carbide

Conversation

@pbreton
Contributor

@pbreton pbreton commented Mar 5, 2026

  • Update local stack to connect to local carbide-core
  • Add a test script for validating the connection to carbide-api

This depends on PR #449.

Copilot AI review requested due to automatic review settings March 5, 2026 02:20
@copy-pr-bot

copy-pr-bot Bot commented Mar 5, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions

github-actions Bot commented Mar 5, 2026

Test Results

8 700 tests  ±0   8 700 ✅ ±0   8m 9s ⏱️ - 1m 35s
  143 suites ±0       0 💤 ±0 
   14 files   ±0       0 ❌ ±0 

Results for commit be29eb2. ± Comparison against base commit 75dc45c.

♻️ This comment has been updated with latest results.


Copilot AI left a comment


Pull request overview

Enables the local kind stack to use a locally running carbide-api (from bare-metal-manager-core) instead of the in-cluster mock-core, and adds a smoke-test script to validate the end-to-end API → core integration path.

Changes:

  • Add scripts/test-local-core.sh to exercise auth, tenant/site discovery, and resource creation through the local core path.
  • Update Makefile kind-reset to optionally skip building/deploying mock-core and patch site-agent to point at the local core with TLS and injected certs.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

Files:

  • scripts/test-local-core.sh — New local integration smoke-test script for API → core connectivity and basic workflows.
  • Makefile — Adds a LOCAL_CORE mode to connect site-agent to a locally running core and manage certs/config patches during kind-reset.


Comment on lines +64 to +72
local path="$1"
local out http_code
out=$(curl -s -w "\n%{http_code}" "$BASE_URL$path" \
-H "Authorization: Bearer $TOKEN" \
-H "Accept: application/json")
http_code=$(echo "$out" | tail -n1)
body=$(echo "$out" | sed '$d')
if [[ "$http_code" -lt 200 || "$http_code" -ge 300 ]]; then
die "GET $path → HTTP $http_code: $body"

Copilot AI Mar 5, 2026


In api_get, body is assigned without being declared local, which leaks it into the global scope and makes the helper have hidden side effects. Declare local body (and include it in the local ... list) so each call is self-contained and can't accidentally overwrite a global body variable used elsewhere.
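The scoping behavior behind this comment is easy to demonstrate in isolation (a standalone sketch, not the script's actual helper):

```shell
#!/usr/bin/env bash
# Without `local`, an assignment inside a function lands in the caller's
# scope and silently clobbers any variable of the same name.
leaky_get()  { body="from-helper"; }
scoped_get() { local body; body="from-helper"; }

body="caller-value"
leaky_get
leaked="$body"        # "from-helper" — the global was overwritten

body="caller-value"
scoped_get
kept="$body"          # still "caller-value" — the helper stayed self-contained

echo "$leaked $kept"  # prints: from-helper caller-value
```

Adding `body` to the existing `local out http_code` line in `api_get` is enough to get the scoped behavior.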

UPDATE_TAG=$(kubectl exec -n "$PG_NAMESPACE" "$PG_STATEFULSET" -- \
psql -U "$PG_USER" -d "$PG_DB" -c \
"UPDATE site SET status = 'Registered', updated = NOW() WHERE id = '$SITE_ID';" \
2>&1 | grep -E '^UPDATE')

Copilot AI Mar 5, 2026


This command substitution relies on grep matching ^UPDATE. With set -euo pipefail, a no-match grep (exit 1) can terminate the script immediately, so the later explicit checks/die message may never run. Make the pipeline non-fatal (e.g., append || true / disable errexit just for this capture) and then validate UPDATE_TAG/exit status explicitly.

Suggested change
2>&1 | grep -E '^UPDATE')
2>&1 | grep -E '^UPDATE' || true)
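The failure mode is reproducible outside the script; a minimal sketch using simulated psql output rather than the real kubectl/psql command:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Simulated psql output with no leading UPDATE tag (e.g. an error message).
out='ERROR: relation "site" does not exist'

# grep exits 1 when nothing matches; under pipefail that failure propagates
# through the pipeline, and set -e would abort the script at this assignment
# before any later check could run. '|| true' makes the capture non-fatal.
tag=$(printf '%s\n' "$out" | grep -E '^UPDATE' || true)

if [[ -z "$tag" ]]; then
  echo "no UPDATE tag captured — explicit error handling runs"
fi
```

Without the `|| true`, the script would exit at the assignment and the intended `die` diagnostics would never print.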

Comment on lines +105 to +111
# Run a SQL statement inside the in-cluster postgres pod.
db_exec() {
local sql="$1"
kubectl exec -n "$PG_NAMESPACE" "$PG_STATEFULSET" -- \
psql -U "$PG_USER" -d "$PG_DB" -tAc "$sql"
}


Copilot AI Mar 5, 2026


db_exec() is defined but not used anywhere in this script, which adds noise and can mislead readers about how DB access is performed. Remove it, or refactor the later direct kubectl exec ... psql call sites to use this helper consistently.

Suggested change
# Run a SQL statement inside the in-cluster postgres pod.
db_exec() {
local sql="$1"
kubectl exec -n "$PG_NAMESPACE" "$PG_STATEFULSET" -- \
psql -U "$PG_USER" -d "$PG_DB" -tAc "$sql"
}

Comment thread Makefile
Comment on lines +468 to +477
@if [ "$(LOCAL_CORE)" = "true" ]; then \
echo "Creating core-grpc-client-site-agent-certs secret from $(LOCAL_CORE_CERTS_DIR)..." ; \
for f in ca.crt client.crt client.key; do \
[ -f "$(LOCAL_CORE_CERTS_DIR)/$$f" ] || { echo "ERROR: $$f not found in $(LOCAL_CORE_CERTS_DIR). Run gen-certs.sh in bare-metal-manager-core first."; exit 1; } ; \
done ; \
kubectl -n carbide-rest create secret generic core-grpc-client-site-agent-certs \
--from-file=ca.crt="$(LOCAL_CORE_CERTS_DIR)/ca.crt" \
--from-file=tls.crt="$(LOCAL_CORE_CERTS_DIR)/client.crt" \
--from-file=tls.key="$(LOCAL_CORE_CERTS_DIR)/client.key" \
--dry-run=client -o yaml | kubectl apply -f - ; \

Copilot AI Mar 5, 2026


When LOCAL_CORE=true, this creates/updates the core-grpc-client-site-agent-certs Secret, but deploy/kustomize/base/site-agent/certificate.yaml also defines a cert-manager Certificate that continuously reconciles the same secret name. That controller will likely overwrite the manually provided local-core certs, causing flapping/connection failures. Consider deleting/omitting the Certificate resource when LOCAL_CORE=true (e.g., kubectl delete certificate core-grpc-client-site-agent-certs) or using a distinct secret name for local-core certs and patching the site-agent deployment to mount that secret instead.

@pbreton pbreton changed the title from "chore: enable to use local carbide-api instead of mock-core" to "chore: Enable use of local carbide-api instead of mock-core" Mar 6, 2026
@pbreton pbreton force-pushed the chore/local-carbide branch from c8a1846 to 240cbc1 on March 6, 2026 03:27
@github-actions

github-actions Bot commented Mar 6, 2026

🔐 TruffleHog Secret Scan

No secrets or credentials found!

Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉

🔗 View scan details

🕐 Last updated: 2026-03-06 03:28:18 UTC | Commit: 240cbc1

@github-actions

github-actions Bot commented Mar 6, 2026

🛡️ Vulnerability Scan

🚨 Found 65 vulnerability(ies)
📊 vs main: 65 (no change)

Severity Breakdown:

  • 🔴 Critical/High: 65
  • 🟡 Medium: 0
  • 🔵 Low/Info: 0

🔗 View full details in Security tab

🕐 Last updated: 2026-03-19 21:01:44 UTC | Commit: 411c2dc

Copilot AI review requested due to automatic review settings March 6, 2026 19:31
@pbreton pbreton force-pushed the chore/local-carbide branch from 240cbc1 to d12e81a on March 6, 2026 19:31

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.



Comment on lines +89 to +97
# Resource already exists; the conflict response carries the existing id
# under .data.id — reuse it so the script is idempotent across runs.
existing_id=$(echo "$body" | jq -r '.data.id // empty')
if [[ -n "$existing_id" ]]; then
info "Already exists (HTTP 409) — reusing id: $existing_id"
echo "{\"id\":\"$existing_id\"}"
return
fi
die "POST $path → HTTP 409 (conflict) but no id in response: $body"

Copilot AI Mar 6, 2026


The script claims idempotency on HTTP 409 by extracting an existing resource id from .data.id, but many API conflict responses use validation errors (e.g., { "data": { "name": "..." } }) and do not include an id. In those cases existing_id will be empty and the script exits, so re-running the smoke test fails. Consider handling 409 by performing a follow-up lookup (e.g., GET list filtered by name/site and extracting .id) rather than assuming .data.id is present.

Suggested change
# Resource already exists; the conflict response carries the existing id
# under .data.id — reuse it so the script is idempotent across runs.
existing_id=$(echo "$body" | jq -r '.data.id // empty')
if [[ -n "$existing_id" ]]; then
info "Already exists (HTTP 409) — reusing id: $existing_id"
echo "{\"id\":\"$existing_id\"}"
return
fi
die "POST $path → HTTP 409 (conflict) but no id in response: $body"
# Resource already exists; the conflict response is expected to carry
# the existing id under .data.id — reuse it so the script is idempotent
# across runs.
existing_id=$(echo "$body" | jq -r '.data.id // empty')
if [[ -n "$existing_id" ]]; then
info "Already exists (HTTP 409) — reusing id from conflict response: $existing_id"
echo "{\"id\":\"$existing_id\"}"
return
fi
# Some APIs return validation-style conflict responses without an id.
# In that case, attempt a follow-up lookup (e.g., by name) to recover
# the existing resource id and keep the script idempotent.
local name lookup_out lookup_code lookup_body lookup_id
name=$(echo "$payload" | jq -r '.name // .data.name // empty')
if [[ -n "$name" ]]; then
info "HTTP 409 without id; attempting lookup by name: \"$name\""
lookup_out=$(curl -s -w "\n%{http_code}" "$BASE_URL$path?name=$name" \
-H "Authorization: Bearer $TOKEN" \
-H "Accept: application/json")
lookup_code=$(echo "$lookup_out" | tail -n1)
lookup_body=$(echo "$lookup_out" | sed '$d')
if [[ "$lookup_code" -ge 200 && "$lookup_code" -lt 300 ]]; then
# Try a few common response shapes to find an id.
lookup_id=$(echo "$lookup_body" | jq -r '.data.id // .data[0].id // .items[0].id // .id // empty')
if [[ -n "$lookup_id" ]]; then
info "Already exists (HTTP 409) — reusing id from lookup: $lookup_id"
echo "{\"id\":\"$lookup_id\"}"
return
fi
fi
fi
die "POST $path → HTTP 409 (conflict) but no id in response or lookup: $body"

Comment on lines +35 to +126
# Override defaults via environment variables:
# API_URL, KEYCLOAK_URL, ORG, SITE_NAME, PG_NAMESPACE, PG_STATEFULSET, PG_USER, PG_DB
#

set -euo pipefail

API_URL="${API_URL:-http://localhost:8388}"
KEYCLOAK_URL="${KEYCLOAK_URL:-http://localhost:8082}"
ORG="${ORG:-test-org}"
SITE_NAME="${SITE_NAME:-local-dev-site}"

# Postgres access via kubectl exec (no local psql or port-forward needed)
PG_NAMESPACE="${PG_NAMESPACE:-postgres}"
PG_STATEFULSET="${PG_STATEFULSET:-statefulset/postgres}"
PG_USER="${PG_USER:-forge}"
PG_DB="${PG_DB:-forge}"

BASE_URL="$API_URL/v2/org/$ORG/carbide"

# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------

step() { echo; echo "──── $* ────"; } >&2
ok() { echo " ✓ $*"; } >&2
info() { echo " $*"; } >&2
die() { echo; echo "ERROR: $*"; exit 1; } >&2

api_get() {
local path="$1"
local out http_code
out=$(curl -s -w "\n%{http_code}" "$BASE_URL$path" \
-H "Authorization: Bearer $TOKEN" \
-H "Accept: application/json")
http_code=$(echo "$out" | tail -n1)
body=$(echo "$out" | sed '$d')
if [[ "$http_code" -lt 200 || "$http_code" -ge 300 ]]; then
die "GET $path → HTTP $http_code: $body"
fi
echo "$body"
}

api_post() {
local path="$1"
local payload="$2"
local out http_code body existing_id
out=$(curl -s -w "\n%{http_code}" -X POST "$BASE_URL$path" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-d "$payload")
http_code=$(echo "$out" | tail -n1)
body=$(echo "$out" | sed '$d')
if [[ "$http_code" == "409" ]]; then
# Resource already exists; the conflict response carries the existing id
# under .data.id — reuse it so the script is idempotent across runs.
existing_id=$(echo "$body" | jq -r '.data.id // empty')
if [[ -n "$existing_id" ]]; then
info "Already exists (HTTP 409) — reusing id: $existing_id"
echo "{\"id\":\"$existing_id\"}"
return
fi
die "POST $path → HTTP 409 (conflict) but no id in response: $body"
fi
if [[ "$http_code" -lt 200 || "$http_code" -ge 300 ]]; then
die "POST $path → HTTP $http_code: $body"
fi
echo "$body"
}

# Run a SQL statement inside the in-cluster postgres pod.
db_exec() {
local sql="$1"
kubectl exec -n "$PG_NAMESPACE" "$PG_STATEFULSET" -- \
psql -U "$PG_USER" -d "$PG_DB" -tAc "$sql"
}

# ---------------------------------------------------------------------------
# 1. Authenticate
# ---------------------------------------------------------------------------

step "Authenticating via Keycloak"

TOKEN=$(curl -sf -X POST \
"$KEYCLOAK_URL/realms/carbide-dev/protocol/openid-connect/token" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "client_id=carbide-api" \
-d "client_secret=carbide-local-secret" \
-d "grant_type=password" \
-d "username=admin@example.com" \
-d "password=adminpassword" \
| jq -r '.access_token')

Copilot AI Mar 6, 2026


The header says defaults can be overridden via environment variables, but the Keycloak username/password/client_secret are hard-coded below. If different local dev credentials are needed, users have to edit the script. Consider adding env overrides (e.g., KEYCLOAK_USER, KEYCLOAK_PASSWORD, KEYCLOAK_CLIENT_SECRET) and listing them in the "Override defaults" comment block.
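One possible shape for such overrides, following the `${VAR:-default}` pattern the script already uses for API_URL and friends. The variable names (KEYCLOAK_USER, KEYCLOAK_PASSWORD, KEYCLOAK_CLIENT_SECRET) are suggestions, not variables the script defines today; the defaults mirror the literals currently hard-coded in the curl call:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical env overrides — names are suggestions from the review comment.
KEYCLOAK_USER="${KEYCLOAK_USER:-admin@example.com}"
KEYCLOAK_PASSWORD="${KEYCLOAK_PASSWORD:-adminpassword}"
KEYCLOAK_CLIENT_SECRET="${KEYCLOAK_CLIENT_SECRET:-carbide-local-secret}"

# The token request would then reference the variables instead of literals:
#   -d "client_secret=$KEYCLOAK_CLIENT_SECRET" \
#   -d "username=$KEYCLOAK_USER" \
#   -d "password=$KEYCLOAK_PASSWORD"
echo "authenticating as $KEYCLOAK_USER"
```

These names would also need to be added to the "Override defaults via environment variables" comment block in the script header.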

@pbreton pbreton force-pushed the chore/local-carbide branch from d12e81a to 18e8f6c on March 9, 2026 18:54
@thossain-nv
Contributor

Should we update README to reflect how to use local core?

@pbreton pbreton force-pushed the chore/local-carbide branch from 18e8f6c to 6ed4b25 on March 17, 2026 20:56
@coderabbitai
Contributor

coderabbitai Bot commented Mar 17, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Add optional local Carbide Core support to Makefile and deploy flows (kustomize & helm), enabling site-agent to connect to an external/local core with cert handling and ConfigMap overrides; add an end-to-end smoke test script that authenticates via Keycloak and exercises tenant, site, IP block, allocation, and VPC API flows.

Changes

  • Makefile & deploy targets — Makefile (targets: docker-build-local, kind-load, kind-redeploy, kind-reset-infra, kind-reset-kustomize, kind-reset-helm, helm-deploy-site-agent): Add LOCAL_CORE, LOCAL_CORE_HOST, LOCAL_CORE_PORT, and LOCAL_CORE_CERTS_DIR; conditionally skip building/loading/restarting carbide-rest-mock-core when LOCAL_CORE=true; update rollout/restart logic and completion logging to indicate local vs mock core.
  • Kustomize / Helm site-agent flow — kind-reset-kustomize, kind-reset-infra, helm-deploy-site-agent, kind-reset-helm: When LOCAL_CORE=true, create/update the core-grpc-client-site-agent-certs secret from local cert files (failing on missing files), resolve LOCAL_CORE_HOST to an IPv4 address via the kind control-plane, patch carbide-rest-site-agent-config with CARBIDE_ADDRESS=<resolved-ip>:<PORT>, CARBIDE_SEC_OPT=1, and SKIP_GRPC_SERVER_AUTH=true, disable Helm chart certificate bootstrapping, restart the site-agent statefulset, and wait for rollout.
  • Local core smoke test script — scripts/test-local-core.sh: Add a new strict-mode bash E2E smoke test: obtain a Keycloak token, resolve tenant and site (forcing the site to Registered via an in-cluster Postgres UPDATE if needed), use an idempotent POST helper (treating 409 as reuse), create an IP block, allocation, and VPC, and print a summary of returned IDs and VPC status.

Sequence Diagram(s)

sequenceDiagram
    participant Tester as Tester (scripts/test-local-core.sh)
    participant Keycloak as Keycloak
    participant API as Carbide API
    participant DB as PostgreSQL
    participant K8s as Kubernetes

    Tester->>Keycloak: POST /auth (credentials) -> token
    Keycloak-->>Tester: bearer token

    Tester->>API: GET /tenant/current (Authorization)
    API->>DB: SELECT tenant
    DB-->>API: tenant data
    API-->>Tester: tenant details

    Tester->>API: GET /site?name=SITE_NAME
    API->>DB: SELECT site
    DB-->>API: site id & status
    API-->>Tester: site info

    alt site.status != "Registered"
        Tester->>K8s: kubectl exec postgres -> UPDATE site SET status='Registered'
        K8s->>DB: execute SQL
        DB-->>K8s: rows affected
        Tester->>API: GET /site/:id (re-check)
        API->>DB: SELECT updated site
        DB-->>API: updated site
        API-->>Tester: confirmed Registered
    end

    Tester->>API: POST /ipblock (body)
    API->>DB: INSERT ipblock
    DB-->>API: ipblock id
    API-->>Tester: ipblock created

    Tester->>API: POST /allocation (body referencing ipblock, site, tenant)
    API->>DB: INSERT allocation
    DB-->>API: allocation id
    API-->>Tester: allocation created

    Tester->>API: POST /vpc (body)
    API->>DB: INSERT vpc
    DB-->>API: vpc id & status
    API-->>Tester: vpc created

    Tester-->>Tester: print summary of tenant/site/ipblock/allocation/vpc ids & vpc status
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I hopped to Keycloak, fetched a token bright,
nibbled certs and patched configs late at night,
poked Postgres till the site said "Registered" clear,
spawned IPs and VPCs — the pathway's now near,
carrots for commits, a joyous debug bite! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning — Docstring coverage is 28.57%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Title check ✅ — The title accurately describes the main change: enabling use of local carbide-api instead of mock-core, which is the primary objective across the Makefile and script modifications.
  • Description check ✅ — The description is directly related to the changeset, outlining the two main changes: updating the local stack to connect to local carbide-core and adding a test script for validation.



Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
Makefile (1)

510-534: Extract the LOCAL_CORE site-agent bootstrap into one shared target.

These two recipe blocks are effectively identical, including the hard-coded carbide-rest-local-control-plane lookup. Duplicating the cert checks, host resolution, ConfigMap patch, and rollout logic makes the Helm and Kustomize flows easy to drift apart. A single phony target/define block, using $(KIND_CLUSTER_NAME) for the container name, would keep this maintainable.

Also applies to: 611-634

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 510 - 534, Extract the duplicated LOCAL_CORE bootstrap
logic into a single reusable Makefile target (or a define block) and invoke it
from both places instead of duplicating the recipe; move the cert existence
checks (checking $(LOCAL_CORE_CERTS_DIR)/ca.crt, client.crt, client.key), the
kubectl secret creation for core-grpc-client-site-agent-certs, the CORE_HOST
resolution (currently using the hard-coded container name
carbide-rest-local-control-plane) and the ConfigMap patch/rollout steps into
that shared target and parameterize the container name using
$(KIND_CLUSTER_NAME) (or accept it as a target argument), then replace the
original blocks (the one shown using LOCAL_CORE and the similar block at lines
~611-634) with calls to the new phony target so both Helm and Kustomize flows
use the exact same logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Makefile`:
- Around line 284-286: The kind-apply target fails because KUSTOMIZE_OVERLAY
points to a non-existent composite overlay (deploy/kustomize/overlays/local)
after refactor; create that composite overlay or change kind-apply to apply the
per-component overlays instead. Update the Makefile so kind-apply (and
kind-deploy) either: 1) references a new deploy/kustomize/overlays/local
kustomization that uses resources: site-manager, db, workflow, api, mock-core,
site-agent, or 2) iterates over the same per-component overlays used by
kind-reset-kustomize (site-manager, db, workflow, api, mock-core, site-agent)
and runs kubectl apply -k for each. Also ensure the mock-core inclusion is gated
by LOCAL_CORE exactly like the build/load and kind-reset-kustomize flows (use
the LOCAL_CORE conditional around including or applying mock-core) so the image
build/load and apply steps remain consistent.

In `@scripts/test-local-core.sh`:
- Around line 88-97: The current api_post() handling that treats HTTP 409 as
success makes the smoke test idempotent but allows reruns to skip exercising the
create path; fix by either (A) making resource names unique per run (e.g.,
append a timestamp or UUID when building names for the three POST calls) so POST
always attempts a real create, or (B) change api_post() so a 409 is reported as
"skipped/not-validated" instead of success: detect 409 in the function that
currently extracts existing_id and return a distinct non-success status (or
print a skip message and return a non-zero or special code) so the caller can
treat it as skipped and not declare the smoke test passed; update the logic that
currently prints "Smoke test passed" to fail the run or mark it skipped when
api_post() reports a 409. Ensure you update both the /vpc POST path and the
other POST calls referenced in the same script block (the other POST ranges
noted).
- Around line 173-180: The pipeline that assigns UPDATE_TAG is brittle under set
-euo pipefail because grep returning non-zero will abort; instead capture the
full kubectl exec output into a temporary variable (e.g. UPDATE_OUT) by running
the psql command and redirecting 2>&1, then run the grep on that captured output
(or run grep with a non-fatal fallback) to populate UPDATE_TAG, and finally use
UPDATED_ROWS extraction and existing die checks; update references to UPDATE_TAG
and UPDATED_ROWS accordingly so the script doesn't exit before your explicit
error handling.

---

Nitpick comments:
In `@Makefile`:
- Around line 510-534: Extract the duplicated LOCAL_CORE bootstrap logic into a
single reusable Makefile target (or a define block) and invoke it from both
places instead of duplicating the recipe; move the cert existence checks
(checking $(LOCAL_CORE_CERTS_DIR)/ca.crt, client.crt, client.key), the kubectl
secret creation for core-grpc-client-site-agent-certs, the CORE_HOST resolution
(currently using the hard-coded container name carbide-rest-local-control-plane)
and the ConfigMap patch/rollout steps into that shared target and parameterize
the container name using $(KIND_CLUSTER_NAME) (or accept it as a target
argument), then replace the original blocks (the one shown using LOCAL_CORE and
the similar block at lines ~611-634) with calls to the new phony target so both
Helm and Kustomize flows use the exact same logic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7bc2f5a3-439c-4199-be1a-7091a86bdfe1

📥 Commits

Reviewing files that changed from the base of the PR and between bc56e32 and 6ed4b25.

📒 Files selected for processing (2)
  • Makefile
  • scripts/test-local-core.sh

Comment thread scripts/test-local-core.sh
Comment on lines +88 to +97
if [[ "$http_code" == "409" ]]; then
# Resource already exists; the conflict response carries the existing id
# under .data.id — reuse it so the script is idempotent across runs.
existing_id=$(echo "$body" | jq -r '.data.id // empty')
if [[ -n "$existing_id" ]]; then
info "Already exists (HTTP 409) — reusing id: $existing_id"
echo "{\"id\":\"$existing_id\"}"
return
fi
die "POST $path → HTTP 409 (conflict) but no id in response: $body"
Contributor


⚠️ Potential issue | 🟠 Major

409 reuse makes reruns stop testing the create path this smoke test is meant to validate.

api_post() treats a conflict as success, and the three POSTs all use fixed names/prefixes. On a rerun, the /vpc call can 409, skip the actual create, and the script still prints "Smoke test passed" even though the carbide-core path was never exercised again. Please either generate per-run resource names or surface 409 as a skipped/not-validated result instead of a pass.

Also applies to: 197-251

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/test-local-core.sh` around lines 88 - 97, The current api_post()
handling that treats HTTP 409 as success makes the smoke test idempotent but
allows reruns to skip exercising the create path; fix by either (A) making
resource names unique per run (e.g., append a timestamp or UUID when building
names for the three POST calls) so POST always attempts a real create, or (B)
change api_post() so a 409 is reported as "skipped/not-validated" instead of
success: detect 409 in the function that currently extracts existing_id and
return a distinct non-success status (or print a skip message and return a
non-zero or special code) so the caller can treat it as skipped and not declare
the smoke test passed; update the logic that currently prints "Smoke test
passed" to fail the run or mark it skipped when api_post() reports a 409. Ensure
you update both the /vpc POST path and the other POST calls referenced in the
same script block (the other POST ranges noted).

Comment on lines +173 to +180
UPDATE_TAG=$(kubectl exec -n "$PG_NAMESPACE" "$PG_STATEFULSET" -- \
psql -U "$PG_USER" -d "$PG_DB" -c \
"UPDATE site SET status = 'Registered', updated = NOW() WHERE id = '$SITE_ID';" \
2>&1 | grep -E '^UPDATE')

[[ -z "$UPDATE_TAG" ]] && die "DB update produced no UPDATE tag — check site ID and postgres connectivity"
UPDATED_ROWS=$(echo "$UPDATE_TAG" | awk '{print $2}')
[[ "$UPDATED_ROWS" -eq 0 ]] && die "DB update matched 0 rows — site id '$SITE_ID' not found in table"
Contributor


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the script and look for the set directive and the problematic code
head -20 scripts/test-local-core.sh

Repository: NVIDIA/ncx-infra-controller-rest

Length of output: 853


🏁 Script executed:

# Also look at the specific lines mentioned in the review
sed -n '170,185p' scripts/test-local-core.sh

Repository: NVIDIA/ncx-infra-controller-rest

Length of output: 888


🏁 Script executed:

# Check if set -euo pipefail is present in the script
grep -n "set -" scripts/test-local-core.sh

Repository: NVIDIA/ncx-infra-controller-rest

Length of output: 96


🏁 Script executed:

# Test grep behavior when pattern not found
bash -c "echo 'some output' | grep -E '^UPDATE'; echo \"Exit code: \$?\""

Repository: NVIDIA/ncx-infra-controller-rest

Length of output: 88


🏁 Script executed:

# Test the full scenario with pipefail
bash -c "set -euo pipefail; UPDATE_TAG=\$(echo 'some output' | grep -E '^UPDATE'); echo \"UPDATE_TAG=\$UPDATE_TAG\""

Repository: NVIDIA/ncx-infra-controller-rest

Length of output: 58


🏁 Script executed:

# Test what happens with a failing grep in pipefail mode
bash -c "set -euo pipefail; UPDATE_TAG=\$(echo 'some output' | grep -E '^UPDATE' || echo 'not found'); echo \"UPDATE_TAG=\$UPDATE_TAG\""

Repository: NVIDIA/ncx-infra-controller-rest

Length of output: 96


The DB-update error handling is unreachable under set -euo pipefail.

The script has set -euo pipefail enabled at line 39. When grep -E '^UPDATE' finds no match, it returns exit code 1. With pipefail active, this causes the entire pipeline assignment to fail, and the script exits immediately before reaching the error check on line 178. Capture the kubectl exec output first and make the grep non-fatal before checking UPDATE_TAG.

Suggested fix
-    UPDATE_TAG=$(kubectl exec -n "$PG_NAMESPACE" "$PG_STATEFULSET" -- \
-        psql -U "$PG_USER" -d "$PG_DB" -c \
-        "UPDATE site SET status = 'Registered', updated = NOW() WHERE id = '$SITE_ID';" \
-        2>&1 | grep -E '^UPDATE')
-
-    [[ -z "$UPDATE_TAG" ]] && die "DB update produced no UPDATE tag — check site ID and postgres connectivity"
+    UPDATE_OUTPUT=$(kubectl exec -n "$PG_NAMESPACE" "$PG_STATEFULSET" -- \
+        psql -U "$PG_USER" -d "$PG_DB" -c \
+        "UPDATE site SET status = 'Registered', updated = NOW() WHERE id = '$SITE_ID';" \
+        2>&1) || die "DB update failed — check site ID and postgres connectivity"
+    UPDATE_TAG=$(printf '%s\n' "$UPDATE_OUTPUT" | grep -E '^UPDATE' || true)
+
+    [[ -z "$UPDATE_TAG" ]] && die "DB update produced no UPDATE tag — check site ID and postgres connectivity"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/test-local-core.sh` around lines 173 - 180, The pipeline that assigns
UPDATE_TAG is brittle under set -euo pipefail because grep returning non-zero
will abort; instead capture the full kubectl exec output into a temporary
variable (e.g. UPDATE_OUT) by running the psql command and redirecting 2>&1,
then run the grep on that captured output (or run grep with a non-fatal
fallback) to populate UPDATE_TAG, and finally use UPDATED_ROWS extraction and
existing die checks; update references to UPDATE_TAG and UPDATED_ROWS
accordingly so the script doesn't exit before your explicit error handling.

@pbreton pbreton force-pushed the chore/local-carbide branch from 6ed4b25 to 411c2dc on March 19, 2026 21:00
@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
Makefile (1)

510-534: Extract duplicated LOCAL_CORE setup logic into a reusable target.

The LOCAL_CORE secret creation, IPv4 resolution, ConfigMap patching, and restart logic is duplicated verbatim (~25 lines) between kind-reset-kustomize and helm-deploy-site-agent. This increases maintenance burden and risk of divergence.

♻️ Suggested approach: Extract to a shared target

Create a new target that both can invoke:

# Configure site-agent to connect to local carbide-api (called when LOCAL_CORE=true)
_setup-local-core-site-agent:
	@echo "Creating core-grpc-client-site-agent-certs secret from $(LOCAL_CORE_CERTS_DIR)..."
	@for f in ca.crt client.crt client.key; do \
		[ -f "$(LOCAL_CORE_CERTS_DIR)/$$f" ] || { echo "ERROR: $$f not found in $(LOCAL_CORE_CERTS_DIR). Run gen-certs.sh in bare-metal-manager-core first."; exit 1; } ; \
	done
	kubectl -n carbide-rest create secret generic core-grpc-client-site-agent-certs \
		--from-file=ca.crt="$(LOCAL_CORE_CERTS_DIR)/ca.crt" \
		--from-file=tls.crt="$(LOCAL_CORE_CERTS_DIR)/client.crt" \
		--from-file=tls.key="$(LOCAL_CORE_CERTS_DIR)/client.key" \
		--dry-run=client -o yaml | kubectl apply -f -
	@CORE_HOST="$(LOCAL_CORE_HOST)" ; \
	IPV4=$$(docker exec $$(docker ps -qf name=carbide-rest-local-control-plane) \
		getent ahosts "$$CORE_HOST" 2>/dev/null \
		| awk '/STREAM/{print $$1}' | grep -v ':' | head -1) ; \
	if [ -n "$$IPV4" ]; then \
		echo "Resolved $$CORE_HOST → $$IPV4 (using IPv4 to avoid Go IPv6-first dialer issue)" ; \
		CORE_HOST="$$IPV4" ; \
	fi ; \
	echo "Patching site-agent ConfigMap: CARBIDE_ADDRESS=$$CORE_HOST:$(LOCAL_CORE_PORT), CARBIDE_SEC_OPT=1 (ServerTLS)..." ; \
	kubectl -n carbide-rest patch configmap carbide-rest-site-agent-config --type merge \
		-p "{\"data\":{\"CARBIDE_ADDRESS\":\"$$CORE_HOST:$(LOCAL_CORE_PORT)\",\"CARBIDE_SEC_OPT\":\"1\"}}"
	@echo "Restarting site-agent to pick up new CARBIDE_ADDRESS and certs..."
	kubectl -n carbide-rest rollout restart statefulset/carbide-rest-site-agent
	kubectl -n carbide-rest rollout status statefulset/carbide-rest-site-agent --timeout=120s

Then in both targets:

	@if [ "$(LOCAL_CORE)" = "true" ]; then \
		$(MAKE) _setup-local-core-site-agent ; \
	fi

Also applies to: 611-634

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 510 - 534, Duplicate LOCAL_CORE setup logic (secret
creation, IPv4 resolution, ConfigMap patch, and restart) should be extracted
into a single reusable make target (e.g., _setup-local-core-site-agent) and
invoked from both kind-reset-kustomize and helm-deploy-site-agent when
LOCAL_CORE=true; create the new target containing the existing sequence (keeping
symbols like LOCAL_CORE_CERTS_DIR, LOCAL_CORE_HOST, LOCAL_CORE_PORT, and kubectl
commands intact), then replace the duplicated inline if-blocks in
kind-reset-kustomize and helm-deploy-site-agent with a conditional that calls
$(MAKE) _setup-local-core-site-agent so both targets share the same
implementation.
scripts/test-local-core.sh (1)

106-110: Remove unused db_exec helper function.

The db_exec function is defined but never called—the DB update at lines 173-176 uses kubectl exec directly with different flags (-c instead of -tAc). Either use this helper or remove it.

♻️ Option 1: Remove unused function
-# Run a SQL statement inside the in-cluster postgres pod.
-db_exec() {
-    local sql="$1"
-    kubectl exec -n "$PG_NAMESPACE" "$PG_STATEFULSET" -- \
-        psql -U "$PG_USER" -d "$PG_DB" -tAc "$sql"
-}
-
♻️ Option 2: Use the helper (with adjusted flags)
 # Run a SQL statement inside the in-cluster postgres pod.
 db_exec() {
     local sql="$1"
     kubectl exec -n "$PG_NAMESPACE" "$PG_STATEFULSET" -- \
-        psql -U "$PG_USER" -d "$PG_DB" -tAc "$sql"
+        psql -U "$PG_USER" -d "$PG_DB" -c "$sql" 2>&1
 }

Then use it at line 173:

-    UPDATE_TAG=$(kubectl exec -n "$PG_NAMESPACE" "$PG_STATEFULSET" -- \
-        psql -U "$PG_USER" -d "$PG_DB" -c \
-        "UPDATE site SET status = 'Registered', updated = NOW() WHERE id = '$SITE_ID';" \
-        2>&1 | grep -E '^UPDATE')
+    UPDATE_OUTPUT=$(db_exec "UPDATE site SET status = 'Registered', updated = NOW() WHERE id = '$SITE_ID';") \
+        || die "DB update failed — check site ID and postgres connectivity"
+    UPDATE_TAG=$(printf '%s\n' "$UPDATE_OUTPUT" | grep -E '^UPDATE' || true)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/test-local-core.sh` around lines 106 - 110, The db_exec helper
(function db_exec) is defined but unused—remove the entire db_exec function
definition to avoid dead code; alternatively, if you prefer to keep it, update
its kubectl flags to accept/include the container flag (e.g., add -c
"$PG_CONTAINER" or a parameter for container) and replace the direct kubectl
exec invocation currently used for the DB update with a call to db_exec so the
helper is actually used.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9d5da212-9fc6-46a7-9bdc-3022e96f8784

📥 Commits

Reviewing files that changed from the base of the PR and between 6ed4b25 and 411c2dc.

📒 Files selected for processing (2)
  • Makefile
  • scripts/test-local-core.sh

@pbreton pbreton force-pushed the chore/local-carbide branch from 411c2dc to e25cd32 on March 25, 2026 01:15
@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
Makefile (1)

316-317: ⚠️ Potential issue | 🔴 Critical

deploy/kustomize/overlays/local does not exist and must be created or the kind-apply target will fail.

The Makefile defines KUSTOMIZE_OVERLAY := deploy/kustomize/overlays/local (line 258), but this directory is missing. The repository uses per-component overlays (api, cert-manager, db, mock-core, site-agent, site-manager, workflow) instead of a single local overlay. Either create a deploy/kustomize/overlays/local/kustomization.yaml that references all components, or update the kind-apply target to deploy each overlay individually using the pattern shown in the deployment documentation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 316 - 317, The kind-apply Makefile target references
KUSTOMIZE_OVERLAY (set to deploy/kustomize/overlays/local) which doesn't exist;
either create deploy/kustomize/overlays/local/kustomization.yaml that aggregates
the per-component overlays (api, cert-manager, db, mock-core, site-agent,
site-manager, workflow) or change the kind-apply target to iterate/apply each
overlay individually (e.g., kubectl apply -k
deploy/kustomize/overlays/<component>) so the Makefile uses the existing
per-component overlays instead of the missing local overlay.
🧹 Nitpick comments (2)
Makefile (1)

611-634: Consider extracting duplicated LOCAL_CORE setup logic into a shared script.

The secret creation, IPv4 resolution, ConfigMap patching, and site-agent restart logic at lines 611-634 is nearly identical to lines 510-534. Extracting this into a shared script (e.g., scripts/setup-local-core.sh) would reduce maintenance burden and ensure consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 611 - 634, Duplicate LOCAL_CORE setup logic (secret
creation, IPv4 resolution, ConfigMap patch, and site-agent restart) should be
extracted into a reusable script; create a script (e.g.,
scripts/setup-local-core.sh) that accepts LOCAL_CORE_CERTS_DIR, LOCAL_CORE_HOST,
LOCAL_CORE_PORT and performs the checks for ca.crt/client.crt/client.key,
kubectl secret creation for core-grpc-client-site-agent-certs, the IPv4
resolution logic that sets CORE_HOST/IPV4, kubectl patch of configmap
carbide-rest-site-agent-config (CARBIDE_ADDRESS/CARBIDE_SEC_OPT), and rollout
restart of statefulset/carbide-rest-site-agent, then replace both Makefile
blocks (the sections guarded by if [ "$(LOCAL_CORE)" = "true" ]) with calls to
that script forwarding the same variables.
scripts/test-local-core.sh (1)

17-37: Consider adding README documentation for LOCAL_CORE usage.

Per the PR comment from thossain-nv, it would be helpful to document how to use LOCAL_CORE=true in the README, including:

  • Prerequisites (running carbide-api locally, generating certs)
  • Example command: make kind-reset LOCAL_CORE=true
  • How to run this smoke test: ./scripts/test-local-core.sh
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/test-local-core.sh` around lines 17 - 37, Add a short README section
documenting LOCAL_CORE usage: describe prerequisites (running carbide-api
locally, generating TLS certs), list relevant environment overrides (API_URL,
KEYCLOAK_URL, ORG, SITE_NAME, PG_NAMESPACE, PG_STATEFULSET, PG_USER, PG_DB),
show the example command to reset kind with LOCAL_CORE enabled (make kind-reset
LOCAL_CORE=true) and show how to run the integration smoke test
(./scripts/test-local-core.sh); place this under the repo README or CONTRIBUTING
and reference the scripts/test-local-core.sh script and the LOCAL_CORE env var
so users can follow the steps end-to-end.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a276b671-e0a4-4af5-bf3f-3e7150217b99

📥 Commits

Reviewing files that changed from the base of the PR and between 411c2dc and e25cd32.

📒 Files selected for processing (2)
  • Makefile
  • scripts/test-local-core.sh

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
Makefile (2)

499-503: Redundant mock-core apply.

This block duplicates the mock-core setup already performed in kind-reset-infra (lines 451-455). Since kind-reset-kustomize depends on kind-reset-infra, this apply is redundant (though harmless due to idempotency).

♻️ Suggested removal
-	@if [ "$(LOCAL_CORE)" != "true" ]; then \
-		echo "Setting up Carbide Mock Core..." ; \
-		kubectl apply -k deploy/kustomize/overlays/mock-core ; \
-		kubectl -n carbide-rest rollout status deployment/carbide-rest-mock-core --timeout=240s ; \
-	fi
-
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 499 - 503, Remove the redundant mock-core setup in the
conditional that checks LOCAL_CORE by deleting the kubectl apply and rollout
status commands (the "kubectl apply -k deploy/kustomize/overlays/mock-core" and
"kubectl -n carbide-rest rollout status deployment/carbide-rest-mock-core
--timeout=240s" lines) inside the `@if` [ "$(LOCAL_CORE)" != "true" ]; then ... fi
block in the Makefile; this setup is already performed by the kind-reset-infra
target, so keep the LOCAL_CORE guard but remove the duplicated apply/rollout
commands to avoid redundancy.

512-536: Extract duplicated local-core setup into a reusable target or script.

This shell block is duplicated nearly verbatim in helm-deploy-site-agent (lines 615-638). Consider extracting it into a dedicated make target (e.g., setup-local-core-certs) or a helper script to improve maintainability.

Additionally, line 523 hardcodes the control-plane node name. Use the KIND_CLUSTER_NAME variable for consistency:

♻️ Use variable for cluster name
 		CORE_HOST="$(LOCAL_CORE_HOST)" ; \
-		IPV4=$$(docker exec $$(docker ps -qf name=carbide-rest-local-control-plane) \
+		IPV4=$$(docker exec $$(docker ps -qf name=$(KIND_CLUSTER_NAME)-control-plane) \
 			getent ahosts "$$CORE_HOST" 2>/dev/null \
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 512 - 536, The duplicated LOCAL_CORE setup block
(guarded by LOCAL_CORE and using LOCAL_CORE_CERTS_DIR, LOCAL_CORE_HOST,
LOCAL_CORE_PORT) should be extracted into a reusable Makefile target or helper
script (e.g., setup-local-core-certs) and both places (current block and
helm-deploy-site-agent) should call that target to remove duplication; implement
the new target to perform the file checks, kubectl secret creation, IPv4
resolution, ConfigMap patching, and rollout restart/status, and replace the
hardcoded docker filter name "carbide-rest-local-control-plane" with a reference
using KIND_CLUSTER_NAME (e.g., "$$(docker ps -qf
name=$${KIND_CLUSTER_NAME}-control-plane)") so the cluster name variable is used
consistently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e004a818-fb00-44e2-838c-a9eee45cc9b9

📥 Commits

Reviewing files that changed from the base of the PR and between e25cd32 and b8163c1.

📒 Files selected for processing (1)
  • Makefile

@pbreton pbreton force-pushed the chore/local-carbide branch from b8163c1 to baa3c8c on March 25, 2026 17:29
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (3)
scripts/test-local-core.sh (2)

172-180: ⚠️ Potential issue | 🟡 Minor

Make the grep non-fatal before the explicit DB-update checks.

With set -euo pipefail on Line 39, a missing ^UPDATE match exits at the assignment on Line 173, so the die checks on Lines 178-180 never run. Capture the full kubectl exec output first, then parse it with grep ... || true.

Suggested fix
-    UPDATE_TAG=$(kubectl exec -n "$PG_NAMESPACE" "$PG_STATEFULSET" -- \
-        psql -U "$PG_USER" -d "$PG_DB" -c \
-        "UPDATE site SET status = 'Registered', updated = NOW() WHERE id = '$SITE_ID';" \
-        2>&1 | grep -E '^UPDATE')
+    UPDATE_OUTPUT=$(kubectl exec -n "$PG_NAMESPACE" "$PG_STATEFULSET" -- \
+        psql -U "$PG_USER" -d "$PG_DB" -c \
+        "UPDATE site SET status = 'Registered', updated = NOW() WHERE id = '$SITE_ID';" \
+        2>&1) || die "DB update failed — check site ID and postgres connectivity"
+    UPDATE_TAG=$(printf '%s\n' "$UPDATE_OUTPUT" | grep -E '^UPDATE' || true)
#!/bin/bash
set -euo pipefail

sed -n '39p;172,180p' scripts/test-local-core.sh
echo
bash -o errexit -o nounset -o pipefail -c \
  'UPDATE_TAG=$(printf "no update tag here\n" | grep -E "^UPDATE"); echo "unreachable"'

Expected: the second command never prints unreachable, confirming the current assignment exits before the explicit error handling.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/test-local-core.sh` around lines 172 - 180, The assignment that pipes
kubectl exec through grep is currently fatal under set -euo pipefail; instead
capture the full kubectl exec output into a variable (e.g., UPDATE_OUTPUT by
running the existing kubectl exec ... 2>&1 and storing its output), then parse
that output with grep into UPDATE_TAG using grep -E '^UPDATE' || true so the
grep failure is non-fatal; keep the existing die checks that test [[ -z
"$UPDATE_TAG" ]] and parse UPDATED_ROWS from UPDATE_TAG as before. Ensure you
reference the same kubectl exec invocation, UPDATE_TAG and UPDATED_ROWS symbols
when making the change.

88-97: ⚠️ Potential issue | 🟠 Major

Don't count HTTP 409 as a validated create-path run.

Line 88 turns conflicts into success, and the POST payloads below reuse fixed resource values. On reruns this can still end in "Smoke test passed" even though the create path into carbide-core was never exercised again. Either make 409 a skipped/not-validated outcome, or generate per-run resource values that cannot collide.

Also applies to: 197-206, 219-233, 246-251, 265-273

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/test-local-core.sh` around lines 88 - 97, The current HTTP 409
handling in the if block that checks http_code and sets existing_id (the branch
that calls info and echoes the reused id) treats conflicts as successful
validated creates; change this so 409 is not counted as a validated create:
either (A) mark the operation as skipped/failure by replacing the success
echo/info path with a skip log and a non-success return (or call die) so the
smoke test does not consider the create exercised, or (B) make the POST payloads
unique per run by appending a per-run random/timestamp/UUID suffix to the
resource identifiers used where payloads are generated so collisions cannot
occur; apply the same change to the other conflict-handling instances (the
similar blocks at the other ranges referenced) so 409 no longer masquerades as a
validated create.
Makefile (1)

284-286: ⚠️ Potential issue | 🟠 Major

Keep kind-deploy consistent with LOCAL_CORE=true.

These lines skip building the mock-core image, but kind-deploy still ends in kind-apply, which unconditionally applies $(KUSTOMIZE_OVERLAY) on Line 317. If that overlay still includes mock-core—or is still the missing composite overlay—make kind-deploy LOCAL_CORE=true no longer matches the reset flows and can fail or deploy stale resources.

#!/bin/bash
set -euo pipefail

sed -n '279,317p' Makefile
echo

if [ -f deploy/kustomize/overlays/local/kustomization.yaml ]; then
  echo "--- deploy/kustomize/overlays/local/kustomization.yaml ---"
  sed -n '1,200p' deploy/kustomize/overlays/local/kustomization.yaml
elif [ -f deploy/kustomize/overlays/local/kustomization.yml ]; then
  echo "--- deploy/kustomize/overlays/local/kustomization.yml ---"
  sed -n '1,200p' deploy/kustomize/overlays/local/kustomization.yml
else
  echo "deploy/kustomize/overlays/local is missing"
fi

Expected: either the overlay is missing, or its resources show whether mock-core is still applied by kind-apply.

Also applies to: 300-317

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 284 - 286, The kind-deploy flow is inconsistent when
LOCAL_CORE=true: the Docker build for carbide-rest-mock-core is skipped (see the
conditional around the docker build) but kind-apply still unconditionally
applies $(KUSTOMIZE_OVERLAY), which may include the mock-core resources; update
the Makefile so kind-deploy/kind-apply respects LOCAL_CORE by conditionally
excluding or substituting the overlay that contains mock-core (or by adding a
conditional around the apply step), e.g., gate the application of
$(KUSTOMIZE_OVERLAY) or switch to a different overlay when LOCAL_CORE=true so
that mock-core resources are not applied when the image isn't built. Ensure
changes reference the existing variables/targets: LOCAL_CORE, KUSTOMIZE_OVERLAY,
and the kind-apply / kind-deploy targets.
🧹 Nitpick comments (1)
Makefile (1)

512-536: Extract the LOCAL_CORE site-agent bootstrap into one reusable recipe.

This secret creation, host resolution, ConfigMap patching, and rollout logic is duplicated in both deployment paths. Any later fix here now has to land twice, which makes drift likely.

Also applies to: 615-638

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 512 - 536, Extract the duplicated LOCAL_CORE bootstrap
block into a single reusable Makefile recipe (e.g., a phony target like
local-core-site-agent-bootstrap) and replace both inline copies with invocations
of that recipe; move the logic that checks LOCAL_CORE, validates files in
$(LOCAL_CORE_CERTS_DIR), creates the core-grpc-client-site-agent-certs secret,
resolves CORE_HOST/IPV4, patches the carbide-rest-site-agent-config ConfigMap
and performs kubectl rollout restart/status into the new recipe, parameterize it
via the existing variables (LOCAL_CORE, LOCAL_CORE_CERTS_DIR, LOCAL_CORE_HOST,
LOCAL_CORE_PORT) and ensure the original callers (the blocks at the current
location and the duplicate at lines 615-638) simply call the new target to avoid
code drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Makefile`:
- Around line 607-612: In the Makefile target helm-deploy-site-agent, remove the
trailing "|| true" from the helm upgrade --install carbide-rest-site-agent ...
--timeout 1m command so that template rendering, validation, and hook errors are
not suppressed; optionally consider adding --wait and increasing the --timeout
value if you want Helm to wait for resources to become ready, but do not swallow
failures by appending "|| true".

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a154a0b2-eb5d-4cf5-9010-0bd9556cd8f6

📥 Commits

Reviewing files that changed from the base of the PR and between b8163c1 and baa3c8c.

📒 Files selected for processing (2)
  • Makefile
  • scripts/test-local-core.sh

Comment thread Makefile
Comment on lines +523 to +529
IPV4=$$(docker exec $$(docker ps -qf name=carbide-rest-local-control-plane) \
	getent ahosts "$$CORE_HOST" 2>/dev/null \
	| awk '/STREAM/{print $$1}' | grep -v ':' | head -1) ; \
if [ -n "$$IPV4" ]; then \
	echo "Resolved $$CORE_HOST → $$IPV4 (using IPv4 to avoid Go IPv6-first dialer issue)" ; \
	CORE_HOST="$$IPV4" ; \
fi ; \

⚠️ Potential issue | 🟡 Minor

Derive the control-plane container from KIND_CLUSTER_NAME.

Both LOCAL_CORE flows hardcode carbide-rest-local-control-plane instead of using the existing $(KIND_CLUSTER_NAME) variable. make KIND_CLUSTER_NAME=... ... LOCAL_CORE=true will resolve the wrong node and silently skip the IPv4 fallback.

Suggested fix
-		IPV4=$$(docker exec $$(docker ps -qf name=carbide-rest-local-control-plane) \
+		IPV4=$$(docker exec $$(docker ps -qf name=$(KIND_CLUSTER_NAME)-control-plane) \
 			getent ahosts "$$CORE_HOST" 2>/dev/null \
 			| awk '/STREAM/{print $$1}' | grep -v ':' | head -1) ; \

Also applies to: 626-632

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 523 - 529, The Makefile logic that computes IPV4 is
hardcoded to the container name "carbide-rest-local-control-plane" instead of
deriving it from the existing KIND_CLUSTER_NAME, so running with make
KIND_CLUSTER_NAME=... LOCAL_CORE=true can target the wrong node; update the
docker exec invocation and any other places that reference
"carbide-rest-local-control-plane" (also at the later block around the same
logic) to compute the container name from $(KIND_CLUSTER_NAME) (e.g., use the
same templated container name construction you use elsewhere) so CORE_HOST
resolution uses the correct container, and ensure the variable CORE_HOST
assignment and echo message still reflect the substituted value.
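The suggested derivation is a one-liner in shell; sketched here with the default cluster name implied by the hardcoded container name (an assumption, not verified against the Makefile):

```shell
# Build the kind node container name from the cluster name rather than
# hardcoding it; "carbide-rest-local" is assumed to be the default here.
KIND_CLUSTER_NAME="${KIND_CLUSTER_NAME:-carbide-rest-local}"
CONTROL_PLANE_CONTAINER="${KIND_CLUSTER_NAME}-control-plane"
echo "resolving CORE_HOST inside ${CONTROL_PLANE_CONTAINER}"
```

With this in place, `make KIND_CLUSTER_NAME=foo LOCAL_CORE=true ...` targets `foo-control-plane` instead of silently falling back to the wrong node.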

Comment thread Makefile
Comment on lines 607 to +612
```diff
 helm-deploy-site-agent:
 	@echo "Installing site-agent chart (will CrashLoop until bootstrapped)..."
 	helm upgrade --install carbide-rest-site-agent $(SITE_AGENT_CHART)/ \
-		--namespace carbide-rest $(HELM_SET) --timeout 1m || true
+		--namespace carbide-rest $(HELM_SET) \
+		$(if $(filter true,$(LOCAL_CORE)),--set certificate.enabled=false) \
+		--timeout 1m || true
```

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

For the Helm CLI, does `helm upgrade --install` fail when a workload enters CrashLoopBackOff if `--wait` is not set, and does `--timeout` affect upgrade/install behavior without `--wait` or `--atomic`?

💡 Result:

No, helm upgrade --install does not fail when a workload enters CrashLoopBackOff if --wait is not set. Yes, --timeout affects upgrade/install behavior without --wait or --atomic for individual Kubernetes operations like hooks.


🌐 Web query:

helm upgrade --install failure modes errors without --wait template rendering validation

💡 Result:

`helm upgrade --install` fails without `--wait` primarily due to post-install hook failures, especially when hooks depend on webhooks or resources that aren't immediately ready. Template rendering and validation errors cause failures before hooks run. Key failure modes:

1. Post-install hook failures: without `--wait`, Helm doesn't wait for main resources (Deployments, etc.) to be ready before running post-install hooks. Hooks such as Jobs or webhooks fail if dependencies (e.g., webhook services) aren't available, causing `UPGRADE FAILED: failed post-install`. Adding `--wait` makes Helm wait for resources to be ready before running hooks.
2. Template rendering errors: invalid YAML, nil pointer errors (missing values), type mismatches, undefined functions. Use `helm template --debug` and `helm lint --strict` to validate before upgrading.
3. Kubernetes validation errors: rendered manifests fail schema validation, immutable field changes, unknown fields. Validate with `helm template | kubeval --strict`.
4. Timeout errors without `--wait`: the default 5m timeout is hit if resources are slow to become ready; the command fails even if it would eventually succeed.
5. Resource conflicts: stuck failed releases from prior attempts block upgrades.

Without `--wait`, the upgrade proceeds immediately after posting resources but is marked failed if hooks fail afterward. Official docs recommend `--wait` for production to ensure readiness. Debug with `helm upgrade --dry-run --debug` and `helm template --debug`, and check release status with `helm status`. Prevention: validate charts (`lint`, `template`, `kubeval`), use `--wait`/`--timeout=15m`, and `--atomic` for rollback.


Remove || true to surface template and validation errors.

helm upgrade --install without --wait returns after submitting resources (so CrashLoop won't cause helm to fail), but it still fails on template rendering errors, Kubernetes validation failures, and post-install hook issues. The || true suppresses these legitimate errors and allows the bootstrap to continue with a broken release.

Suggested fix
 	helm upgrade --install carbide-rest-site-agent $(SITE_AGENT_CHART)/ \
 		--namespace carbide-rest $(HELM_SET) \
 		$(if $(filter true,$(LOCAL_CORE)),--set certificate.enabled=false) \
-		--timeout 1m || true
+		--timeout 1m
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 607 - 612, In the Makefile target
helm-deploy-site-agent, remove the trailing "|| true" from the helm upgrade
--install carbide-rest-site-agent ... --timeout 1m command so that template
rendering, validation, and hook errors are not suppressed; optionally consider
adding --wait and increasing the --timeout value if you want Helm to wait for
resources to become ready, but do not swallow failures by appending "|| true".

@pbreton pbreton force-pushed the chore/local-carbide branch 2 times, most recently from 7917324 to 330f61b on April 3, 2026 16:57
@pbreton pbreton requested a review from a team as a code owner on April 3, 2026 16:57
@pbreton pbreton force-pushed the chore/local-carbide branch 2 times, most recently from ed3a541 to ba8360e on April 9, 2026 16:30
@pbreton pbreton force-pushed the chore/local-carbide branch 3 times, most recently from 9fe2bc0 to d75f362 on April 15, 2026 23:57
@pbreton pbreton force-pushed the chore/local-carbide branch 3 times, most recently from a3de0d6 to f355215 on April 21, 2026 22:31
…test script for validation

Signed-off-by: Patrice Breton <pbreton@nvidia.com>
@pbreton pbreton force-pushed the chore/local-carbide branch from f355215 to be29eb2 on April 21, 2026 22:57
3 participants