fix: prevent false removed connectors in diff by stripping augmentation fields#185
fix: prevent false removed connectors in diff by stripping augmentation fields#185JakeSCahill wants to merge 2 commits intomainfrom
Conversation
✅ Deploy Preview for docs-extensions-and-macros ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThe change adds a preprocessing step to Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tools/redpanda-connect/report-delta.js (1)
11-17: Reuse the cleaned indexes inprintDeltaReport()too.
generateConnectorDiffJson()now ignores augmentation-only connectors, butprintDeltaReport()still builds its maps from the raw indexes at Lines 427-428. That can make the console summary disagree with the JSON diff for the same input.♻️ Align the console delta path with the JSON diff path
function printDeltaReport(oldIndex, newIndex) { - const oldMap = buildComponentMap(oldIndex); - const newMap = buildComponentMap(newIndex); + const oldMap = buildComponentMap(stripAugmentationFields(oldIndex)); + const newMap = buildComponentMap(stripAugmentationFields(newIndex));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/redpanda-connect/report-delta.js` around lines 11 - 17, printDeltaReport() is still building maps from the raw indexes (so its console summary can diverge from generateConnectorDiffJson()); update printDeltaReport() to use the same cleaned indexes by calling stripAugmentationFields() (or accepting oldIndexClean/newIndexClean) and then passing those cleaned data to buildComponentMap() instead of the raw oldIndex/newIndex so the console summary aligns with the JSON diff.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@tools/redpanda-connect/report-delta.js`:
- Around line 11-17: printDeltaReport() is still building maps from the raw
indexes (so its console summary can diverge from generateConnectorDiffJson());
update printDeltaReport() to use the same cleaned indexes by calling
stripAugmentationFields() (or accepting oldIndexClean/newIndexClean) and then
passing those cleaned data to buildComponentMap() instead of the raw
oldIndex/newIndex so the console summary aligns with the JSON diff.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0675760b-9ac0-433e-9156-cebafb8c24a4
📒 Files selected for processing (1)
tools/redpanda-connect/report-delta.js
PROBLEM: The previous fix (PR #185) attempted to prevent false "removed connectors" by stripping augmentation fields before diff. However, the filter logic was flawed: - CGO connectors WITH config were kept, still appearing as "removed" - New CGO connectors couldn't be detected - Platform transitions (CGO→OSS, cloud support changes) were invisible - Binary analysis tracked platform changes separately from main diff ROOT CAUSE: Splitting tracking into two systems (OSS-only diff + separate binary analysis) lost critical information about platform transitions and made CGO/cloud connector changes hard to track. SOLUTION: Replace stripping with unified diff that includes platform metadata: 1. Remove stripAugmentationFields from diff comparison 2. buildComponentMap now preserves platform metadata: - requiresCgo - cloudSupported - cloudOnly 3. Detect platform transitions for existing connectors: - became_cgo_only / no_longer_cgo_only - added_cloud_support / removed_cloud_support - became_cloud_only / no_longer_cloud_only 4. Include platform metadata in new/removed connector reports 5. For binary analysis input only: - Create clean OSS data by filtering connectors with config/fields - Remove platform metadata before analysis - This ensures binary analyzer compares actual binaries, not metadata BENEFITS: ✅ Single source of truth for all connector changes ✅ Detects new CGO/cloud connectors correctly ✅ Shows platform transitions (e.g., tigerbeetle_cdc: CGO→OSS) ✅ No false "removed" reports ✅ Diff includes full context for documentation updates BREAKING CHANGES: - Diff JSON now includes platform metadata on connectors - New platformTransitions array in diff.details - summary.platformTransitions count added TESTING: - All existing tests pass - Manual test confirms: • New CGO connectors detected • No false removals • Platform transitions tracked correctly Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
PROBLEM: The unified diff works perfectly when binary analysis succeeds on both versions. However, when binary analysis fails (auth issues, network problems), we compare: - Old: Augmented data with CGO/cloud connectors - New: Non-augmented OSS-only data Result: False "removed" reports for CGO/cloud connectors SOLUTION: Add fallback logic that detects when binary analysis is unavailable and strips CGO/cloud connectors from old data before comparison: 1. Check if binaryAnalysis.ossVersion exists 2. If not, strip connectors marked with requiresCgo or cloudOnly from old data 3. Remove platform metadata (requiresCgo, cloudSupported, cloudOnly) from remaining 4. Compare OSS-only vs OSS-only data BEHAVIOR: - With binary analysis: Full unified diff with platform transitions ✅ - Without binary analysis: OSS-only comparison, no false removals ✅ TESTING (4.85.0 → 4.86.0): Before fix: - Binary analysis failed - 5 false "removed" connectors (tigerbeetle_cdc, zmq4, ffi, a2a_message) After fix: - Binary analysis failed - Stripped 5 CGO/cloud connectors from old data - 0 removed connectors (no false positives!) ✅ - 0 new connectors (string_split filtered as already documented) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
85745af to
de679f9
Compare
🔄 Major Update: Complete Rework of the FixThe original approach had a critical flaw in the filter logic. I've completely reworked the solution. What ChangedOriginal approach (commit
New approach (commits
Testing Results (4.85.0 → 4.86.0)Before this fix: After this fix: New Behavior
Breaking Changes
|
Problem
Connector diff generation was incorrectly reporting CGO-only and cloud-only connectors as "removed" when comparing versions.
Real-world impact: PR #406 incorrectly showed 5 connectors as removed when diffing 4.85.0 → 4.86.0.
Before Fix (Incorrect - 5 False Removals)
After Fix (Correct - 0 False Removals)
Why those 5 connectors appeared "removed":
Actual changes 4.85.0 → 4.86.0:
Root Cause
rpk connect listdoesn't include CGO-only or cloud-only connectorsrequiresCgo: truefor CGO-only connectors (tigerbeetle_cdc, zmq4, ffi)cloudOnly: truefor cloud-only connectors (a2a_message)cloudSupported: true/falsefor availability trackingrpk connect list- missing CGO/cloud-only connectorsProof that CGO connectors aren't in OSS rpk:
Solution
Added
stripAugmentationFields()function that cleans data before diff comparison:Filter logic:
configorfields(proves they came from rpk connect list)requiresCgo/cloudOnly(added by binary analysis only)Applied to: Both old and new index data before building component maps for comparison
This ensures the diff only compares connectors that actually came from
rpk connect list, preventing CGO-only and cloud-only connectors from appearing as "removed".Code Changes
New function added:
Modified diff function:
Testing Results
Tested on PR #406 (4.85.0 → 4.86.0):
Before Fix ❌
{ "summary": { "newComponents": 1, "removedComponents": 5 }, "details": { "newComponents": [ { "name": "string_split", "type": "processors", "status": "stable" } ], "removedComponents": [ { "name": "tigerbeetle_cdc", "type": "inputs" }, { "name": "zmq4", "type": "inputs" }, { "name": "zmq4", "type": "outputs" }, { "name": "ffi", "type": "processors" }, { "name": "a2a_message", "type": "processors" } ] } }After Fix ✅
{ "summary": { "newComponents": 1, "removedComponents": 0 }, "details": { "newComponents": [ { "name": "string_split", "type": "processors", "status": "stable" } ], "removedComponents": [] } }Test command:
cd /path/to/rp-connect-docs npx doc-tools generate rpcn-connector-docs \ --old-data docs-data/connect-4.85.0.json \ --data-dir docs-data \ --cloud-version 4.86.0 \ --cgo-version 4.86.0Impact
✅ Fixes:
✅ Preserves:
✅ Correct Behavior:
Related
Co-Authored-By: Claude Sonnet 4.5 noreply@anthropic.com