
t245.2: Codacy API integration — fetch, normalize, deduplicate#1000

Merged
marcusquinn merged 1 commit into main from feature/t245.2
Feb 10, 2026
Conversation

@marcusquinn
Owner

@marcusquinn marcusquinn commented Feb 10, 2026

Summary

  • Implements Codacy v3 API integration in quality-sweep-helper.sh, following the exact same pattern as the existing SonarCloud integration (t245.1)
  • Adds cross-source deduplication query to identify findings that appear in both SonarCloud and Codacy (same file+line+type)
  • All findings normalize into the shared findings SQLite table with source='codacy'

What's Added

Codacy Functions

  • load_codacy_token() -- 3-tier credential loading (env CODACY_API_TOKEN, gopass, credentials.sh)
  • codacy_api_call() -- Authenticated API wrapper for Codacy v3 (api-token header auth)
  • fetch_codacy_issues() -- Cursor-based paginated fetch with severity/type normalization
  • cmd_codacy_query() -- Query stored Codacy findings with filters (severity, file, rule, status)
  • cmd_codacy_summary() -- Breakdown by severity/type/rule/file
  • cmd_codacy_export() -- Export as JSON or CSV
  • cmd_codacy_status() -- Show config, auth, and database status
  • cmd_codacy() -- Subcommand router (fetch/query/summary/export/status/dedup/help)
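
A hedged sketch of the first two helpers, assuming the description above. The gopass secret path, credentials file location, and Codacy base URL are illustrative assumptions — the actual values are not shown in this PR:

```shell
# Illustrative sketch of the credential loading and API wrapper described
# above. The gopass path, credentials file location, and base URL are
# assumptions, not taken from the PR diff.

load_codacy_token() {
    # Tier 1: environment variable
    if [[ -n "${CODACY_API_TOKEN:-}" ]]; then
        printf '%s\n' "$CODACY_API_TOKEN"
        return 0
    fi
    # Tier 2: gopass (secret path is hypothetical)
    if command -v gopass >/dev/null 2>&1; then
        local token
        if token=$(gopass show -o codacy/api-token 2>/dev/null) && [[ -n "$token" ]]; then
            printf '%s\n' "$token"
            return 0
        fi
    fi
    # Tier 3: sourced credentials file (location is hypothetical)
    if [[ -f "${HOME}/.config/credentials.sh" ]]; then
        # shellcheck disable=SC1091
        source "${HOME}/.config/credentials.sh"
        [[ -n "${CODACY_API_TOKEN:-}" ]] && { printf '%s\n' "$CODACY_API_TOKEN"; return 0; }
    fi
    return 1
}

codacy_api_call() {
    # Codacy v3 authenticates with an "api-token" header; issue listing is a
    # POST to a .../issues/search path.
    local endpoint="$1" body="${2:-}"
    local token
    token=$(load_codacy_token) || return 1
    curl -fsS -X POST "https://app.codacy.com/api/v3${endpoint}" \
        -H "api-token: ${token}" \
        -H "Content-Type: application/json" \
        ${body:+-d "$body"}
}
```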

Cross-Source Deduplication

  • cmd_dedup() -- Query-time dedup that groups findings by file+line+type across all sources
  • Available as both quality-sweep-helper.sh dedup (top-level) and quality-sweep-helper.sh codacy dedup
  • Reports worst severity, all contributing sources, and all rule IDs for each duplicate
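
The grouping such a dedup query could use can be sketched as follows; the findings columns are inferred from the PR summary, and the exact SQL in the script may differ:

```shell
# Sketch of a cross-source dedup query. The findings table columns are
# inferred from the PR summary; the actual SQL in the script may differ.
dedup_query() {
    local db="$1"
    sqlite3 "$db" "
        SELECT file, line, type,
               COUNT(DISTINCT source)        AS n_sources,
               GROUP_CONCAT(DISTINCT source) AS sources,
               GROUP_CONCAT(DISTINCT rule)   AS rules
        FROM findings
        GROUP BY file, line, type
        HAVING COUNT(DISTINCT source) > 1
        ORDER BY file, line;"
}
```

Because the grouping happens at query time, stored rows are untouched — only the report view collapses duplicates.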

Normalization Mapping

Codacy severity: Error -> high, Warning -> medium, Info -> info

Codacy category: Security -> VULNERABILITY, ErrorProne -> BUG, CodeStyle/Performance/UnusedCode/etc. -> CODE_SMELL
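
The two mappings above can be sketched as case statements (function names are illustrative; the fallbacks for unrecognized values are assumptions):

```shell
# Sketch of the normalization tables above. Function names are illustrative
# and the defaults for unrecognized values are assumptions.
normalize_codacy_severity() {
    case "$1" in
        Error)   echo "high" ;;
        Warning) echo "medium" ;;
        *)       echo "info" ;;   # Info and anything unrecognized
    esac
}

normalize_codacy_type() {
    case "$1" in
        Security)   echo "VULNERABILITY" ;;
        ErrorProne) echo "BUG" ;;
        *)          echo "CODE_SMELL" ;;  # CodeStyle, Performance, UnusedCode, etc.
    esac
}
```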

Verification

  • bash -n syntax check: PASS
  • ShellCheck (-x -S warning): zero violations
  • help output: verified for both top-level and codacy subcommand
  • codacy status: runs correctly (shows deps, auth status, DB state)
  • dedup: runs correctly (reports no dupes when DB is empty)

Task

Closes t245.2

Summary by CodeRabbit

  • New Features
    • Codacy integration with fetch, query, summary, export, and status commands for code quality analysis.
    • Cross-source duplicate findings detection to identify matching issues across scanning tools.
    • Enhanced CLI help documentation for new Codacy and dedup capabilities.

…5.2)

Implements Codacy v3 API integration in quality-sweep-helper.sh:
- load_codacy_token(): 3-tier credential loading (env/gopass/credentials.sh)
- codacy_api_call(): authenticated API wrapper with api-token header auth
- fetch_codacy_issues(): cursor-based pagination, severity/type normalization
- cmd_codacy_query/summary/export/status: mirror SonarCloud pattern
- cmd_dedup(): cross-source deduplication (file+line+type across tools)
- Wired into main() command router, updated help text

Codacy severity mapping: Error->high, Warning->medium, Info->info
Codacy type mapping: Security->VULNERABILITY, ErrorProne->BUG, others->CODE_SMELL
Dedup is query-time only — does not modify stored findings.
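
The main() router wiring mentioned above could look roughly like this (handler names match the commit message; the sonarcloud route is assumed from the t245.1 pattern):

```shell
# Rough sketch of the main() routing described in the commit message. The
# handlers are assumed to exist elsewhere in quality-sweep-helper.sh.
main() {
    local cmd="${1:-help}"
    shift 2>/dev/null || true
    case "$cmd" in
        sonarcloud) cmd_sonarcloud "$@" ;;
        codacy)     cmd_codacy "$@" ;;
        dedup)      cmd_dedup "$@" ;;
        *)          show_help ;;
    esac
}
```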
@gemini-code-assist

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@coderabbitai
Contributor

coderabbitai bot commented Feb 10, 2026

Walkthrough

This PR adds full Codacy integration to quality-sweep-helper.sh with credential management, API communication, paginated issue fetching with SQL insertion, and subcommands for fetch, query, summary, export, and status operations. It also introduces cross-source deduplication and updates CLI routing and documentation.

Changes

Cohort / File(s) Summary
Codacy Credential & API Foundation
.agents/scripts/quality-sweep-helper.sh
Adds load_codacy_token() for multi-tier credential resolution (env/gopass/credentials.sh) and codacy_api_call() for authenticated HTTP requests with error handling.
Codacy Issue Ingestion Pipeline
.agents/scripts/quality-sweep-helper.sh
Implements fetch_codacy_issues() with cursor-based pagination, run initialization, field normalization, SQL-driven finding insertion, and run statistics updates; includes page-limiting safeguards and comprehensive error handling.
Codacy Command Handlers
.agents/scripts/quality-sweep-helper.sh
Adds cmd_codacy() with subcommand routing and implementations for cmd_codacy_query() (filtered queries), cmd_codacy_summary() (severity/type/rule/file aggregates), cmd_codacy_export() (JSON/CSV export), and cmd_codacy_status() (dependency/auth/database checks).
Cross-Source Deduplication
.agents/scripts/quality-sweep-helper.sh
Introduces cmd_dedup() to identify and manage duplicate findings across multiple sources via file+line+type matching.
CLI Integration & Documentation
.agents/scripts/quality-sweep-helper.sh
Updates show_help() to document Codacy and dedup features with sample usage; extends main() routing to handle codacy and dedup commands.

Sequence Diagram

sequenceDiagram
    participant User as CLI User
    participant Main as main()
    participant Cred as load_codacy_token()
    participant API as codacy_api_call()
    participant Fetch as fetch_codacy_issues()
    participant DB as SQL Database
    participant Codacy as Codacy API

    User->>Main: codacy fetch
    Main->>Cred: Load credentials
    Cred-->>Main: Token loaded
    Main->>Fetch: Initialize sweep run
    Fetch->>DB: Create run record
    DB-->>Fetch: Run ID
    
    loop Pagination (cursor-based)
        Fetch->>API: Request issues page
        API->>Codacy: GET /issues with cursor
        Codacy-->>API: Issues batch + next cursor
        API-->>Fetch: Issues batch
        Fetch->>Fetch: Normalize fields
        Fetch->>DB: INSERT findings
        DB-->>Fetch: Confirmation
    end
    
    Fetch->>DB: Update run statistics
    DB-->>Fetch: Complete
    Fetch-->>Main: Issues ingested
    Main-->>User: Success message

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🚀 Codacy joins the quality sweep—
API calls dance through data deep,
Pagination marks each finding's way,
While dedup keeps duplicates at bay,
A-grade automation holds the line,
Zero tech debt shines so fine! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 't245.2: Codacy API integration — fetch, normalize, deduplicate' directly and comprehensively summarizes the main changes: Codacy integration, data fetching, normalization, and deduplication logic.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


@github-actions
Contributor

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 45 code smells

[INFO] Recent monitoring activity:
Tue Feb 10 22:27:15 UTC 2026: Code review monitoring started
Tue Feb 10 22:27:15 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 45

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 45
  • VULNERABILITIES: 0

Generated on: Tue Feb 10 22:27:18 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@marcusquinn marcusquinn merged commit 46929a1 into main Feb 10, 2026
9 of 10 checks passed
@marcusquinn marcusquinn deleted the feature/t245.2 branch February 10, 2026 22:31
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @.agents/scripts/quality-sweep-helper.sh:
- Around line 908-918: The SQL invocations interpolate $limit directly which
allows non-numeric values; update the CLI handlers that build queries (the
blocks using db "$SWEEP_DB" ... LIMIT $limit and the related commands
cmd_sonarcloud_query and cmd_dedup) to validate and sanitize $limit before use:
ensure $limit is a positive integer (e.g. reject or normalize values that do not
match ^[0-9]+$), set a safe default or exit with an error on invalid input, and
only pass the validated numeric value into the SQL string so no non-numeric or
;-separated payload can reach sqlite3.
- Around line 770-780: The API endpoint used to fetch repository issues is
missing the required "/search" suffix, so update the endpoint variable in the
quality-sweep helper (the local variable named endpoint in the block that calls
codacy_api_call) from
"/analysis/organizations/${provider}/${org}/repositories/${repo}/issues" to
include "/search" at the end; keep the surrounding error handling (the
codacy_api_call invocation, response variable, the db update using SWEEP_DB and
run_id, and the page_count extraction from response) unchanged so subsequent
parsing of response.data and pagination fields will work with the Codacy v3 API.
🧹 Nitpick comments (2)
.agents/scripts/quality-sweep-helper.sh (2)

872-921: Significant duplication — consider parameterizing query/summary/export by source.

cmd_codacy_query, cmd_codacy_summary, cmd_codacy_export, and cmd_codacy_status are near-identical copies of their cmd_sonarcloud_* counterparts, differing only in the source='codacy' filter and label strings. With CodeFactor and CodeRabbit integrations on the roadmap (t245.3+), each new source would require copying all four functions again.

A single parameterized set of functions accepting source as an argument would eliminate ~200 lines of duplication and make future integrations trivial:

♻️ Sketch of a parameterized approach
-cmd_codacy_query() {
-    ...
-    local where="WHERE source='codacy'"
-    ...
-}
-
-cmd_sonarcloud_query() {
-    ...
-    local where="WHERE source='sonarcloud'"
-    ...
-}
+cmd_source_query() {
+    local source="$1"; shift
+    ...
+    local where="WHERE source='$(echo "$source" | sed "s/'/''/g")'"
+    ...
+}

Then in routers:

# cmd_codacy
query) cmd_source_query "codacy" "$@" ;;

# cmd_sonarcloud
query) cmd_source_query "sonarcloud" "$@" ;;

1085-1085: Non-deterministic f.message in GROUP BY query.

The dedup_query selects f.message without aggregation while grouping by file, line, type. SQLite allows this (returning an arbitrary row's value), but the message shown could be from any contributing source. For JSON/CSV output (lines 1103, 1106) which includes message, this could be confusing if the SonarCloud and Codacy messages differ significantly.

Consider wrapping it in GROUP_CONCAT(DISTINCT f.message) or selecting the message from the highest-severity row. That said, for a dedup overview, showing any representative message is likely acceptable — so this is just a heads-up.
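
One way to realize the reviewer's GROUP_CONCAT suggestion (table and columns inferred from the PR summary; the function name is illustrative):

```shell
# A possible form of the reviewer's suggestion: aggregate messages so each
# dedup row is deterministic. Schema inferred from the PR summary.
dedup_messages() {
    local db="$1"
    sqlite3 "$db" "
        SELECT file, line, type,
               GROUP_CONCAT(DISTINCT message) AS messages
        FROM findings
        GROUP BY file, line, type
        HAVING COUNT(DISTINCT source) > 1;"
}
```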

Comment on lines +770 to +780
    local endpoint="/analysis/organizations/${provider}/${org}/repositories/${repo}/issues"
    local response
    response=$(codacy_api_call "$endpoint" "$params") || {
        print_error "Failed to fetch page $page_num"
        db "$SWEEP_DB" "UPDATE sweep_runs SET status='failed', completed_at=strftime('%Y-%m-%dT%H:%M:%SZ','now') WHERE id=$run_id;"
        return 1
    }

    # Extract issues count on this page
    local page_count
    page_count=$(echo "$response" | jq '.data | length // 0')

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Codacy v3 API list repository issues endpoint documentation

💡 Result:

Codacy API v3 “list repository issues” is exposed as searchRepositoryIssues (it’s a search/list endpoint).

  • API v3 base URL: https://api.codacy.com/api/v3 (Codacy Cloud) [1]
  • API v3 reference (Swagger/UI): https://api.codacy.com/api/api-docs [1]

Endpoint (repository issues)

  • Method: POST
  • Path:
    • https://app.codacy.com/api/v3/analysis/organizations/{provider}/{organization}/repositories/{repository}/issues/search [2]
  • Auth header: api-token: <your account API token> [1]

Example request

curl -X POST "https://app.codacy.com/api/v3/analysis/organizations/gh/<org>/repositories/<repo>/issues/search" \
  -H "api-token: <token>" \
  -H "Content-Type: application/json" \
  -d '{ "levels": ["Error", "Warning"], "categories": ["Security"] }'

[2]

Sources: [1] [2]


🏁 Script executed:

cat -n .agents/scripts/quality-sweep-helper.sh | sed -n '750,800p'

Repository: marcusquinn/aidevops

Length of output: 2313


🏁 Script executed:

cat -n .agents/scripts/quality-sweep-helper.sh | sed -n '798,850p'

Repository: marcusquinn/aidevops

Length of output: 3495


🏁 Script executed:

grep -n "codacy_api_call" .agents/scripts/quality-sweep-helper.sh | head -20

Repository: marcusquinn/aidevops

Length of output: 154


🏁 Script executed:

cat -n .agents/scripts/quality-sweep-helper.sh | sed -n '685,710p'

Repository: marcusquinn/aidevops

Length of output: 948


🏁 Script executed:

grep -n "CODACY_API_URL" .agents/scripts/quality-sweep-helper.sh

Repository: marcusquinn/aidevops

Length of output: 171


Fix Codacy v3 API endpoint path: add missing /search suffix.

The endpoint at line 770 is incomplete. According to Codacy v3 API documentation, the issues endpoint requires the /search suffix:

Current:

/analysis/organizations/${provider}/${org}/repositories/${repo}/issues

Should be:

/analysis/organizations/${provider}/${org}/repositories/${repo}/issues/search

The response structure assumptions (.data[] for issues and .pagination.cursor for pagination) and field names (.issueId, .commitIssueId, .filePath, .lineNumber, .severity, .patternCategory, .patternId, .message, .language, .createdAt) align with the Codacy v3 API specification. However, without the correct endpoint path, the API calls will fail.

🤖 Prompt for AI Agents
In @.agents/scripts/quality-sweep-helper.sh around lines 770 - 780, The API
endpoint used to fetch repository issues is missing the required "/search"
suffix, so update the endpoint variable in the quality-sweep helper (the local
variable named endpoint in the block that calls codacy_api_call) from
"/analysis/organizations/${provider}/${org}/repositories/${repo}/issues" to
include "/search" at the end; keep the surrounding error handling (the
codacy_api_call invocation, response variable, the db update using SWEEP_DB and
run_id, and the page_count extraction from response) unchanged so subsequent
parsing of response.data and pagination fields will work with the Codacy v3 API.

Comment on lines +908 to +918
    case "$format" in
        json)
            db "$SWEEP_DB" -json "SELECT file, line, severity, type, rule, message, status, effort, tags, created_at FROM findings $where ORDER BY CASE severity WHEN 'critical' THEN 1 WHEN 'high' THEN 2 WHEN 'medium' THEN 3 WHEN 'low' THEN 4 ELSE 5 END, file, line LIMIT $limit;"
            ;;
        csv)
            db "$SWEEP_DB" -header -csv "SELECT file, line, severity, type, rule, message, status, effort, tags, created_at FROM findings $where ORDER BY CASE severity WHEN 'critical' THEN 1 WHEN 'high' THEN 2 WHEN 'medium' THEN 3 WHEN 'low' THEN 4 ELSE 5 END, file, line LIMIT $limit;"
            ;;
        table|*)
            echo ""
            db "$SWEEP_DB" -header -column "SELECT file, line, severity, rule, message FROM findings $where ORDER BY CASE severity WHEN 'critical' THEN 1 WHEN 'high' THEN 2 WHEN 'medium' THEN 3 WHEN 'low' THEN 4 ELSE 5 END, file, line LIMIT $limit;"
            ;;

⚠️ Potential issue | 🟡 Minor

$limit is interpolated directly into SQL without numeric validation.

If a user passes a non-numeric --limit value, it could produce confusing SQLite errors or, in a contrived case, execute unintended SQL statements (since sqlite3 processes multiple ;-separated statements). This also applies to cmd_sonarcloud_query (pre-existing) and cmd_dedup.

As a local CLI tool the security risk is negligible (the user has direct DB access anyway), but for robustness, validate that $limit is numeric:

🛡️ Proposed validation
+        if ! [[ "$limit" =~ ^[0-9]+$ ]]; then
+            print_error "Invalid --limit value: $limit (must be a positive integer)"
+            return 1
+        fi
+
         case "$format" in
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change (validation prepended before the format dispatch):

    if ! [[ "$limit" =~ ^[0-9]+$ ]]; then
        print_error "Invalid --limit value: $limit (must be a positive integer)"
        return 1
    fi

    case "$format" in
        json)
            db "$SWEEP_DB" -json "SELECT file, line, severity, type, rule, message, status, effort, tags, created_at FROM findings $where ORDER BY CASE severity WHEN 'critical' THEN 1 WHEN 'high' THEN 2 WHEN 'medium' THEN 3 WHEN 'low' THEN 4 ELSE 5 END, file, line LIMIT $limit;"
            ;;
        csv)
            db "$SWEEP_DB" -header -csv "SELECT file, line, severity, type, rule, message, status, effort, tags, created_at FROM findings $where ORDER BY CASE severity WHEN 'critical' THEN 1 WHEN 'high' THEN 2 WHEN 'medium' THEN 3 WHEN 'low' THEN 4 ELSE 5 END, file, line LIMIT $limit;"
            ;;
        table|*)
            echo ""
            db "$SWEEP_DB" -header -column "SELECT file, line, severity, rule, message FROM findings $where ORDER BY CASE severity WHEN 'critical' THEN 1 WHEN 'high' THEN 2 WHEN 'medium' THEN 3 WHEN 'low' THEN 4 ELSE 5 END, file, line LIMIT $limit;"
            ;;
🤖 Prompt for AI Agents
In @.agents/scripts/quality-sweep-helper.sh around lines 908 - 918, The SQL
invocations interpolate $limit directly which allows non-numeric values; update
the CLI handlers that build queries (the blocks using db "$SWEEP_DB" ... LIMIT
$limit and the related commands cmd_sonarcloud_query and cmd_dedup) to validate
and sanitize $limit before use: ensure $limit is a positive integer (e.g. reject
or normalize values that do not match ^[0-9]+$), set a safe default or exit with
an error on invalid input, and only pass the validated numeric value into the
SQL string so no non-numeric or ;-separated payload can reach sqlite3.
