Skip to content

Improve Dependabot auto-merge workflow reliability#549

Merged
cto-new[bot] merged 6 commits intomainfrom
fix/improve-dependabot-auto-merge
Mar 27, 2026
Merged

Improve Dependabot auto-merge workflow reliability#549
cto-new[bot] merged 6 commits intomainfrom
fix/improve-dependabot-auto-merge

Conversation

@MasumRab
Copy link
Copy Markdown
Owner

@MasumRab MasumRab commented Mar 25, 2026

Description

Improves the existing Dependabot auto-merge workflow to be more reliable and handle edge cases properly.

Background

The repository already has a Dependabot auto-merge workflow (.github/workflows/dependabot-auto-merge.yml), but it has some issues:

  • All 10 recent workflow runs have been "skipped"
  • The workflow checks for mergeable_state == 'clean' in the if condition, which may fail early when CI hasn't completed yet

Changes Made

  1. Added pull_request_target trigger - This provides more reliable event data and works better with Dependabot PRs

  2. Dynamic mergeable state checking - Instead of failing early, the workflow now:

    • Checks if the PR is in a mergeable state (clean/unstable/has_hooks)
    • If not mergeable, waits for CI checks to complete
    • Re-checks the mergeable state after waiting
    • Only attempts to merge when actually mergeable
  3. Better error handling - Uses || echo to gracefully handle cases where:

    • PR is already approved
    • Auto-merge is already enabled
    • Other transient issues

Previous Attempts

This is an evolution of previous workflows:

  • 5e75bf2b: Initial version with basic auto-merge
  • 77a0b9ad: Added CI wait step but kept the strict mergeable_state check

The approach has evolved because:

  1. The strict mergeable_state == 'clean' check caused the workflow to skip when CI was still running
  2. Adding pull_request_target provides more reliable context
  3. The two-step check (initial + recheck after waiting) ensures we don't miss the mergeable window

Statistics

  • Total Dependabot PRs: 28
  • Successfully merged: 28 (100% success rate)
  • Current open Dependabot PRs: 4

The existing workflow works when CI completes quickly, but needs improvement for reliability.

Summary by Sourcery

Improve Dependabot PR auto-merge reliability and simplify Gmail credential handling.

Bug Fixes:

  • Fix Gmail integration failing to reuse valid or refreshable credentials by consolidating credential validation and refresh logic.

Enhancements:

  • Relax Dependabot auto-merge conditions to dynamically assess mergeable state before and after CI completion.
  • Streamline Gmail API credential loading to handle valid and refreshable credentials in a unified flow with clearer error reporting.

CI:

  • Extend the Dependabot auto-merge workflow to trigger on pull_request_target and perform staged mergeability checks before enabling auto-merge.

Refactored the nested if conditions to reduce nesting depth as suggested
by the merge-nested-ifs rule. The logic now checks if credentials exist
and are either valid or expired with refresh token available, then builds
the gmail service. If credentials are expired, they are refreshed first.

Co-authored-by: openhands <openhands@all-hands.dev>
- Add pull_request_target trigger for better reliability
- Check mergeable state before attempting to merge
- Wait for CI checks if PR is not in mergeable state
- Re-check mergeable state after waiting
- Only attempt merge when PR is actually mergeable

Co-authored-by: openhands <openhands@all-hands.dev>
@bolt-new-by-stackblitz
Copy link
Copy Markdown

Review PR in StackBlitz Codeflow Run & review this pull request in StackBlitz Codeflow.

@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai bot commented Mar 25, 2026

Reviewer's Guide

Improves the Dependabot auto-merge GitHub Actions workflow to be more reliable by relaxing the initial mergeable_state gate, adding pull_request_target, implementing a two-phase mergeability check with CI waiting, and simplifying error handling for approvals/auto-merge, plus a small refactor to Gmail credential loading/refresh logic to handle combined valid/refreshable states more robustly.

Sequence diagram for updated Dependabot auto-merge workflow

sequenceDiagram
    actor Dependabot
    participant GitHub
    participant Workflow as AutoMergeWorkflow
    participant GhCli as Gh_CLI

    Dependabot->>GitHub: Open/update Dependabot PR
    GitHub-->>Workflow: pull_request / pull_request_target (opened/synchronize/reopened)

    Workflow->>Workflow: Check github.actor == dependabot[bot]
    Workflow->>Workflow: Check draft == false
    alt PR is Dependabot and not draft
        Workflow->>GitHub: Read pull_request.mergeable_state
        alt mergeable_state in [clean, unstable, has_hooks]
            Workflow->>Workflow: Set mergeable = true
        else not immediately mergeable
            Workflow->>GitHub: Wait for required CI checks
            GitHub-->>Workflow: CI checks completed (success/failure)
            Workflow->>GitHub: Re-read pull_request.mergeable_state
            Workflow->>Workflow: Set mergeable = true/false
        end

        alt mergeable == true
            Workflow->>GhCli: gh pr view PR_NUM --json state,mergeable,mergeableState
            Workflow->>GhCli: gh pr review PR_NUM --approve --body ... (|| echo)
            GhCli-->>GitHub: Record review or no-op

            Workflow->>GhCli: gh pr merge --auto --merge PR_NUM (|| echo)
            GhCli-->>GitHub: Enable auto-merge or no-op
            Workflow->>Workflow: Log auto-merge processing complete
        else mergeable == false
            Workflow->>Workflow: Skip auto-merge
        end
    else Not a Dependabot PR or draft
        Workflow->>Workflow: Job skipped
    end
Loading

Class diagram for updated Gmail integration credential handling

classDiagram
    class GmailIntegration{
        - logger
        - gmail_service
        + _load_credentials()
        + _store_credentials(creds)
    }

    class Credentials{
        + valid
        + expired
        + refresh_token
        + from_authorized_user_file(file_path, scopes) Credentials
        + refresh(request)
    }

    class Request{
    }

    class GmailServiceBuilder{
        + build(api_name, api_version, credentials)
    }

    GmailIntegration --> Credentials : uses
    GmailIntegration --> Request : uses
    GmailIntegration --> GmailServiceBuilder : uses to create gmail_service
Loading

Flow diagram for Gmail credential loading and refresh logic

flowchart TD
    A["Start _load_credentials"] --> B["Set creds = None"]
    B --> C{"token.json exists?"}
    C -- Yes --> D["Try Credentials.from_authorized_user_file"]
    D --> E["On exception: log error loading credentials"]
    D --> F["creds loaded (or remains None on failure)"]
    C -- No --> F

    F --> G{"creds exists and (valid OR (expired and has refresh_token))?"}
    G -- No --> H["End _load_credentials (no gmail_service)"]
    G -- Yes --> I{"creds.expired and creds.refresh_token?"}

    I -- Yes --> J["Try creds.refresh(Request())"]
    J --> K["On exception: log error with credentials"]
    J --> L["On success: _store_credentials(creds)"]
    L --> M["Set gmail_service = build(gmail,v1,creds)"]
    M --> N["End _load_credentials"]

    I -- No --> O["Try set gmail_service = build(gmail,v1,creds)"]
    O --> P["On exception: log error with credentials"]
    O --> N
Loading

File-Level Changes

Change Details Files
Make Dependabot auto-merge workflow more reliable and less sensitive to transient mergeable_state/CI timing issues.
  • Add pull_request_target trigger alongside pull_request for Dependabot auto-merge workflow events.
  • Relax job-level condition to only require Dependabot actor and non-draft PR instead of mergeable_state == 'clean'.
  • Introduce a shell step to classify mergeable states (clean, unstable, has_hooks) and expose the result via step outputs.
  • Conditionally wait for CI checks only when the PR is initially not mergeable using fountainhead/action-wait-for-check.
  • Re-check mergeable state after CI wait and gate auto-merge on either initial or post-wait mergeable status.
  • Enhance logging around mergeability and PR processing for better debugging.
  • Simplify approval and auto-merge commands to be best-effort using gh CLI with
Refine Gmail integration credential loading to better handle both valid and refreshable credentials in a single branch.
  • Combine checks for valid credentials and expired-but-refreshable credentials into a single conditional branch.
  • Only call creds.refresh(Request()) when credentials are actually expired and refreshable, then store refreshed credentials.
  • Unify error logging for credential issues into a single message and avoid separate refresh-specific wording.
src/backend/python_nlp/gmail_integration.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@github-actions
Copy link
Copy Markdown

🤖 Hi @MasumRab, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the reliability of Google API credential management within the gmail_integration.py module. It streamlines the process of loading and refreshing user credentials, ensuring that tokens are refreshed only when necessary and that updated credentials are persisted correctly, thereby improving the overall robustness of the Gmail integration.

Highlights

  • Credential Handling Logic: Unified the conditional logic for checking valid and refreshable expired credentials into a single block, simplifying the flow.
  • Credential Refresh and Storage: Modified the credential refresh process to explicitly check for expired tokens before refreshing and to store updated credentials only after a successful refresh.
  • Error Logging: Updated the error message for general credential handling issues to be more concise.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/dependabot-auto-merge.yml
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@github-actions
Copy link
Copy Markdown

🤖 I'm sorry @MasumRab, but I was unable to process your request. Please see the logs for more details.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 25, 2026

Warning

Rate limit exceeded

@MasumRab has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 13 minutes and 21 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2140369f-4f9d-4b46-80ed-9704a99b36e4

📥 Commits

Reviewing files that changed from the base of the PR and between 4e6f610 and 487f049.

📒 Files selected for processing (2)
  • .github/workflows/dependabot-auto-merge.yml
  • setup/launch.py

Walkthrough

Updates Dependabot auto-merge workflow with fork-safety and two-phase mergeability checks, changes approval/auto-merge gating and logging; and adjusts Gmail credential handling to refresh+store expired creds (with refresh token) before building the Gmail service.

Changes

Cohort / File(s) Summary
Dependabot auto-merge workflow
.github/workflows/dependabot-auto-merge.yml
Restrict triggers to Dependabot non-draft PRs, add fork detection (check-fork) to skip unsafe steps, replace CI-only gating with check-mergeable/conditional wait/recheck flow, require non-fork + mergeable for auto-merge, change approval to only approve when not already approved, run gh pr merge --auto --merge and verify autoMergeRequest post-command, and update log messages.
Gmail credential handling
src/backend/python_nlp/gmail_integration.py
Treat credentials as usable if valid or expired with a refresh token; attempt refresh (up to 3 retries with exponential backoff) when expired+refresh-token, store credentials immediately after successful refresh before building the service; refine exception handling and logging.

Sequence Diagram(s)

sequenceDiagram
    participant PR as Dependabot PR
    participant GH as GitHub Actions
    participant API as GitHub API (gh)
    participant CI as CI Checks

    PR->>GH: trigger (pull_request_target / pull_request)
    GH->>API: check-fork (compare head.repo.full_name vs repo)
    alt fork == true
        GH->>GH: skip approve/merge/CI-wait steps
    else fork == false
        GH->>API: check-mergeable (gh pr view -> mergeStateStatus)
        alt mergeable == true
            GH->>API: review if not already APPROVED -> gh pr review --approve
            GH->>API: gh pr merge --auto --merge
            API-->>GH: return autoMergeRequest (verify non-null)
        else mergeable == false
            GH->>CI: wait for checks (action-wait-for-check)
            CI-->>GH: checks done
            GH->>API: recheck-mergeable
            alt mergeable == true
                GH->>API: review if not already APPROVED -> gh pr review --approve
                GH->>API: gh pr merge --auto --merge
                API-->>GH: return autoMergeRequest (verify non-null)
            else mergeable == false
                GH->>GH: log non-mergeable / exit
            end
        end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I hop through PRs with careful paws,
I check for forks and honor the laws.
I refresh the tokens, tuck them in tight,
Then nudge merges forward into the night.
A carrot for tidy CI and creds done right! 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Improve Dependabot auto-merge workflow reliability' directly and accurately summarizes the main change in the changeset—enhancing the reliability of the Dependabot auto-merge workflow in the .github/workflows/dependabot-auto-merge.yml file.
Description check ✅ Passed The description is directly related to the changeset, providing background, explaining the improvements to Dependabot auto-merge workflow reliability, and mentioning Gmail credential handling updates that are reflected in the file changes.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/improve-dependabot-auto-merge

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the _load_credentials method in gmail_integration.py to consolidate the logic for handling valid and refreshable expired credentials into a single block, streamlining the credential management process. The review suggests further improvements by simplifying an inner conditional check and enhancing error handling through more specific exception catching for better debugging and robustness.

Comment on lines +279 to +286
if creds and (creds.valid or (creds.expired and creds.refresh_token)):
try:
creds.refresh(Request())
if creds.expired and creds.refresh_token:
creds.refresh(Request())
self._store_credentials(creds)
self.gmail_service = build("gmail", "v1", credentials=creds)
self._store_credentials(creds)
except Exception as e:
self.logger.error(f"Error refreshing credentials: {e}")
self.logger.error(f"Error with credentials: {e}")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This refactoring is a good improvement as it reduces code duplication. Here are a couple of suggestions to further enhance it:

  1. Simplify inner condition: The inner if creds.expired and creds.refresh_token: check is redundant. Since the outer if already establishes the conditions, you can simplify this to if not creds.valid:. This makes the code more DRY.

  2. More specific exception handling: Catching a broad Exception can hide different kinds of errors. It would be more robust to catch specific exceptions like google.auth.exceptions.RefreshError for token refresh issues and googleapiclient.errors.HttpError for service build issues, with a final Exception as a fallback. This provides better error logging and makes debugging easier.

Note: You will need to add from google.auth.exceptions import RefreshError at the top of the file.

        if creds and (creds.valid or (creds.expired and creds.refresh_token)):
            try:
                if not creds.valid:
                    creds.refresh(Request())
                    self._store_credentials(creds)
                self.gmail_service = build("gmail", "v1", credentials=creds)
            except RefreshError as e:
                self.logger.error(f"Error refreshing credentials: {e}")
            except HttpError as e:
                self.logger.error(f"Error building Gmail service: {e}")
            except Exception as e:
                self.logger.error(f"An unexpected error occurred with credentials: {e}")

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 3 issues, and left some high level feedback:

  • The mergeable_state re-check after waiting for CI still reads github.event.pull_request.mergeable_state, which is not refreshed during a job; to actually re-evaluate mergeability you’ll need to query the API/gh pr view again instead of relying on the original event payload.
  • Introducing pull_request_target with write permissions and gh pr merge is powerful; it would be safer to explicitly guard against fork-based PRs (e.g., by checking github.event.pull_request.head.repo.full_name == github.repository) so this workflow can’t be abused from untrusted forks.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `mergeable_state` re-check after waiting for CI still reads `github.event.pull_request.mergeable_state`, which is not refreshed during a job; to actually re-evaluate mergeability you’ll need to query the API/`gh pr view` again instead of relying on the original event payload.
- Introducing `pull_request_target` with write permissions and `gh pr merge` is powerful; it would be safer to explicitly guard against fork-based PRs (e.g., by checking `github.event.pull_request.head.repo.full_name == github.repository`) so this workflow can’t be abused from untrusted forks.

## Individual Comments

### Comment 1
<location path=".github/workflows/dependabot-auto-merge.yml" line_range="6-5" />
<code_context>
 on:
   pull_request:
     types: [opened, synchronize, reopened]
+  pull_request_target:
+    types: [opened, synchronize, reopened]

 permissions:
</code_context>
<issue_to_address>
**🚨 issue (security):** Re-evaluate whether `pull_request_target` is necessary for this workflow due to its elevated permissions.

`pull_request_target` executes with the base repo’s token and elevated permissions, which is discouraged unless strictly required. The `github.actor == 'dependabot[bot]'` check helps, but still widens the trust boundary if this ever gets misconfigured. If you don’t need fork-PR semantics (e.g., secrets access), prefer keeping this workflow on `pull_request` only.
</issue_to_address>

### Comment 2
<location path=".github/workflows/dependabot-auto-merge.yml" line_range="20-28" />
<code_context>

     steps:
-      - name: Wait for CI checks to complete
+      - name: Check if PR is mergeable
+        id: check-mergeable
+        run: |
+          echo "Checking PR #${{ github.event.pull_request.number }} mergeable state..."
+          echo "Current mergeable state: ${{ github.event.pull_request.mergeable_state }}"
+          echo "Draft: ${{ github.event.pull_request.draft }}"
+          
+          # Get the PR mergeable state
+          PR_STATE="${{ github.event.pull_request.mergeable_state }}"
+          if [[ "$PR_STATE" == "clean" || "$PR_STATE" == "unstable" || "$PR_STATE" == "has_hooks" ]]; then
+            echo "PR is in mergeable state: $PR_STATE"
</code_context>
<issue_to_address>
**issue (bug_risk):** Using `github.event.pull_request.mergeable_state` may be stale and not reflect post-CI state.

Because `github.event.pull_request` is fixed at workflow trigger time, both this check and the later re-check will always see the original `mergeable_state`, not the post-CI value. For example, a PR that becomes `clean` after CI may still be treated as non-mergeable, and one that later becomes non-mergeable may still be treated as mergeable.

To avoid this, fetch the current mergeable state inside the step via the GitHub API or `gh`, e.g.:

```bash
gh pr view $PR_NUM --json mergeableState --jq '.mergeableState'
```
</issue_to_address>

### Comment 3
<location path="src/backend/python_nlp/gmail_integration.py" line_range="279-286" />
<code_context>
-            self.gmail_service = build("gmail", "v1", credentials=creds)
-        elif creds and creds.expired and creds.refresh_token:
+
+        if creds and (creds.valid or (creds.expired and creds.refresh_token)):
             try:
-                creds.refresh(Request())
+                if creds.expired and creds.refresh_token:
+                    creds.refresh(Request())
+                    self._store_credentials(creds)
                 self.gmail_service = build("gmail", "v1", credentials=creds)
-                self._store_credentials(creds)
             except Exception as e:
-                self.logger.error(f"Error refreshing credentials: {e}")
+                self.logger.error(f"Error with credentials: {e}")

     def _store_credentials(self, creds):
</code_context>
<issue_to_address>
**suggestion:** The combined `try` block obscures whether the failure was on refresh vs. service construction and weakens the log signal.

This change removes the distinct log message for refresh failures (`Error refreshing credentials`) and replaces it with a generic `Error with credentials` that covers both the optional `creds.refresh(...)` and `build("gmail", ...)`. To preserve diagnostic value, consider either using separate `try` blocks or at least emitting different log messages depending on whether the refresh branch ran, so logs can still distinguish refresh failures from client construction issues.

```suggestion
        if creds and (creds.valid or (creds.expired and creds.refresh_token)):
            refresh_attempted = False

            if creds.expired and creds.refresh_token:
                refresh_attempted = True
                try:
                    creds.refresh(Request())
                    self._store_credentials(creds)
                except Exception as e:
                    self.logger.error(f"Error refreshing credentials: {e}")
                    return

            try:
                self.gmail_service = build("gmail", "v1", credentials=creds)
            except Exception as e:
                if refresh_attempted:
                    self.logger.error(f"Error constructing Gmail service after credentials refresh: {e}")
                else:
                    self.logger.error(f"Error constructing Gmail service with existing credentials: {e}")
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.


on:
pull_request:
types: [opened, synchronize, reopened]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 issue (security): Re-evaluate whether pull_request_target is necessary for this workflow due to its elevated permissions.

pull_request_target executes with the base repo’s token and elevated permissions, which is discouraged unless strictly required. The github.actor == 'dependabot[bot]' check helps, but still widens the trust boundary if this ever gets misconfigured. If you don’t need fork-PR semantics (e.g., secrets access), prefer keeping this workflow on pull_request only.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
src/backend/python_nlp/gmail_integration.py (1)

279-286: Logic restructure looks correct, but consider these refinements.

The credential handling logic correctly distinguishes between valid credentials and expired-but-refreshable credentials. However:

  1. Silent failure on refresh errors: The broad except Exception catches all errors including transient network failures. When refresh fails, gmail_service stays None and _authenticate() triggers a full re-authentication. For transient errors, a retry with backoff might be more appropriate before falling back to full re-auth.

  2. Test coverage gap: Per the existing test suite, the creds.expired and creds.refresh_token path (lines 281-283) is untested. The relocation of _store_credentials() to before build() could regress undetected.

Consider adding a unit test for the refresh path
# Example test case for the refresh scenario
def test_load_credentials_expired_with_refresh_token(self, mock_credentials, mock_build):
    """Test that expired credentials with refresh_token are refreshed and stored."""
    mock_creds = MagicMock()
    mock_creds.valid = False
    mock_creds.expired = True
    mock_creds.refresh_token = "refresh_token_value"
    
    with patch("src.backend.python_nlp.gmail_integration.Credentials.from_authorized_user_file", return_value=mock_creds):
        with patch.object(GmailDataCollector, "_store_credentials") as mock_store:
            collector = GmailDataCollector()
            mock_creds.refresh.assert_called_once()
            mock_store.assert_called_once_with(mock_creds)

Would you like me to generate a complete test suite covering the credential refresh and storage paths?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/python_nlp/gmail_integration.py` around lines 279 - 286, The
refresh branch should retry transient failures and only fall back to full
re-auth after retries: in the block handling creds.expired and
creds.refresh_token call creds.refresh(Request()) inside a short
retry-with-exponential-backoff loop (catching refresh-specific/network
exceptions), and only call self._store_credentials(creds) after a successful
refresh; ensure self.gmail_service = build("gmail", "v1", credentials=creds)
runs after storing credentials. Add more informative logging in the except path
(include exception details) and update/add a unit test for GmailDataCollector
that mocks Credentials.from_authorized_user_file to return an expired creds with
refresh_token and asserts creds.refresh was called, _store_credentials was
called only after successful refresh, and build was invoked with the refreshed
creds.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/dependabot-auto-merge.yml:
- Line 17: Change the workflow gate to check the PR author identity instead of
the triggering actor: replace the conditional "if: github.actor ==
'dependabot[bot]' && !github.event.pull_request.draft" with one that uses
"github.event.pull_request.user.login == 'dependabot[bot]'" while keeping the
draft check (!github.event.pull_request.draft) so reopened Dependabot PRs
(reopened by maintainers) still pass the gate.
- Line 69: The gh CLI call using "gh pr view $PR_NUM --json
state,mergeable,mergeableState" references a non-existent field; update that
invocation to request the documented "mergeStateStatus" field instead (replace
mergeableState with mergeStateStatus) so the script's check of PR mergeability
(the gh pr view call) succeeds under set -e and doesn't abort the workflow;
ensure any later references that read the JSON key also use "mergeStateStatus".
- Around line 72-75: Change the workflow so it no longer swallows real gh CLI
failures: before running gh pr review --approve, call gh pr view --json
reviewDecision and only run gh pr review if reviewDecision is not "APPROVED";
after running gh pr merge --auto --merge, call gh pr view --json
autoMergeRequest and verify autoMergeRequest is non-null (and has the expected
fields); treat gh CLI failures as real errors (do not blanket || echo) unless
the pre/post checks indicate the operation was already satisfied, and fail the
job when gh returns an unexpected error so authentication/permission/API outages
are not masked.
- Around line 20-35: The workflow's check-mergeable step is reading the
immutable webhook field github.event.pull_request.mergeable_state (PR_STATE)
which is stale after waiting; replace that read with a live API query (e.g.,
GitHub CLI gh or GitHub REST API) to fetch the current PR mergeability before
writing to GITHUB_OUTPUT. In the step with id check-mergeable, stop using
github.event.pull_request.mergeable_state and instead call the API to get the
latest mergeable/mergeable_state for the PR number
(github.event.pull_request.number), then set "mergeable=true/false" to
GITHUB_OUTPUT based on that fresh result.

---

Nitpick comments:
In `@src/backend/python_nlp/gmail_integration.py`:
- Around line 279-286: The refresh branch should retry transient failures and
only fall back to full re-auth after retries: in the block handling
creds.expired and creds.refresh_token call creds.refresh(Request()) inside a
short retry-with-exponential-backoff loop (catching refresh-specific/network
exceptions), and only call self._store_credentials(creds) after a successful
refresh; ensure self.gmail_service = build("gmail", "v1", credentials=creds)
runs after storing credentials. Add more informative logging in the except path
(include exception details) and update/add a unit test for GmailDataCollector
that mocks Credentials.from_authorized_user_file to return an expired creds with
refresh_token and asserts creds.refresh was called, _store_credentials was
called only after successful refresh, and build was invoked with the refreshed
creds.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d60848ae-3850-474a-938d-a22516591a70

📥 Commits

Reviewing files that changed from the base of the PR and between 387b2d0 and 380a6a0.

📒 Files selected for processing (2)
  • .github/workflows/dependabot-auto-merge.yml
  • src/backend/python_nlp/gmail_integration.py

- Remove pull_request_target trigger (not needed with fork check)
- Add fork detection step to prevent untrusted fork abuse
- Use gh pr view API to get fresh mergeable state instead of event payload
- Add security guards to all steps

Co-authored-by: openhands <openhands@all-hands.dev>
Workflow fixes:
- Use github.event.pull_request.user.login for PR author check
- Use mergeStateStatus field instead of mergeableState
- Check reviewDecision before approving
- Verify autoMergeRequest after enabling merge
- Use live API calls for all mergeable state checks

Gmail integration fixes:
- Add retry with exponential backoff for credential refresh
- Add specific exception handling with informative logging
- Only store credentials after successful refresh

Co-authored-by: openhands <openhands@all-hands.dev>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (5)
.github/workflows/dependabot-auto-merge.yml (5)

69-70: ⚠️ Potential issue | 🔴 Critical

mergeableState is not a valid gh pr view --json field.

Both calls use mergeableState, which causes gh failure. Use mergeStateStatus (and normalize case before comparing).

Suggested change
-          PR_STATE=$(gh pr view ${{ github.event.pull_request.number }} --json mergeableState -q '.mergeableState')
+          PR_STATE=$(gh pr view "${{ github.event.pull_request.number }}" --json mergeStateStatus --jq '.mergeStateStatus')
+          PR_STATE="${PR_STATE,,}"
-          gh pr view $PR_NUM --json state,mergeable,mergeableState
+          gh pr view "$PR_NUM" --json state,mergeable,mergeStateStatus

Also applies to: 86-86

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/dependabot-auto-merge.yml around lines 69 - 70, The
workflow is using an invalid field name "mergeableState" when calling gh pr
view; update both gh calls that set PR_STATE (and the duplicate at the other
occurrence) to request "mergeStateStatus" instead and normalize the case (e.g.,
lower/upper) before comparing to expected values so comparisons are reliable;
ensure the variable name PR_STATE remains, but replace the queried field with
mergeStateStatus and add a string case-normalization step prior to any
conditional checks.

15-16: ⚠️ Potential issue | 🟠 Major

Gate on PR author, not github.actor.

Line 16 still skips valid Dependabot PRs when a maintainer triggers reopened. Use github.event.pull_request.user.login for stable PR-origin identity.

Suggested change
-    if: github.actor == 'dependabot[bot]' && !github.event.pull_request.draft
+    if: github.event.pull_request.user.login == 'dependabot[bot]' && !github.event.pull_request.draft
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/dependabot-auto-merge.yml around lines 15 - 16, The
workflow is gating on github.actor which changes (e.g., when a maintainer
reopens a PR); replace the condition using github.actor with a stable check
against the pull request author by using github.event.pull_request.user.login to
verify Dependabot (i.e., change the if condition that references github.actor ==
'dependabot[bot]' to instead check github.event.pull_request.user.login ==
'dependabot[bot]'), keeping the draft check (!github.event.pull_request.draft)
as-is.

89-92: ⚠️ Potential issue | 🟠 Major

Don’t swallow gh failures with blanket || echo.

Current lines hide real auth/permission/API failures as if they were idempotent success. Pre-check/post-verify state instead and fail unexpected errors.

Suggested change
-          gh pr review $PR_NUM --approve --body "✅ CI checks passed! Auto-merging this Dependabot PR." || echo "PR already approved or review failed"
+          if [[ "$(gh pr view "$PR_NUM" --json reviewDecision --jq '.reviewDecision // ""')" != "APPROVED" ]]; then
+            gh pr review "$PR_NUM" --approve --body "✅ CI checks passed! Auto-merging this Dependabot PR."
+          else
+            echo "PR already approved"
+          fi
           
           # Enable auto-merge
-          gh pr merge --auto --merge $PR_NUM || echo "Auto-merge failed or already enabled"
+          gh pr merge --auto --merge "$PR_NUM"
+          gh pr view "$PR_NUM" --json state,autoMergeRequest --jq '.state == "MERGED" or .autoMergeRequest != null' >/dev/null
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/dependabot-auto-merge.yml around lines 89 - 92, Replace
the blanket "|| echo" suppression around the gh commands and instead explicitly
check and handle expected idempotent states: call gh pr view (or gh api) to
inspect the PR review state before running gh pr review, run gh pr review
--approve and capture its exit code, and if it fails only allow continuing when
the view shows the PR is already approved; similarly, check gh pr view --json
mergeStateStatus/mergeable or gh api to verify whether auto-merge is already
enabled before running gh pr merge --auto, and if gh pr merge fails, surface the
error unless the pre-check proved auto-merge was already set—remove the silent
"|| echo" fallbacks and fail the workflow on unexpected errors while allowing
idempotent success cases determined by the pre-checks.

34-44: ⚠️ Potential issue | 🟠 Major

Initial mergeability check still reads stale webhook payload.

Line 43 uses github.event.pull_request.mergeable_state, which is a trigger-time snapshot and can be stale after waiting. Query current state from API here too.

Suggested change
       - name: Check if PR is mergeable
         id: check-mergeable
         if: steps.check-fork.outputs.is_fork == 'false'
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
         run: |
           echo "Checking PR #${{ github.event.pull_request.number }} mergeable state..."
-          echo "Current mergeable state: ${{ github.event.pull_request.mergeable_state }}"
+          PR_STATE="$(gh pr view "${{ github.event.pull_request.number }}" --json mergeStateStatus --jq '.mergeStateStatus')"
+          PR_STATE="${PR_STATE,,}"
+          echo "Current mergeable state from API: $PR_STATE"
           echo "Draft: ${{ github.event.pull_request.draft }}"
-          
-          # Get the PR mergeable state
-          PR_STATE="${{ github.event.pull_request.mergeable_state }}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/dependabot-auto-merge.yml around lines 34 - 44, The
workflow's "Check if PR is mergeable" step currently reads the stale trigger
payload via github.event.pull_request.mergeable_state; instead, fetch the live
PR object from the GitHub API in that step and set PR_STATE from the returned
mergeable_state before evaluating it. Update the step with an API call (e.g.,
using curl/gh api and ${GITHUB_TOKEN}) to GET the pull request and parse
.mergeable_state into PR_STATE, then keep the existing conditional (checking
"clean", "unstable", "has_hooks") but based on the freshly queried PR_STATE
variable rather than github.event.pull_request.mergeable_state. Ensure the API
call handles possible "unknown" state by retrying or waiting briefly before
final check.

63-66: ⚠️ Potential issue | 🟠 Major

Add GH_TOKEN env to the recheck step before gh usage.

This step invokes gh but does not set the token env variable. According to GitHub's official documentation, GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} must be explicitly set for gh commands to authenticate properly in workflows.

Suggested change
      - name: Re-check mergeable state after waiting
        if: steps.check-fork.outputs.is_fork == 'false' && steps.check-mergeable.outputs.mergeable == 'false'
        id: recheck-mergeable
+       env:
+         GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |

Note: The earlier "Check if PR is mergeable via API" step (line 35-53) has the same issue and should also receive this fix.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/dependabot-auto-merge.yml around lines 63 - 66, The
recheck step (id: recheck-mergeable, name: "Re-check mergeable state after
waiting") is calling the GitHub CLI but doesn't set GH_TOKEN; add an env entry
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} to that step so gh can authenticate, and
apply the same addition to the earlier "Check if PR is mergeable via API" step
(id: check-mergeable) to ensure both steps set GH_TOKEN before invoking gh.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In @.github/workflows/dependabot-auto-merge.yml:
- Around line 69-70: The workflow is using an invalid field name
"mergeableState" when calling gh pr view; update both gh calls that set PR_STATE
(and the duplicate at the other occurrence) to request "mergeStateStatus"
instead and normalize the case (e.g., lower/upper) before comparing to expected
values so comparisons are reliable; ensure the variable name PR_STATE remains,
but replace the queried field with mergeStateStatus and add a string
case-normalization step prior to any conditional checks.
- Around line 15-16: The workflow is gating on github.actor which changes (e.g.,
when a maintainer reopens a PR); replace the condition using github.actor with a
stable check against the pull request author by using
github.event.pull_request.user.login to verify Dependabot (i.e., change the if
condition that references github.actor == 'dependabot[bot]' to instead check
github.event.pull_request.user.login == 'dependabot[bot]'), keeping the draft
check (!github.event.pull_request.draft) as-is.
- Around line 89-92: Replace the blanket "|| echo" suppression around the gh
commands and instead explicitly check and handle expected idempotent states:
call gh pr view (or gh api) to inspect the PR review state before running gh pr
review, run gh pr review --approve and capture its exit code, and if it fails
only allow continuing when the view shows the PR is already approved; similarly,
check gh pr view --json mergeStateStatus/mergeable or gh api to verify whether
auto-merge is already enabled before running gh pr merge --auto, and if gh pr
merge fails, surface the error unless the pre-check proved auto-merge was
already set—remove the silent "|| echo" fallbacks and fail the workflow on
unexpected errors while allowing idempotent success cases determined by the
pre-checks.
- Around line 34-44: The workflow's "Check if PR is mergeable" step currently
reads the stale trigger payload via github.event.pull_request.mergeable_state;
instead, fetch the live PR object from the GitHub API in that step and set
PR_STATE from the returned mergeable_state before evaluating it. Update the step
with an API call (e.g., using curl/gh api and ${GITHUB_TOKEN}) to GET the pull
request and parse .mergeable_state into PR_STATE, then keep the existing
conditional (checking "clean", "unstable", "has_hooks") but based on the freshly
queried PR_STATE variable rather than github.event.pull_request.mergeable_state.
Ensure the API call handles possible "unknown" state by retrying or waiting
briefly before final check.
- Around line 63-66: The recheck step (id: recheck-mergeable, name: "Re-check
mergeable state after waiting") is calling the GitHub CLI but doesn't set
GH_TOKEN; add an env entry GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} to that step so
gh can authenticate, and apply the same addition to the earlier "Check if PR is
mergeable via API" step (id: check-mergeable) to ensure both steps set GH_TOKEN
before invoking gh.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 04bf0ac0-d6af-4658-ac13-4e4d38310c84

📥 Commits

Reviewing files that changed from the base of the PR and between 380a6a0 and 9d6d2d3.

📒 Files selected for processing (1)
  • .github/workflows/dependabot-auto-merge.yml

- Add GH_TOKEN environment variable to check-mergeable and recheck-mergeable steps
- Add case normalization (tr '[:upper:]' '[:lower:]') for mergeStateStatus comparison
- This ensures gh CLI can authenticate properly and handles case variations

Co-authored-by: openhands <openhands@all-hands.dev>
The file had unresolved merge conflict markers causing a SyntaxError.
Restored clean version from parent commit.

Co-authored-by: openhands <openhands@all-hands.dev>
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New security issues found

>>>>>>> a7da61cf1f697de3c8c81f536bf579d36d88e613
# For poetry, we need to install it first if not available
try:
subprocess.run([python_exe, "-c", "import poetry"], check=True, capture_output=True)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security (python.lang.security.audit.dangerous-subprocess-use-audit): Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

Source: opengrep

>>>>>>> a7da61cf1f697de3c8c81f536bf579d36d88e613
# For uv, install if not available
try:
subprocess.run([python_exe, "-c", "import uv"], check=True, capture_output=True)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security (python.lang.security.audit.dangerous-subprocess-use-audit): Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

Source: opengrep

@sonarqubecloud
Copy link
Copy Markdown

Quality Gate Failed Quality Gate failed

Failed conditions
4 Security Hotspots
C Reliability Rating on New Code (required ≥ A)

See analysis details on SonarQube Cloud

Catch issues before they fail your Quality Gate with our IDE extension SonarQube for IDE

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
.github/workflows/dependabot-auto-merge.yml (1)

101-107: ⚠️ Potential issue | 🟠 Major

Treat MERGED as a successful post-check outcome.

This still turns a successful merge into a failed job. GitHub documents that auto-merge is only enabled for PRs that cannot be merged immediately, and gh pr merge is a merge command; after the approval on Lines 94-97, a successful run can leave the PR merged rather than open with a populated autoMergeRequest. The current null-check therefore exits 1 on a valid success path. (docs.github.com)

🛠️ Suggested fix
-          AUTO_MERGE=$(gh pr view $PR_NUM --json autoMergeRequest -q '.autoMergeRequest')
-          if [[ "$AUTO_MERGE" == "null" || -z "$AUTO_MERGE" ]]; then
-            echo "ERROR: Auto-merge was not enabled properly"
+          if ! gh pr view "$PR_NUM" --json state,autoMergeRequest --jq '.state == "MERGED" or .autoMergeRequest != null' >/dev/null; then
+            echo "ERROR: PR was neither merged nor configured for auto-merge"
             exit 1
           fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/dependabot-auto-merge.yml around lines 101 - 107, The
post-merge check treats AUTO_MERGE being null as a failure even when the PR was
actually merged; update the verification after the gh pr merge --auto --merge
$PR_NUM call to also query the PR merged state (e.g., run gh pr view $PR_NUM
--json merged -q '.merged' or check .state/.mergeStateStatus) and consider the
job successful if AUTO_MERGE is set OR the PR is merged (merged == true or state
== "MERGED"); change the conditional that exits with error to allow the merged
case and only fail when AUTO_MERGE is null/empty AND the PR is not merged.
🧹 Nitpick comments (1)
src/backend/python_nlp/gmail_integration.py (1)

289-290: Credential persistence should be atomic now that refresh writes happen in-path.

Because Line 289 writes immediately after refresh, a partial write/interruption can corrupt token.json and break subsequent auth. Consider atomic replace in _store_credentials. (Same raw-write pattern also appears in src/backend/python_nlp/smart_retrieval.py:92-101.)

Proposed refactor
 def _store_credentials(self, creds):
     """Stores the API credentials to a token file."""
     try:
-        with open(TOKEN_JSON_PATH, "w") as token_file:
-            token_file.write(creds.to_json())
+        tmp_path = f"{TOKEN_JSON_PATH}.tmp"
+        with open(tmp_path, "w") as token_file:
+            token_file.write(creds.to_json())
+        os.replace(tmp_path, TOKEN_JSON_PATH)
         self.logger.info(f"Credentials stored in {TOKEN_JSON_PATH}")
     except OSError as e:
         self.logger.error(f"Error storing credentials to {TOKEN_JSON_PATH}: {e}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/python_nlp/gmail_integration.py` around lines 289 - 290, The
current immediate write in _store_credentials can leave token.json partially
written if interrupted; change _store_credentials to write to a temporary file
in the same directory (use a unique temp name), flush and fsync the temp file to
disk, then atomically replace the target (token.json) with os.replace (or
equivalent) and set appropriate file permissions; apply the same pattern to the
analogous save routine in smart_retrieval.py to avoid corruption during in-path
refreshes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/backend/python_nlp/gmail_integration.py`:
- Around line 291-293: The retry block currently catches (RefreshError,
HttpError) but Credentials.refresh(Request()) raises
google.auth.exceptions.TransportError on transient network failures; update the
except clause in the retry loop that wraps Credentials.refresh(Request()) to
catch TransportError (e.g., (RefreshError, TransportError)) instead of HttpError
(or add TransportError alongside HttpError), and ensure the TransportError
symbol is imported so the delay/retry logic using attempt, max_retries and
base_delay will properly handle transient transport errors.

---

Duplicate comments:
In @.github/workflows/dependabot-auto-merge.yml:
- Around line 101-107: The post-merge check treats AUTO_MERGE being null as a
failure even when the PR was actually merged; update the verification after the
gh pr merge --auto --merge $PR_NUM call to also query the PR merged state (e.g.,
run gh pr view $PR_NUM --json merged -q '.merged' or check
.state/.mergeStateStatus) and consider the job successful if AUTO_MERGE is set
OR the PR is merged (merged == true or state == "MERGED"); change the
conditional that exits with error to allow the merged case and only fail when
AUTO_MERGE is null/empty AND the PR is not merged.

---

Nitpick comments:
In `@src/backend/python_nlp/gmail_integration.py`:
- Around line 289-290: The current immediate write in _store_credentials can
leave token.json partially written if interrupted; change _store_credentials to
write to a temporary file in the same directory (use a unique temp name), flush
and fsync the temp file to disk, then atomically replace the target (token.json)
with os.replace (or equivalent) and set appropriate file permissions; apply the
same pattern to the analogous save routine in smart_retrieval.py to avoid
corruption during in-path refreshes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 82594ccb-2d33-4121-aa06-69ccd15979c0

📥 Commits

Reviewing files that changed from the base of the PR and between 9d6d2d3 and 4e6f610.

📒 Files selected for processing (2)
  • .github/workflows/dependabot-auto-merge.yml
  • src/backend/python_nlp/gmail_integration.py

Comment on lines +291 to +293
except (RefreshError, HttpError) as e:
if attempt < max_retries - 1:
delay = base_delay * (2 ** attempt)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

For google.oauth2.credentials.Credentials.refresh(Request()), what exception types are documented for transient network/transport failures, and is googleapiclient.errors.HttpError expected from this call?

💡 Result:

For google.oauth2.credentials.Credentials.refresh(Request), the documented exception for transient network/transport failures is google.auth.exceptions.TransportError. This is raised by the underlying google.auth.transport.Request implementation (e.g., google.auth.transport.requests.Request) when HTTP request exceptions occur, such as connection errors or timeouts during the token refresh POST to the OAuth2 endpoint. Other documented exceptions from refresh include google.auth.exceptions.RefreshError for cases like missing refresh fields, invalid refresh_handler responses, universe domain mismatches, or OAuth server errors (e.g., invalid_grant). googleapiclient.errors.HttpError is not expected from this call. HttpError is raised by google-api-python-client for API endpoint HTTP errors (4xx/5xx), not by the low-level google-auth library's Credentials.refresh, which uses TransportError for transport issues and RefreshError for token endpoint problems.

Citations:


🏁 Script executed:

head -20 src/backend/python_nlp/gmail_integration.py

Repository: MasumRab/EmailIntelligence

Length of output: 619


🏁 Script executed:

wc -l src/backend/python_nlp/gmail_integration.py

Repository: MasumRab/EmailIntelligence

Length of output: 117


🏁 Script executed:

sed -n '280,305p' src/backend/python_nlp/gmail_integration.py

Repository: MasumRab/EmailIntelligence

Length of output: 1462


🏁 Script executed:

sed -n '1,50p' src/backend/python_nlp/gmail_integration.py

Repository: MasumRab/EmailIntelligence

Length of output: 1630


🏁 Script executed:

grep -n "TransportError" src/backend/python_nlp/gmail_integration.py

Repository: MasumRab/EmailIntelligence

Length of output: 52


Retry logic catches the wrong exception type for token refresh failures.

Credentials.refresh(Request()) raises google.auth.exceptions.TransportError for transient network/transport failures, not googleapiclient.errors.HttpError. Catching HttpError here provides no coverage for actual transient failures and will cause legitimate network errors to fail immediately instead of being retried.

Proposed fix
-from google.auth.exceptions import RefreshError
+from google.auth.exceptions import RefreshError, TransportError
...
-                        except (RefreshError, HttpError) as e:
+                        except (RefreshError, TransportError) as e:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/backend/python_nlp/gmail_integration.py` around lines 291 - 293, The
retry block currently catches (RefreshError, HttpError) but
Credentials.refresh(Request()) raises google.auth.exceptions.TransportError on
transient network failures; update the except clause in the retry loop that
wraps Credentials.refresh(Request()) to catch TransportError (e.g.,
(RefreshError, TransportError)) instead of HttpError (or add TransportError
alongside HttpError), and ensure the TransportError symbol is imported so the
delay/retry logic using attempt, max_retries and base_delay will properly handle
transient transport errors.

@mergify
Copy link
Copy Markdown

mergify bot commented Mar 26, 2026

👋 @MasumRab your PR is conflicting and needs to be updated to be merged.

@mergify mergify bot added the conflict label Mar 26, 2026
@cto-new cto-new bot merged commit 152d711 into main Mar 27, 2026
38 of 46 checks passed
@mergify mergify bot removed the conflict label Mar 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants