feat(integrations): Add GitHub repository platform detection #109699
🚨 Warning: This pull request contains Frontend and Backend changes! It's discouraged to make changes to Sentry's Frontend and Backend in a single pull request. The Frontend and Backend are not atomically deployed. If the changes are interdependent, they must be separated into two pull requests and be made forward or backwards compatible, such that the Backend or Frontend can be safely deployed independently. Have questions? Please ask in the
513ed5d to 41d47b7
4b8dc2a to 0d660b0
b15639d to 65cb88a
e78d4fc to 393982c
else:
    # Text-based manifest files: requirements.txt, Gemfile,
    # pyproject.toml, build.gradle, pom.xml, go.mod
    content_lower = content.lower()
Note: This substring matching is loose for text-based manifests. For example, "echo" in the Go dependency map will match anywhere the word appears in go.mod, not just as a module path.
This is addressed in PR 3 of this stack (#109701), which replaces the text-based detection with the composable FrameworkDef system and uses full Go module paths like github.com/labstack/echo.
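To make the false positive concrete, here is a minimal sketch comparing a whole-file substring check with a check against the full module path. The `go.mod` content and both helper names are hypothetical and are not taken from the PR; only the keyword `echo` and the module path `github.com/labstack/echo` come from the comment above.

```python
# Hypothetical go.mod: "echo" appears only inside an unrelated module name.
GO_MOD = """\
module example.com/acme/echo-chamber

go 1.22

require github.com/gin-gonic/gin v1.10.0
"""


def loose_match(content: str, keyword: str) -> bool:
    """Substring check over the whole lowercased file (the loose behavior)."""
    return keyword in content.lower()


def module_path_match(content: str, module_path: str) -> bool:
    """Match only tokens that equal the module path or extend it with '/'."""
    for line in content.splitlines():
        for token in line.split():
            if token == module_path or token.startswith(module_path + "/"):
                return True
    return False


print(loose_match(GO_MOD, "echo"))
print(module_path_match(GO_MOD, "github.com/labstack/echo"))
```

The first call reports a match even though the repository does not depend on Echo, while the path-based check does not.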
detectors = FRAMEWORK_DETECTORS.get(base_platform, [])
detected: list[str] = []

for manifest_file, dependency_map in detectors:
Note: This loop makes a separate GitHub API call per manifest file per language (_get_repo_file_content inside the loop). For a repo with Python + JavaScript, that's up to 4 manifest fetches (requirements.txt, pyproject.toml, Pipfile, package.json).
PR 2 in this stack (#109700) eliminates this by fetching the root directory listing in a single API call, collecting all needed file paths upfront, and batch-fetching content.
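The batching idea can be sketched as a pure planning step: given one root directory listing, decide which manifests are worth fetching before any content request is made. The mapping and the function name below are hypothetical; the actual implementation lives in #109700 and is not shown in this thread.

```python
# Hypothetical manifest table; the file names come from the comment above.
MANIFESTS_BY_PLATFORM = {
    "python": ["requirements.txt", "pyproject.toml", "Pipfile"],
    "javascript": ["package.json"],
}


def plan_manifest_fetches(root_listing: list[str], platforms: list[str]) -> list[str]:
    """Return the manifests that exist in the root listing, without duplicates."""
    present = set(root_listing)
    wanted: list[str] = []
    for platform in platforms:
        for name in MANIFESTS_BY_PLATFORM.get(platform, []):
            if name in present and name not in wanted:
                wanted.append(name)
    return wanted


listing = ["README.md", "package.json", "pyproject.toml"]
print(plan_manifest_fetches(listing, ["python", "javascript"]))
```

With this listing only two of the four candidate manifests would be fetched, since `requirements.txt` and `Pipfile` are absent from the root.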
Register `organizations:integrations-github-platform-detection` flagpole flag to gate the new platform detection endpoint behind a controlled rollout.

This is the base of a 4-PR stack for GitHub platform detection:
- **PR 0 (this):** Feature flag registration
- [PR 1](#109699): Core detection endpoint
- [PR 2](#109700): Composable framework definitions
- [PR 3](#109701): Expand to 97% picker coverage

Co-authored-by: Claude <noreply@anthropic.com>
3e914eb to 4b5fa36
src/sentry/integrations/api/endpoints/organization_repository_platforms.py
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Backend Test Failures
Failures on
Add a platform detection pipeline that maps GitHub repository languages
to Sentry platform IDs for onboarding. Given a repo, the system calls
GitHub's Languages API, maps results to Sentry platforms, and refines
via manifest file inspection (package.json, requirements.txt, etc.)
to detect specific frameworks like Django, Next.js, or Rails.
- Add get_languages() to GitHubBaseClient
- Create platform_detection module with language mapping, framework
detection, and main detect_platforms() orchestrator
- Add GET /api/0/organizations/{org}/repos/{repo_id}/platforms/ endpoint
- 51 tests covering unit logic and API integration
Refs VDY-15
Co-Authored-By: Claude <noreply@anthropic.com>
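The first stage of the pipeline described above can be sketched as follows. GitHub's Languages API returns a mapping of language name to byte count; everything else here (the mapping table, the function name, and the `share` field) is a hypothetical stand-in, since the PR's own mapping and confidence scoring are not shown in this thread.

```python
# Hypothetical subset of a language -> Sentry platform mapping.
LANGUAGE_TO_PLATFORM = {
    "Python": "python",
    "JavaScript": "javascript",
    "Ruby": "ruby",
    "Go": "go",
}


def map_languages(languages: dict[str, int]) -> list[dict]:
    """Map a Languages API payload to platforms, largest byte count first."""
    total = sum(languages.values()) or 1  # avoid dividing by zero on empty repos
    results = []
    ordered = sorted(languages.items(), key=lambda item: item[1], reverse=True)
    for language, byte_count in ordered:
        platform = LANGUAGE_TO_PLATFORM.get(language)
        if platform is None:
            continue  # languages without a platform (e.g. HTML) are skipped
        results.append(
            {
                "platform": platform,
                "language": language,
                "share": round(byte_count / total, 2),
            }
        )
    return results


print(map_languages({"Python": 750, "HTML": 50, "JavaScript": 200}))
```

Framework refinement (for example turning `python` into a Django-specific platform) would then run per base platform on the result of this step.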
…tent
Catch KeyError (missing "content" key), ValueError (invalid base64 via binascii.Error), and UnicodeDecodeError (binary file content) in addition to ApiError. These can occur when the GitHub API returns unexpected response shapes or binary file content.
Co-Authored-By: Claude <noreply@anthropic.com>
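A minimal sketch of the decode step with these exceptions handled, assuming the file payload is a dict whose `content` field holds base64 text. The function name is hypothetical and the `ApiError` branch is omitted because it depends on Sentry's client code.

```python
import base64
import binascii
from typing import Optional


def decode_file_content(payload: dict) -> Optional[str]:
    """Return the decoded text, or None when the payload cannot be decoded."""
    try:
        raw = base64.b64decode(payload["content"])
        return raw.decode("utf-8")
    except KeyError:
        return None  # payload has no "content" key
    except ValueError:
        # binascii.Error (bad base64) and UnicodeDecodeError (binary content)
        # are both ValueError subclasses, so one clause covers them.
        return None


good = {"content": base64.b64encode(b"django==5.0\n").decode("ascii")}
print(decode_file_content(good))
print(decode_file_content({}))
print(decode_file_content({"content": "abc"}))   # incorrect padding
print(decode_file_content({"content": "//4="}))  # bytes 0xFF 0xFE, not UTF-8
```

Listing the three exception types explicitly, as the commit does, documents the failure modes even though `ValueError` alone would already cover two of them.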
Co-Authored-By: Claude <noreply@anthropic.com>
…point base class
Move repository resolution from the platforms endpoint into a reusable base class. Use IntegrationProviderSlug.GITHUB constant instead of hardcoded "github" string for provider comparison.
Co-Authored-By: Claude <noreply@anthropic.com>
The substring check `IntegrationProviderSlug.GITHUB not in repo.provider` incorrectly allowed GitHub Enterprise repos through since "github" is a substring of "integrations:github_enterprise". Switch to exact equality. Also assert full response shapes in endpoint tests.
Co-Authored-By: Claude <noreply@anthropic.com>
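The difference between the two checks can be shown with plain strings. `GITHUB_SLUG` is a stand-in for `IntegrationProviderSlug.GITHUB`, and `"integrations:github"` as the provider value for regular GitHub repos is an assumption made for this sketch; only `"integrations:github_enterprise"` is quoted from the commit message.

```python
GITHUB_SLUG = "github"  # stand-in for IntegrationProviderSlug.GITHUB
GITHUB_PROVIDER = "integrations:github"  # assumed provider string for GitHub repos


def is_github_substring(provider: str) -> bool:
    """Old behavior: passes any provider that merely contains the slug."""
    return GITHUB_SLUG in provider


def is_github_exact(provider: str) -> bool:
    """Fixed behavior: passes only the exact GitHub provider string."""
    return provider == GITHUB_PROVIDER


for provider in ("integrations:github", "integrations:github_enterprise"):
    print(provider, is_github_substring(provider), is_github_exact(provider))
```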
Check organizations:integrations-github-platform-detection before serving platform detection results. Returns 404 when the flag is disabled to hide the endpoint from orgs not in the rollout.
Co-Authored-By: Claude <noreply@anthropic.com>
GitHub contents API returns a JSON array (not dict) when a path resolves to a directory. Subscripting a list with a string key raises TypeError, which was not caught, causing a 500 instead of graceful fallback to None.
Co-Authored-By: Claude <noreply@anthropic.com>
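The failure mode is easy to reproduce without any API call: indexing a list with a string key raises `TypeError`. The sketch below uses a hypothetical helper name and made-up payloads to show the graceful fallback.

```python
from typing import Any, Optional


def extract_encoded_content(response: Any) -> Optional[str]:
    """Return the "content" field of a file payload, or None for other shapes."""
    try:
        return response["content"]
    except (KeyError, TypeError):
        # KeyError: a dict without "content".
        # TypeError: a list (directory listing) indexed with a string key.
        return None


file_payload = {"name": "go.mod", "content": "bW9kdWxlIGV4YW1wbGU="}
directory_payload = [{"name": "src", "type": "dir"}]

print(extract_encoded_content(file_payload))
print(extract_encoded_content(directory_payload))
```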
…etection endpoint
…zationRepositoryEndpoint
45a50d5 to 59ed13c
…k definitions (#109700)

## Summary
- Refactor flat framework detection into composable `FrameworkDef` / `DetectorRule` system
- Add three signal types: `path` (config file existence), `match_content` (regex on file content), `match_package` (dependency lookup in parsed manifests)
- Add `every` (AND) and `some` (OR) rule composition
- Add priority ranking (`sort` field) and supersession (e.g. Next.js supersedes React)
- Refactor file content fetching into a single batch pass to minimize API calls

No behavior change for existing detections, but the architecture now supports easy addition of new frameworks as data-only entries.

**Stack:**
- [PR 1](#109699): Core detection + API endpoint
- **PR 2 (this):** Composable framework definitions refactor
- [PR 3](#109701): Expanded coverage (97% of picker platforms)

---------

Co-authored-by: Claude <noreply@anthropic.com>
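One way such a composable system could look is sketched below. Only the names `FrameworkDef`, `DetectorRule`, `path`, `match_content`, `match_package`, `every`, `some`, and `sort` come from the summary above; the field layout, the `supersedes` field, the `RepoSnapshot` type, the platform IDs, and the assumption that a lower `sort` ranks first are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DetectorRule:
    path: Optional[str] = None           # signal: a file exists at this path
    match_content: Optional[str] = None  # signal: regex on the content of `path`
    match_package: Optional[str] = None  # signal: dependency in parsed manifests


@dataclass(frozen=True)
class FrameworkDef:
    platform: str
    sort: int                 # assumed: lower value ranks first
    every: tuple = ()         # all rules must match (AND)
    some: tuple = ()          # at least one rule must match (OR)
    supersedes: tuple = ()    # platforms hidden when this one matches


@dataclass
class RepoSnapshot:
    files: dict    # path -> text content
    packages: set  # dependency names from parsed manifests


def rule_matches(rule: DetectorRule, repo: RepoSnapshot) -> bool:
    if rule.match_package is not None:
        return rule.match_package in repo.packages
    if rule.path is not None:
        content = repo.files.get(rule.path)
        if content is None:
            return False
        if rule.match_content is not None:
            return re.search(rule.match_content, content) is not None
        return True
    return False


def framework_matches(fw: FrameworkDef, repo: RepoSnapshot) -> bool:
    if not fw.every and not fw.some:
        return False  # a definition without rules never matches
    every_ok = all(rule_matches(rule, repo) for rule in fw.every)
    some_ok = not fw.some or any(rule_matches(rule, repo) for rule in fw.some)
    return every_ok and some_ok


def detect(frameworks: list, repo: RepoSnapshot) -> list:
    matched = [fw for fw in frameworks if framework_matches(fw, repo)]
    superseded = {name for fw in matched for name in fw.supersedes}
    kept = [fw for fw in matched if fw.platform not in superseded]
    return [fw.platform for fw in sorted(kept, key=lambda fw: fw.sort)]


NEXT = FrameworkDef(
    platform="javascript-nextjs",
    sort=10,
    some=(DetectorRule(match_package="next"), DetectorRule(path="next.config.js")),
    supersedes=("javascript-react",),
)
REACT = FrameworkDef(
    platform="javascript-react",
    sort=20,
    every=(DetectorRule(match_package="react"),),
)

print(detect([REACT, NEXT], RepoSnapshot(files={}, packages={"next", "react"})))
print(detect([REACT, NEXT], RepoSnapshot(files={}, packages={"react"})))
```

Because each framework is a plain data entry, adding a new one in this sketch means appending another `FrameworkDef` rather than writing new matching code, which is the property the summary calls "data-only entries".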
…109701)

Expand the composable framework detection system to cover 97/100 (97%) of selectable platforms in the picker, up from the ~36 platforms in PR 2.

**Infrastructure additions:**
- `match_ext` + `match_content` combo rules -- find files by extension, then search content (needed for .csproj inspection where filenames vary)
- `_NON_SELECTABLE_PLATFORMS` filter -- platforms detected internally for ranking (e.g. preventing WordPress from being misidentified as Symfony) but not shown to users because they lack onboarding docs or map to other platforms (perl, php-wordpress, swift)
- Dual `base_platform` entries for Android (java + kotlin) so Kotlin-first projects are correctly detected

**61 new framework definitions:**
- **JavaScript (22):** astro, gatsby, sveltekit, solidstart, solid, ember, tanstackstart-react, react-router, react-native, electron, capacitor, ionic, cordova, node, nestjs, fastify, connect, hapi, awslambda, gcpfunctions, azurefunctions, cloudflare-workers, cloudflare-pages
- **Python (13):** aiohttp, bottle, falcon, pyramid, quart, sanic, tryton, chalice, asgi, wsgi, awslambda, gcpfunctions, rq
- **Go (4):** fasthttp, iris, negroni
- **Java (2):** log4j2, logback
- **Ruby (1):** rack
- **PHP (2):** wordpress (non-selectable), symfony
- **Dart (1):** flutter
- **Swift (1):** apple-macos
- **Native (1):** native-qt
- **.NET (7):** maui, wpf, winforms, xamarin, aspnet, awslambda, gcpfunctions
- **Mobile/Gaming (5):** unity, android, dotnet-aspnetcore, unreal, godot
- **Other (3):** bun, deno, PowerShell (base platform)

**Only 3 platforms remain undetectable:** `go-http` (stdlib net/http, no manifest signal), `minidump` (crash dump format, not a project type), and `python-serverless` (too generic, overlaps with awslambda/gcpfunctions).

**Note on `go-http`:** Plain Go repos now intentionally return `go` as the base platform instead of `go-http`. There is no reliable way to distinguish a net/http project from any other Go project without a framework dependency, so the generic `go` platform is the correct fallback.

**Stack:**
- [PR 1](#109699): Core detection + API endpoint
- [PR 2](#109700): Composable framework definitions refactor
- **PR 3 (this):** Expanded coverage (97% of picker platforms)

---------

Co-authored-by: Claude <noreply@anthropic.com>
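The two infrastructure additions can be illustrated with a small sketch. The function names, the sample `.csproj` content, and the regex are hypothetical; the PR's actual rules are not shown here. The non-selectable platform IDs are the three named in the description above.

```python
import re

# Platform IDs listed as non-selectable in the description above.
NON_SELECTABLE_PLATFORMS = {"perl", "php-wordpress", "swift"}


def match_ext_content(files: dict, ext: str, pattern: str) -> bool:
    """True if any file with the given extension has content matching pattern."""
    regex = re.compile(pattern)
    return any(
        path.endswith(ext) and regex.search(text) is not None
        for path, text in files.items()
    )


def selectable(platforms: list) -> list:
    """Drop platforms that are detected internally but never shown to users."""
    return [p for p in platforms if p not in NON_SELECTABLE_PLATFORMS]


# The project file name varies per repo, so it is located by extension.
files = {
    "README.md": "Desktop client",
    "src/Shop.Desktop/Shop.Desktop.csproj": "<UseWPF>true</UseWPF>",
}

print(match_ext_content(files, ".csproj", r"<UseWPF>\s*true\s*</UseWPF>"))
print(selectable(["php-wordpress", "php-symfony", "php"]))
```

Keeping `php-wordpress` in the internal ranking but filtering it at the end is what lets a WordPress signal outrank a weaker Symfony signal without surfacing a platform that has no onboarding docs.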
Summary
- Add `get_languages()` method to GitHub API client to fetch repository language statistics
- Add `platform_detection` module with language-to-platform mapping, framework detection from manifest files (package.json, requirements.txt, pyproject.toml, Pipfile, Gemfile, composer.json, build.gradle, pom.xml, go.mod), and confidence scoring
- Add `GET /api/0/organizations/{org}/repos/{repo_id}/platforms/` to expose detected platforms

This is the foundation for automatic platform detection from GitHub repositories to streamline onboarding. Part 1 of 3.
Stack:
Test plan