feat(preprod): Track distribution state on PreprodArtifact#109062
feat(preprod): Track distribution state on PreprodArtifact#109062runningcode merged 9 commits intomasterfrom
Conversation
|
This PR has a migration; here is the generated SQL for for --
-- Add field distribution_state to preprodartifact
--
ALTER TABLE "sentry_preprodartifact" ADD COLUMN "distribution_state" integer NULL CHECK ("distribution_state" >= 0);
--
-- Add field distribution_skip_reason to preprodartifact
--
ALTER TABLE "sentry_preprodartifact" ADD COLUMN "distribution_skip_reason" varchar(32) NULL; |
src/sentry/preprod/api/endpoints/organization_preprod_list_builds.py
Outdated
Show resolved
Hide resolved
src/sentry/preprod/api/endpoints/project_preprod_artifact_update.py
Outdated
Show resolved
Hide resolved
src/sentry/preprod/api/endpoints/project_preprod_artifact_update.py
Outdated
Show resolved
Hide resolved
0986852 to
ca28903
Compare
…842) (#109075) ## Summary - Add `installable_app_error_code` and `installable_app_error_message` columns to `PreprodArtifact` - Add `InstallableAppErrorCode` enum (UNKNOWN, NO_QUOTA, SKIPPED) following the existing `ErrorCode` pattern on `PreprodArtifact` - Migration: `0027_add_distribution_state_fields` Uses an error-code model instead of a state field so that `installable_app_file_id` implicitly encodes success, avoiding ambiguous state combinations: - `file_id` set + no error → success - no `file_id` + error code set → skipped/failed - no `file_id` + no error → pending / not yet determined Split out from #109062 to land the schema change independently. EME-842
Record PENDING/NOT_RAN/COMPLETED in the update endpoint instead of discarding the skip reason. Set COMPLETED when installable file is assembled. Expose state/skip_reason in DistributionInfo API response. Exclude NOT_RAN builds from list-builds and builds endpoints. Only transition to COMPLETED from PENDING and guard against retry overwrites.
ca28903 to
e6abc19
Compare
src/sentry/preprod/api/endpoints/project_preprod_artifact_update.py
Outdated
Show resolved
Hide resolved
…(EME-842) Replace the distribution state machine (PENDING/COMPLETED/NOT_RAN) with error-only tracking via InstallableAppErrorCode. Success is now implicit (installable_app_file_id is set), and we only record why distribution was skipped (NO_QUOTA, SKIPPED, PROCESSING_ERROR).
src/sentry/preprod/api/models/project_preprod_build_details_models.py
Outdated
Show resolved
Hide resolved
src/sentry/preprod/api/models/project_preprod_build_details_models.py
Outdated
Show resolved
Hide resolved
…842) (#109075) ## Summary - Add `installable_app_error_code` and `installable_app_error_message` columns to `PreprodArtifact` - Add `InstallableAppErrorCode` enum (UNKNOWN, NO_QUOTA, SKIPPED) following the existing `ErrorCode` pattern on `PreprodArtifact` - Migration: `0027_add_distribution_state_fields` Uses an error-code model instead of a state field so that `installable_app_file_id` implicitly encodes success, avoiding ambiguous state combinations: - `file_id` set + no error → success - no `file_id` + error code set → skipped/failed - no `file_id` + no error → pending / not yet determined Split out from #109062 to land the schema change independently. EME-842
… test (EME-842) Clear installable_app_error_code and installable_app_error_message in reset_artifact_data so reruns can re-evaluate distribution eligibility. Move error fields test from the builds list endpoint (which excludes error-code artifacts) to the build details endpoint where the fields are actually returned.
src/sentry/preprod/api/endpoints/project_preprod_artifact_update.py
Outdated
Show resolved
Hide resolved
src/sentry/preprod/api/endpoints/project_preprod_artifact_update.py
Outdated
Show resolved
Hide resolved
…842) Restructure the distribution guard so explicit error codes from launchpad always overwrite the current state (enabling reruns to update a previous decision), while the implicit should_run_distribution check only runs when no decision has been made yet. Revert the reset_artifact_data change since launchpad can now overwrite state directly on rerun callbacks.
| "installable_app_error_message", | ||
| "date_updated", | ||
| ] | ||
| ) |
There was a problem hiding this comment.
can_run_distro, distro_skip_reason = and below seems correct.
| requested_features.append(PreprodFeature.BUILD_DISTRIBUTION) | ||
| # Always accept explicit distribution state from launchpad so reruns | ||
| # can overwrite a previous decision. | ||
| if "installable_app_error_code" in data: |
There was a problem hiding this comment.
I don't think this is the correct endpoint to handle this.
The size one for example is handled here:
https://github.com/getsentry/sentry/blob/5205f1bcbc54415de4a8c371bea9cd754c3321e8/src/sentry/preprod/api/endpoints/project_preprod_size.py
project_preprod_artifact_update() is for the initial processing - not for uploading size/distro date or for updating the errors for the.
The flow is like:
monolith launchpad
assemble()
task()
kafka -----> initial processing
project_preprod_artifact_update <----- (returns req features)
compute_size
<----- upload size json
compute_distro
<----- upload distro build
for success and:
monolith launchpad
assemble()
task()
kafka -----> initial processing
project_preprod_artifact_update <----- (returns req features)
compute_size
ProjectPreprodSizeEndpoint<----- set size error
compute_distro
<----- ?
for failure.
| ) | ||
| .prefetch_related("preprodartifactsizemetrics_set") | ||
| .filter(project_id__in=project_ids, date_added__gte=cutoff) | ||
| .exclude(installable_app_error_code__isnull=False) |
There was a problem hiding this comment.
Same here - I'm not sure we want to hide these
chromy
left a comment
There was a problem hiding this comment.
Left some comments, maybe we could start just with NOT_RAN and leave the other two parts:
- updating list builds
- adding an endpoint for updating the distro error_code / error_message
to follow up PRs? I think keeping the PRs smaller might help.
|
Thanks for the diagrams! I will just do the |
src/sentry/preprod/api/endpoints/project_preprod_artifact_update.py
Outdated
Show resolved
Hide resolved
d08ef5c to
7e21485
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable autofix in the Cursor dashboard.
| "installable_app_error_message", | ||
| "date_updated", | ||
| ] | ||
| ) |
There was a problem hiding this comment.
Retry guard incomplete, may set erroneous error code
Medium Severity
The guard if head_artifact.installable_app_error_code is None: only prevents re-evaluation when distribution was previously skipped (error code set). When distribution was previously allowed, installable_app_error_code remains None, so on a launchpad retry the check re-runs. If conditions changed between calls (e.g., quota consumed by the first distribution request), an error code like NO_QUOTA gets written to an artifact that already had distribution initiated — creating an inconsistent state where the build details API shows both a successful distribution (is_installable=True) and an error code simultaneously.
7e21485 to
46b4112
Compare
| if distro_skip_reason == "quota": | ||
| error_code = PreprodArtifact.InstallableAppErrorCode.NO_QUOTA | ||
| error_message = "Distribution quota exceeded" | ||
| elif distro_skip_reason == "disabled": | ||
| error_code = PreprodArtifact.InstallableAppErrorCode.SKIPPED | ||
| error_message = "Distribution disabled for this project" | ||
| else: | ||
| error_code = PreprodArtifact.InstallableAppErrorCode.SKIPPED | ||
| error_message = "Distribution filtered out by project settings" | ||
|
|
There was a problem hiding this comment.
Bug: When an artifact is rerun, the reset_artifact_data function does not clear distribution-specific error fields, leaving stale error data after a successful retry.
Severity: MEDIUM
Suggested Fix
Update the reset_artifact_data function in preprod_artifact_rerun_analysis.py to also set installable_app_error_code and installable_app_error_message to None. This will ensure that stale error information is cleared before a rerun attempt.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: src/sentry/preprod/api/endpoints/project_preprod_artifact_update.py#L439-L448
Potential issue: When an artifact distribution fails and is subsequently rerun, the
function `reset_artifact_data` is called. This function clears general artifact error
fields like `error_code` and `error_message`, but it fails to clear
distribution-specific error fields, namely `installable_app_error_code` and
`installable_app_error_message`. If the rerun is successful, the distribution task sets
`installable_app_file_id`, but the stale error fields from the previous failed attempt
persist. This results in an inconsistent state where the artifact appears to have both
succeeded and failed simultaneously.
There was a problem hiding this comment.
Yes, might be good to add this to:
Remove listing endpoint filters and explicit error code handling from the update endpoint. The update endpoint now only evaluates distribution eligibility (should_run_distribution) and records skip reasons. Error reporting from launchpad and listing filters will be added in follow-up PRs via a dedicated endpoint.
46b4112 to
89f24ad
Compare
Add a dedicated endpoint (`PUT
.../files/preprodartifacts/{id}/distribution/`) for launchpad to report
distribution processing errors back to the monolith. This mirrors the
existing `ProjectPreprodSizeEndpoint` pattern.
When launchpad encounters a distribution failure (unsupported artifact
type, invalid code signature, simulator build), it needs a way to set
`installable_app_error_code` and `installable_app_error_message` on the
artifact so the frontend can display the reason. Previously, the only
option was the general `update` endpoint which marks the entire artifact
as failed — but distribution errors shouldn't affect the artifact's
overall state.
Follow-up to #109062. The launchpad side that calls this endpoint is in
getsentry/launchpad#567.
Refs EME-842, EME-422
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>
Add a dedicated endpoint (`PUT
.../files/preprodartifacts/{id}/distribution/`) for launchpad to report
distribution processing errors back to the monolith. This mirrors the
existing `ProjectPreprodSizeEndpoint` pattern.
When launchpad encounters a distribution failure (unsupported artifact
type, invalid code signature, simulator build), it needs a way to set
`installable_app_error_code` and `installable_app_error_message` on the
artifact so the frontend can display the reason. Previously, the only
option was the general `update` endpoint which marks the entire artifact
as failed — but distribution errors shouldn't affect the artifact's
overall state.
Follow-up to #109062. The launchpad side that calls this endpoint is in
getsentry/launchpad#567.
Refs EME-842, EME-422
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>



Mirror the existing size analysis
NOT_RANpattern for build distribution. When the update endpoint evaluates distribution eligibility, it records why distribution was skipped (quota, feature disabled, filtered) so the frontend can display the reason in build details.Changes:
installable_app_error_codeandinstallable_app_error_messagefields toPreprodArtifactInstallableAppErrorCodeenum (UNKNOWN,NO_QUOTA,SKIPPED,PROCESSING_ERROR)should_run_distributionand records the skip reason (mirroring theshould_run_sizepattern)DistributionInfoin the build details API exposeserror_codeanderror_messageScoped out (follow-up PRs):
ProjectPreprodSizeEndpoint)Refs EME-842