feat(preprod): Add Size Analysis detector#108208
Conversation
|
This PR has a migration; here is the generated SQL for for --
-- Create model SizeAnalysisSubscription
--
CREATE TABLE "sentry_sizeanalysissubscription" ("id" bigint NOT NULL PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY, "date_updated" timestamp with time zone NOT NULL, "date_added" timestamp with time zone NOT NULL, "project_id" bigint NOT NULL);
ALTER TABLE "sentry_sizeanalysissubscription" ADD CONSTRAINT "sentry_sizeanalysiss_project_id_41e3355e_fk_sentry_pr" FOREIGN KEY ("project_id") REFERENCES "sentry_project" ("id") DEFERRABLE INITIALLY DEFERRED NOT VALID;
ALTER TABLE "sentry_sizeanalysissubscription" VALIDATE CONSTRAINT "sentry_sizeanalysiss_project_id_41e3355e_fk_sentry_pr";
CREATE INDEX CONCURRENTLY "sentry_sizeanalysissubscription_project_id_41e3355e" ON "sentry_sizeanalysissubscription" ("project_id"); |
a50e1e0 to
5317fb3
Compare
5317fb3 to
affc2c2
Compare
affc2c2 to
258f072
Compare
saponifi3d
left a comment
There was a problem hiding this comment.
the workflow engine code lgtm -- added a few notes for myself on abstractions that should probably live on the base class.
| class PreprodSizeAnalysisDetectorHandler( | ||
| BaseDetectorHandler[SizeAnalysisValue, SizeAnalysisEvaluation] | ||
| ): | ||
| def evaluate_impl(self, data_packet: SizeAnalysisDataPacket) -> GroupedDetectorEvaluationResult: |
There was a problem hiding this comment.
🙏 -- this is more or less what the eventual base abstraction will look like too. i'll update this once we start working on the non-Stateful abstraction.
| def _evaluate_conditions( | ||
| self, value: SizeAnalysisEvaluation | ||
| ) -> tuple[ProcessedDataConditionGroup | None, DetectorPriorityLevel | None]: | ||
| if not self.condition_group: | ||
| metrics.incr("workflow_engine.detector.skipping_invalid_condition_group") | ||
| return None, None | ||
|
|
||
| condition_evaluation, _ = process_data_condition_group(self.condition_group, value) | ||
| if not condition_evaluation.logic_result.triggered: | ||
| return None, None | ||
|
|
||
| priorities = [ | ||
| condition_result.result | ||
| for condition_result in condition_evaluation.condition_results | ||
| if isinstance(condition_result.result, DetectorPriorityLevel) | ||
| ] | ||
| if not priorities: | ||
| return None, None | ||
|
|
||
| return condition_evaluation, max(priorities) |
There was a problem hiding this comment.
note for myself -- this feels like another top level abstraction we should pull out as part of the ABC
| SizeAnalysisEvaluation: TypeAlias = int | float | ||
|
|
||
|
|
||
| class PreprodSizeAnalysisDetectorHandler( |
There was a problem hiding this comment.
outside of evaluate_impl and _evaluate_conditions -- was there anything else you felt like should live in the base class? i can take a look at updating everything there.
258f072 to
92722b1
Compare
| event_data = { | ||
| "event_id": uuid4().hex, | ||
| "project_id": self.detector.project_id, | ||
| "platform": "android", |
There was a problem hiding this comment.
Bug: The PreprodSizeAnalysisDetectorHandler hardcodes the platform as "android", which will cause incorrect platform attribution for issues generated from iOS/Apple builds.
Severity: MEDIUM
Suggested Fix
Modify the data pipeline to include platform information in the DataPacket[SizeAnalysisValue] sent to the detector. This could involve adding a platform field to the SizeAnalysisValue TypedDict or looking up the artifact type from an identifier within the packet. The handler should then use this dynamic platform value instead of the hardcoded "android" string.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: src/sentry/preprod/size_analysis/grouptype.py#L158
Potential issue: The `create_occurrence` method in the new
`PreprodSizeAnalysisDetectorHandler` hardcodes the `platform` to "android". This is
incorrect because the platform should be determined dynamically based on the artifact
type, as is done in the existing `diff_to_occurrence` function. The data packet
(`SizeAnalysisValue`) passed to the handler does not currently contain the necessary
artifact type or platform information. If this detector is activated for iOS builds,
which use the `XCARCHIVE` artifact type, any generated issues will be incorrectly
attributed to the Android platform, impacting platform-specific filtering and metrics.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
| return self._extract_head(data_packet) - self._extract_base(data_packet) | ||
| case "relative_diff": | ||
| base = self._extract_base(data_packet) | ||
| return (self._extract_head(data_packet) - base) / base |
There was a problem hiding this comment.
Division by zero when base size is zero
Medium Severity
In extract_value, the relative_diff case computes (head - base) / base. If base is 0 (e.g., a new measurement category that previously had no size), this raises a ZeroDivisionError. The _extract_base method only validates that the value is not None, so a zero value passes validation and reaches the division.


Add PreprodSizeAnalysisGroupType (type_id=11003) with detector handler
and validators that integrate with the workflow engine to emit issues
when build size thresholds are exceeded.
comparisons and produces IssueOccurrences
PRs:
Design doc
This stack of PRs is missing a handful of important features:
but it works end to end.