feat(context engine): New task to generate project summaries for context engine#108760
Mihir-Mavalankar merged 14 commits into master
    match=Entity("outcomes"),
    select=[
        Column("project_id"),
        Column("category"),
        Function("sum", [Column("quantity")], "total"),
    ],
    where=[
        Condition(Column("timestamp"), Op.GTE, start),
        Condition(Column("timestamp"), Op.LT, end),
        Condition(Column("org_id"), Op.EQ, org_id),
        Condition(Column("project_id"), Op.IN, project_ids),
        Condition(Column("outcome"), Op.EQ, Outcome.ACCEPTED),
        Condition(
            Column("category"),
            Op.IN,
            [*DataCategory.error_categories(), DataCategory.TRANSACTION],
        ),
    ],
    groupby=[Column("project_id"), Column("category")],
    granularity=Granularity(3600 * 24),
    limit=Limit(10000),
)
I think there's a function that does this we can reuse? project_event_counts_for_organization maybe?
That function requires an OrganizationReportContext object. It also does replays.
I think it's different enough that if we want to reuse it we will have to refactor quite a bit. That would mean worrying about how it affects other callers of that function.
Aren't we trying to get totals? We can just set the granularity equal to the total time range so you don't have to manually add them up below. Looks like we're doing a week? This comment is specifically about line 249.
Changing it to the full time range.
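For reference, the manual summing mentioned in the comment above could look roughly like the sketch below. It assumes the query returns rows as dicts keyed by the select aliases (`project_id`, `category`, `total`) and that more than one row can come back per (project_id, category) pair. The helper name `sum_totals` and the sample rows are hypothetical and not taken from the PR.

```python
from collections import defaultdict


def sum_totals(rows):
    """Collapse result rows into one total per (project_id, category).

    Hypothetical helper: assumes each row is a dict with the keys
    "project_id", "category" and "total".
    """
    totals = defaultdict(int)
    for row in rows:
        totals[(row["project_id"], row["category"])] += row["total"]
    return dict(totals)


# Made-up sample rows: two rows for project 1 / category "error".
rows = [
    {"project_id": 1, "category": "error", "total": 40},
    {"project_id": 1, "category": "error", "total": 60},
    {"project_id": 1, "category": "transaction", "total": 500},
    {"project_id": 2, "category": "error", "total": 7},
]
totals = sum_totals(rows)
```

If the query already returns a single row per pair, this step is a no-op and can be dropped, which is the simplification the reviewer is asking for.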
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
…ext engine (#108760)

## PR details
+ Add index_org_project_knowledge instrumented task that assembles project metadata (error/transaction counts, top transactions, top span ops, instrumentation flags) and POSTs it to the Seer POST /v1/automation/explorer/index/org-project-knowledge endpoint to generate LLM summaries and embeddings
+ Extract helper functions into src/sentry/seer/explorer/context_engine_utils.py; the outcomes dataset query batches all org projects in a single Snuba call, and the EAP queries for transactions and span ops also batch all projects in a single request
+ Filter to high-volume projects only (≥1000 total events in the last 7 days) before calling Seer to avoid indexing low-signal projects
+ Add test coverage for both the task and all helper functions
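The high-volume filter described above can be sketched as follows. This is only an illustration, not the PR's code: the function name `filter_high_volume`, the input shape (a mapping of project ID to per-category counts), and the reading of "total events" as the sum over all categories are assumptions. Only the 1000-event threshold comes from the description.

```python
# Threshold from the PR description: >= 1000 total events in the last 7 days.
HIGH_VOLUME_THRESHOLD = 1000


def filter_high_volume(counts_by_project, threshold=HIGH_VOLUME_THRESHOLD):
    """Return the project IDs whose summed event counts meet the threshold.

    Hypothetical helper: counts_by_project maps project_id to a dict of
    category name -> count; "total events" is assumed to be their sum.
    """
    return [
        project_id
        for project_id, counts in counts_by_project.items()
        if sum(counts.values()) >= threshold
    ]


# Made-up sample data for three projects.
counts = {
    1: {"error": 900, "transaction": 200},  # 1100 total, kept
    2: {"error": 10, "transaction": 5},     # 15 total, dropped
    3: {"error": 1000},                     # exactly at the threshold, kept
}
high_volume_ids = filter_high_volume(counts)
```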