feat(context engine): New task to generate project summaries for context engine #108760

Merged
Mihir-Mavalankar merged 14 commits into master from invoke-proj-index-creation
Feb 23, 2026

Conversation

@Mihir-Mavalankar
Contributor

PR details

  • Add the index_org_project_knowledge instrumented task, which assembles project metadata (error/transaction counts, top transactions, top span ops, instrumentation flags) and sends it to the Seer endpoint POST /v1/automation/explorer/index/org-project-knowledge to generate LLM summaries and embeddings.
  • Extract helper functions into src/sentry/seer/explorer/context_engine_utils.py. The outcomes dataset query batches all org projects into a single Snuba call, and the EAP queries for transactions and span ops likewise batch all projects into a single request.
  • Filter to high-volume projects only (≥1000 total events in the last 7 days) before calling Seer, to avoid indexing low-signal projects.
  • Add test coverage for both the task and all helper functions.
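The high-volume filter in the third bullet can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function names and the `HIGH_VOLUME_THRESHOLD` constant are hypothetical, and the row shape (`project_id`, `category`, `total`) is assumed from the columns selected in the outcomes query under review.

```python
from collections import defaultdict

# Hypothetical constant name; the PR description only states the value
# (>= 1000 total events in the last 7 days).
HIGH_VOLUME_THRESHOLD = 1000


def total_events_by_project(rows):
    """Sum per-category totals into a single event count per project.

    `rows` is assumed to look like the outcomes query result:
    one dict per (project_id, category) with a numeric `total`.
    """
    totals = defaultdict(int)
    for row in rows:
        totals[row["project_id"]] += row["total"]
    return dict(totals)


def high_volume_project_ids(rows, threshold=HIGH_VOLUME_THRESHOLD):
    """Return the sorted IDs of projects whose total meets the threshold."""
    totals = total_events_by_project(rows)
    return sorted(pid for pid, count in totals.items() if count >= threshold)
```

Only the projects returned by `high_volume_project_ids` would then be included in the payload sent to Seer.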

@Mihir-Mavalankar self-assigned this Feb 20, 2026
@Mihir-Mavalankar requested a review from a team as a code owner February 20, 2026 21:51
@github-actions bot added the "Scope: Backend" label (automatically applied to PRs that change backend components) Feb 20, 2026
Comment on lines +166 to +187
    match=Entity("outcomes"),
    select=[
        Column("project_id"),
        Column("category"),
        Function("sum", [Column("quantity")], "total"),
    ],
    where=[
        Condition(Column("timestamp"), Op.GTE, start),
        Condition(Column("timestamp"), Op.LT, end),
        Condition(Column("org_id"), Op.EQ, org_id),
        Condition(Column("project_id"), Op.IN, project_ids),
        Condition(Column("outcome"), Op.EQ, Outcome.ACCEPTED),
        Condition(
            Column("category"),
            Op.IN,
            [*DataCategory.error_categories(), DataCategory.TRANSACTION],
        ),
    ],
    groupby=[Column("project_id"), Column("category")],
    granularity=Granularity(3600 * 24),
    limit=Limit(10000),
)
Member

I think there's an existing function that does this which we can reuse? project_event_counts_for_organization maybe?

Contributor Author

That function requires an OrganizationReportContext object. It also handles replays.
I think it's different enough that reusing it would take quite a bit of refactoring, and we would then have to worry about how that affects the function's other callers.

Member

@shruthilayaj shruthilayaj Feb 23, 2026

Aren't we trying to get totals? We can set the granularity equal to the total time range so you don't have to add the buckets up manually below. It looks like the range is a week? This comment is specifically about line 249.

Contributor Author

Changing it to the full time range.
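A rough sketch of the arithmetic behind this suggestion, assuming a 7-day window: with one-day buckets the caller may get several rows per project and category and has to add them up, whereas a granularity equal to the window length (604800 seconds for a week) is meant to yield one total per group. The helper names below are hypothetical, and whether the outcomes dataset accepts this granularity value is not verified here.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def full_range_granularity_seconds(start, end):
    """Length of the query window in seconds, usable as a single bucket size."""
    return int((end - start).total_seconds())


def sum_daily_buckets(rows):
    """Manual aggregation needed when the result has one row per day.

    `rows` is an assumed shape: dicts with project_id, category, and total.
    """
    totals = defaultdict(int)
    for row in rows:
        totals[(row["project_id"], row["category"])] += row["total"]
    return dict(totals)


# Example window: the last 7 days ending at a fixed, illustrative timestamp.
end = datetime(2026, 2, 23, tzinfo=timezone.utc)
start = end - timedelta(days=7)
week_in_seconds = full_range_granularity_seconds(start, end)
```

With a single bucket covering the whole window, the manual summing step in `sum_daily_buckets` becomes unnecessary.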

Contributor

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

@Mihir-Mavalankar merged commit 0c73c75 into master Feb 23, 2026
100 of 101 checks passed
@Mihir-Mavalankar deleted the invoke-proj-index-creation branch February 23, 2026 21:58
mchen-sentry pushed a commit that referenced this pull request Feb 24, 2026
…ext engine (#108760)
wedamija pushed a commit that referenced this pull request Feb 24, 2026
…ext engine (#108760)
@github-actions bot locked and limited conversation to collaborators Mar 11, 2026
Labels

Scope: Backend Automatically applied to PRs that change backend components

3 participants