
feat(configuration-issues): Making processing_errors queryable #109884

Merged
Abdkhan14 merged 7 commits into master from abdk/querying-processing-errors on Mar 6, 2026

Conversation


@Abdkhan14 Abdkhan14 commented Mar 4, 2026

Summary

Registers processing_errors as a queryable EAP dataset so the events and events-stats APIs can query processing error data (e.g. for issue details charts). Follows the established patterns used by the other queryable datasets.

  • New search/eap/processing_errors/ module with column, attribute, and aggregate function definitions
  • New ProcessingErrors RPC dataset class in snuba/processing_errors_rpc.py supporting table, timeseries, and top-events queries
  • Registers the dataset in types.py, constants.py, utils.py and enables it in the organization_events and organization_events_stats endpoints (including top-events support)
  • Exposes attribute mappings via the attribute mappings API
  • Adds ProcessingErrorTestCase mixin with helpers to create and store processing error items
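The wiring in the bullets above follows a common registry pattern: define a dataset class exposing query entry points, then record it under its queryable name so the events endpoints can dispatch to it. A minimal, self-contained sketch under assumed names — `DATASET_REGISTRY`, `register_dataset`, and this `ProcessingErrors` stub are illustrative, not Sentry's actual internals:

```python
# Hypothetical sketch of the dataset-registration pattern described above.
# DATASET_REGISTRY and register_dataset are illustrative names, not the
# identifiers Sentry actually uses in types.py / constants.py.

DATASET_REGISTRY: dict[str, type] = {}

def register_dataset(name: str):
    """Class decorator that records a dataset under its queryable name."""
    def wrap(cls):
        DATASET_REGISTRY[name] = cls
        return cls
    return wrap

@register_dataset("processing_errors")
class ProcessingErrors:
    """Per the PR description: supports table, timeseries, and top-events queries."""

    @classmethod
    def run_table_query(cls, query: str, additional_queries=None):
        # Forward additional_queries so caller-provided filters are kept.
        return {"query": query, "additional_queries": additional_queries or []}

result = ProcessingErrors.run_table_query("error_type:invalid", ["env:prod"])
```

Once registered, endpoints can look the dataset up by its string name rather than special-casing each item type.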

Test Plan

  • tests/sentry/search/eap/test_processing_errors.py — validates query parsing (filters, negation, in-filter, AND/OR, empty)
  • tests/snuba/api/endpoints/test_organization_events_stats_processing_errors.py — validates count, zerofill, top events grouped by error_type, filtering, and empty top events

@github-actions github-actions bot added the Scope: Backend label (automatically applied to PRs that change backend components) on Mar 4, 2026

github-actions bot commented Mar 4, 2026

Backend Test Failures

Failures on d51fdb9 in this run:

tests/sentry/api/endpoints/test_organization_attribute_mappings.py::OrganizationAttributeMappingsEndpointTest::test_get_all_mappings
tests/sentry/api/endpoints/test_organization_attribute_mappings.py:34: in test_get_all_mappings
    assert types_present == expected_types
E   AssertionError: assert {'logs', 'occ...metrics', ...} == {'logs', 'occ...time_results'}
E     
E     Extra items in the left set:
E     'processing_errors'
E     
E     Full diff:
E       {
E           'logs',
E           'occurrences',
E     +     'processing_errors',
E           'profiles',
E           'spans',
E           'tracemetrics',
E           'uptime_results',
E       }

@Abdkhan14 Abdkhan14 marked this pull request as ready for review March 4, 2026 19:32
@Abdkhan14 Abdkhan14 requested review from a team as code owners March 4, 2026 19:32
@Abdkhan14 Abdkhan14 requested a review from wedamija March 4, 2026 19:33
@Abdkhan14 Abdkhan14 (Contributor, Author) commented:

@wedamija I've added a TODO for the received attribute from #109851. Will follow up once your PR is merged.

@cursor cursor bot (Contributor) left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Parameter additional_queries silently dropped in table query
    • ProcessingErrors.run_table_query now forwards additional_queries into rpc_dataset_common.TableQuery, so caller-provided additional filters are no longer dropped.

Or push these changes by commenting: @cursor push b11c4835be

Preview (b11c4835be):
diff --git a/src/sentry/snuba/processing_errors_rpc.py b/src/sentry/snuba/processing_errors_rpc.py
--- a/src/sentry/snuba/processing_errors_rpc.py
+++ b/src/sentry/snuba/processing_errors_rpc.py
@@ -46,6 +46,7 @@
                 sampling_mode=sampling_mode,
                 resolver=search_resolver or cls.get_resolver(params, config),
                 page_token=page_token,
+                additional_queries=additional_queries,
             ),
             params.debug,
         )

@Abdkhan14 Abdkhan14 requested a review from evanpurkhiser March 5, 2026 19:40
@wedamija wedamija (Member) left a comment

The received field PR has merged, although I don't know if it even needs to be queryable. Doesn't hurt to add it, though.

LGTM, just a couple of nits.

Comment on lines +4247 to +4252
from datetime import datetime, timedelta, timezone
from uuid import uuid4

from google.protobuf.timestamp_pb2 import Timestamp
from sentry_protos.snuba.v1.request_common_pb2 import TraceItemType
from sentry_protos.snuba.v1.trace_item_pb2 import TraceItem
Member:
Let's import up the top

Contributor Author:
This is done

Comment on lines +4303 to +4316
def store_processing_errors(self, processing_errors):
    """Store processing errors in the EAP dataset."""
    import requests
    from django.conf import settings

    files = {
        f"processing_error_{i}": item.SerializeToString()
        for i, item in enumerate(processing_errors)
    }
    response = requests.post(
        settings.SENTRY_SNUBA + EAP_ITEMS_INSERT_ENDPOINT,
        files=files,
    )
    assert response.status_code == 200
Member:
I added a method for storing these in https://github.com/getsentry/sentry/pull/109851/changes#diff-044678ee783c164a8acca43b4fa4a97e7402d231129865acf6a253e1670e922f; it would be good to see if it works for your use case so that we don't duplicate work.

Contributor Author:
@wedamija store_processing_errors is replaceable by produce_and_store_eap_items; however, the latter tests producer correctness and requires passing producer-specific params that seem redundant for API tests that only need the data to exist to test querying behaviour. It also requires more boilerplate for forming the event_data and errors schema to be usable alongside produce_and_store_eap_items. This is consistent with the rest of the test classes in the file, like uptime, profile functions, and occurrences.

Not too strong of an opinion though, lmk if you want me to just replace it haha

Member:
You can just use store_eap_items, though. I'm going to factor out all the store_profile_functions, store_trace_metrics, etc. in some other PR. It's unnecessary to create new functions for each item type; the file name passed isn't used. I think the AI is just replicating a pattern.

Contributor Author:
You're right actually, getting rid of store_processing_errors

Contributor Author:
@wedamija this is done ^

@cursor cursor bot (Contributor) left a comment
Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

assert response.status_code == 200, response.content
assert [attrs for time, attrs in response.data["data"]] == [
    [{"count": count}] for count in event_counts
]
test_count expects 6 time buckets, but 7 exist

Medium Severity

The test_count test asserts against 6 time buckets (event_counts = [6, 0, 6, 3, 0, 3]), but test_zerofill with the identical time range and interval expects 7 buckets ([[{"count": 0}]] * 7). The test_top_events_group_by_error_type also implicitly acknowledges 7 buckets by slicing [0:6]. The event_counts list in test_count is missing a trailing zero-count entry for the 7th bucket, so the assertion will fail when the response contains 7 data points.
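The off-by-one comes from inclusive boundary buckets: a zerofilled series over a range spanning N full intervals yields N + 1 data points. A small sketch of that arithmetic, assuming inclusive-boundary zerofill behaviour (this is not Sentry's actual zerofill code):

```python
from datetime import datetime, timedelta

def zerofill_bucket_count(start: datetime, end: datetime, interval: timedelta) -> int:
    # Assumed behaviour: buckets at both the start and end boundaries are
    # included, so a range spanning N full intervals produces N + 1 points.
    return (end - start) // interval + 1

# A 6-hour window at a 1-hour interval yields 7 buckets, matching
# test_zerofill's expectation of [[{"count": 0}]] * 7 — which is why an
# event_counts list with only 6 entries fails against the same range.
start = datetime(2026, 3, 4, 0, 0)
end = datetime(2026, 3, 4, 6, 0)
n = zerofill_bucket_count(start, end, timedelta(hours=1))
```

Under that assumption, appending a trailing 0 to event_counts brings test_count in line with the other two tests.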

Additional Locations (1)


@wedamija wedamija (Member) left a comment
Thanks, LGTM!

@Abdkhan14 Abdkhan14 merged commit d68c19c into master Mar 6, 2026
77 checks passed
@Abdkhan14 Abdkhan14 deleted the abdk/querying-processing-errors branch March 6, 2026 18:54
