⚡ Bolt: Implement query result caching for email search#421
Conversation
- Add `clear_query_cache` to `EnhancedCachingManager`
- Cache results in `DatabaseManager.search_emails_with_limit`
- Invalidate cache on email create/update/delete
- Add integration test for search caching
- Fix regression in `test_database_search_perf.py`

Co-authored-by: MasumRab <8943353+MasumRab@users.noreply.github.com>
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me. New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task.
Reviewer's Guide

Implements query result caching for email searches by adding a query cache layer around `DatabaseManager.search_emails_with_limit`, ensuring cache invalidation on email mutations, and extending the caching manager API plus tests to cover the new behavior and performance characteristics.

Sequence diagram for cached `search_emails_with_limit` query flow

```mermaid
sequenceDiagram
    participant Client
    participant DatabaseManager
    participant EnhancedCachingManager
    participant EmailStore
    Client->>DatabaseManager: search_emails_with_limit(search_term, limit)
    DatabaseManager->>DatabaseManager: build cache_key = search:lower(search_term):limit
    DatabaseManager->>EnhancedCachingManager: get_query_result(cache_key)
    alt cache hit
        EnhancedCachingManager-->>DatabaseManager: cached_result
        DatabaseManager-->>Client: cached_result
    else cache miss
        EnhancedCachingManager-->>DatabaseManager: None
        DatabaseManager->>EmailStore: get_emails(limit, offset) and apply search filtering
        EmailStore-->>DatabaseManager: filtered_emails
        DatabaseManager->>DatabaseManager: results = _add_category_details(filtered_emails)
        DatabaseManager->>EnhancedCachingManager: put_query_result(cache_key, results)
        DatabaseManager-->>Client: results
    end
```
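The cache-aside flow in the diagram above can be sketched in Python as follows. This is a simplified, synchronous sketch: the real methods are async, `QueryCache` stands in for `EnhancedCachingManager`'s query cache, and the filtering logic is an assumption.

```python
class QueryCache:
    """Minimal stand-in for EnhancedCachingManager's query cache."""
    def __init__(self):
        self._cache = {}
    def get_query_result(self, key):
        return self._cache.get(key)
    def put_query_result(self, key, result):
        self._cache[key] = result

def search_emails_with_limit(store, cache, search_term, limit):
    # Normalize the term so "Test" and "test" share one cache entry
    cache_key = f"search:{search_term.lower()}:{limit}"
    cached = cache.get_query_result(cache_key)
    if cached is not None:
        return cached                      # cache hit: skip the store
    term = search_term.lower()
    results = [email for email in store
               if term in email.get("subject", "").lower()][:limit]
    cache.put_query_result(cache_key, results)
    return results
```

A cold call populates the cache under the normalized key; a warm call with any casing of the same term returns the cached list without touching the store.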
Sequence diagram for query cache invalidation on email create and update

```mermaid
sequenceDiagram
    actor Client
    participant DatabaseManager
    participant EnhancedCachingManager
    participant EmailStore
    Client->>DatabaseManager: create_email(email_data)
    DatabaseManager->>EmailStore: insert email
    EmailStore-->>DatabaseManager: new_id, light_email_record, heavy_data
    DatabaseManager->>EnhancedCachingManager: put_email_content(new_id, heavy_data)
    DatabaseManager->>DatabaseManager: _sorted_emails_cache = None
    DatabaseManager->>EnhancedCachingManager: clear_query_cache()
    DatabaseManager-->>Client: _add_category_details(light_email_record)
    Client->>DatabaseManager: update_email_by_message_id(message_id, updates)
    DatabaseManager->>EmailStore: update email by message_id
    EmailStore-->>DatabaseManager: email_id, email_to_update
    DatabaseManager->>EnhancedCachingManager: invalidate_email_record(email_id)
    DatabaseManager->>DatabaseManager: _sorted_emails_cache = None
    DatabaseManager->>EnhancedCachingManager: clear_query_cache()
    DatabaseManager-->>Client: _add_category_details(email_to_update)
```
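Stripped to its essence, both mutation paths follow the same pattern: write first, then clear the whole query cache. A sketch, with an in-memory list standing in for `EmailStore`:

```python
class QueryCache:
    """Minimal stand-in for EnhancedCachingManager's query cache."""
    def __init__(self):
        self._cache = {}
    def get_query_result(self, key):
        return self._cache.get(key)
    def put_query_result(self, key, result):
        self._cache[key] = result
    def clear_query_cache(self):
        self._cache.clear()

def create_email(store, cache, email_data):
    store.append(email_data)        # persist first...
    cache.clear_query_cache()       # ...then drop every cached search result

def update_email(store, cache, index, updates):
    store[index].update(updates)
    cache.clear_query_cache()       # any cached search may now be stale
```

Clearing the entire query cache on every mutation is coarse but safe: it can never serve a stale result, at the cost of re-running searches after each write.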
Updated class diagram for DatabaseManager and EnhancedCachingManager caching behavior

```mermaid
classDiagram
    class DatabaseManager {
        caching_manager: EnhancedCachingManager
        _sorted_emails_cache
        create_email(email_data: Dict)
        update_email_by_message_id(message_id: str, updates: Dict)
        update_email(email_id: int, email_data: Dict)
        search_emails_with_limit(search_term: str, limit: int)
        get_emails(limit: int, offset: int)
        _add_category_details(email: Dict)
    }
    class EnhancedCachingManager {
        email_record_cache
        query_cache
        put_email_content(email_id: int, heavy_data)
        get_email_record(email_id: int)
        put_email_record(email_id: int, email_record: Dict)
        invalidate_email_record(email_id: int)
        get_query_result(query_key: str)
        put_query_result(query_key: str, result: List)
        invalidate_query_result(query_key: str)
        clear_query_cache()
        clear_all_caches()
    }
    DatabaseManager --> EnhancedCachingManager : uses for email and query caching
```
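The query-cache half of the `EnhancedCachingManager` surface in the class diagram can be sketched as below. Plain dicts stand in for whatever bounded or TTL caches the real class uses; that internal storage is an assumption.

```python
from typing import Dict, List, Optional

class EnhancedCachingManager:
    """Sketch of the query-cache surface from the class diagram."""

    def __init__(self) -> None:
        self.email_record_cache: Dict[int, Dict] = {}
        self.query_cache: Dict[str, List] = {}

    def put_email_record(self, email_id: int, email_record: Dict) -> None:
        self.email_record_cache[email_id] = email_record

    def get_email_record(self, email_id: int) -> Optional[Dict]:
        return self.email_record_cache.get(email_id)

    def invalidate_email_record(self, email_id: int) -> None:
        self.email_record_cache.pop(email_id, None)

    def get_query_result(self, query_key: str) -> Optional[List]:
        return self.query_cache.get(query_key)

    def put_query_result(self, query_key: str, result: List) -> None:
        self.query_cache[query_key] = result

    def invalidate_query_result(self, query_key: str) -> None:
        self.query_cache.pop(query_key, None)

    def clear_query_cache(self) -> None:
        self.query_cache.clear()
```

Note the split: `invalidate_query_result` drops a single key, while `clear_query_cache` wipes the whole query cache, which is what the mutation paths call.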
File-Level Changes
🤖 Hi @MasumRab, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.
Walkthrough

Query result caching is implemented for the `DatabaseManager.search_emails_with_limit` email search path, with cache invalidation on email mutations.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant DatabaseManager
    participant EnhancedCachingManager
    participant Database
    rect rgba(100, 200, 150, 0.5)
        Note over Client,Database: First Search (Cache Miss)
        Client->>DatabaseManager: search_emails_with_limit(term, limit)
        DatabaseManager->>EnhancedCachingManager: get_query_result(cache_key)
        EnhancedCachingManager-->>DatabaseManager: None (cache miss)
        DatabaseManager->>Database: Execute search query
        Database-->>DatabaseManager: Results
        DatabaseManager->>EnhancedCachingManager: put_query_result(cache_key, results)
        DatabaseManager-->>Client: Results
    end
    rect rgba(150, 150, 200, 0.5)
        Note over Client,Database: Second Search (Cache Hit)
        Client->>DatabaseManager: search_emails_with_limit(term, limit)
        DatabaseManager->>EnhancedCachingManager: get_query_result(cache_key)
        EnhancedCachingManager-->>DatabaseManager: Cached Results (hit)
        DatabaseManager-->>Client: Results
    end
    rect rgba(200, 150, 100, 0.5)
        Note over Client,Database: Update Email (Cache Invalidation)
        Client->>DatabaseManager: update_email(...)
        DatabaseManager->>Database: Update operation
        DatabaseManager->>EnhancedCachingManager: clear_query_cache()
        EnhancedCachingManager->>EnhancedCachingManager: Clear query cache
        DatabaseManager-->>Client: Confirmation
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Warning: There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Pylint (4.0.4)
- src/core/database.py
- src/core/enhanced_caching.py
- tests/core/test_database_search_perf.py
Warning: Review ran into problems.

🔥 Problems: Errors were encountered while retrieving linked issues. Errors (1)
🤖 I'm sorry @MasumRab, but I was unable to process your request. Please see the logs for more details.
Hey - I've left some high level feedback:

- The query cache key only includes the normalized search term and limit; if `search_emails_with_limit` ever adds filters (e.g., user/mailbox/category), consider including those in the key now to avoid subtle cache pollution later.
- Logging a cache hit at `info` level on every warm search may be noisy in production; consider switching this to `debug` or adding rate limiting if high-frequency queries are expected.
- `EnhancedCachingManager.clear_all_caches` does not currently clear the new `query_cache`; consider including `query_cache.clear()` there so the method semantics remain truly 'clear everything'.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The query cache key only includes the normalized search term and limit; if `search_emails_with_limit` ever adds filters (e.g., user/mailbox/category), consider including those in the key now to avoid subtle cache pollution later.
- Logging a cache hit at `info` level on every warm search may be noisy in production; consider switching this to `debug` or adding rate limiting if high-frequency queries are expected.
- `EnhancedCachingManager.clear_all_caches` does not currently clear the new `query_cache`; consider including `query_cache.clear()` there so the method semantics remain truly 'clear everything'.
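One way to apply the first two suggestions above, sketched under the assumption that future filters arrive as optional keyword arguments (`mailbox` and `category` are hypothetical parameter names, not part of the current signature):

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

def build_search_cache_key(search_term: str, limit: int,
                           mailbox: Optional[str] = None,
                           category: Optional[str] = None) -> str:
    # Fold every filter that affects the result set into the key, so a
    # future mailbox/category filter cannot pollute other entries.
    return f"search:{search_term.lower()}:{limit}:{mailbox}:{category}"

def log_cache_hit(search_term: str) -> None:
    # debug rather than info: warm searches may fire at high frequency
    logger.debug("Query cache hit for term: %r", search_term)
```

The third suggestion is a one-liner: add `self.query_cache.clear()` inside `clear_all_caches` alongside the existing cache clears.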
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/core/database.py`:
- Around line 735-741: The code currently returns cached_result directly from
caching_manager.get_query_result (keyed by cache_key built from search_term and
limit), which lets callers mutate the cached object; instead, return a shallow
copy before returning to callers — e.g., create a new list from cached_result
(and if the list contains mutable mapping items, shallow-copy each dict) so that
modifications by the caller do not mutate the cached data; update the branch
that handles the cache hit (the block that logs via logger.info and returns
cached_result) to return the copied data.
🧹 Nitpick comments (2)
src/core/database.py (1)
795-799: Caching implementation looks good, but shares mutable references.

The caching of search results is correctly implemented. However, the cached `results` list contains the same dictionary objects that are stored in `emails_data` and various indexes (via `_add_category_details`, which modifies in-place). This extends the mutable data concern raised above: mutations to the underlying email data could inadvertently affect cached search results. If immutability is desired, consider caching deep copies or storing only email IDs in the query cache and reconstructing results on cache hit.
tests/core/test_search_query_cache.py (1)
19-57: Good integration test coverage for the caching lifecycle.

The test effectively validates:
- Cache population on cold search
- Cache hit on warm search
- Cache invalidation after email update
- Re-caching with new search terms
The lowercase normalization is implicitly tested (searching "Test" creates key "search:test:10").
Consider adding additional test cases for:
- Different `limit` values creating separate cache entries (e.g., `"search:test:10"` vs `"search:test:20"`)
- Verifying that `create_email` also invalidates the cache (not just `update_email`)

```python
@pytest.mark.asyncio
async def test_search_cache_invalidated_on_create(db_manager):
    """Test that creating a new email invalidates the query cache."""
    await db_manager._ensure_initialized()

    # Populate cache
    await db_manager.create_email({
        "messageId": "msg-1",
        "subject": "First Email",
        "sender": "sender@example.com",
    })
    results1 = await db_manager.search_emails_with_limit("First", limit=10)
    cache_key = "search:first:10"
    assert db_manager.caching_manager.get_query_result(cache_key) is not None

    # Create another email - should invalidate cache
    await db_manager.create_email({
        "messageId": "msg-2",
        "subject": "Second Email",
        "sender": "sender@example.com",
    })

    # Cache should be cleared
    assert db_manager.caching_manager.get_query_result(cache_key) is None
```
```python
# Check query cache
# Normalize search term to lower case for consistent caching
cache_key = f"search:{search_term.lower()}:{limit}"
cached_result = self.caching_manager.get_query_result(cache_key)
if cached_result is not None:
    logger.info(f"Query cache hit for term: '{search_term}'")
    return cached_result
```
Cached results returned by reference may allow callers to corrupt the cache.
The cached result is returned directly without copying. If a caller modifies the returned list or its contained dictionaries, those modifications will persist in the cache and affect subsequent cache hits. Consider returning a shallow copy.
🛠️ Proposed fix
```diff
 # Check query cache
 # Normalize search term to lower case for consistent caching
 cache_key = f"search:{search_term.lower()}:{limit}"
 cached_result = self.caching_manager.get_query_result(cache_key)
 if cached_result is not None:
     logger.info(f"Query cache hit for term: '{search_term}'")
-    return cached_result
+    return [email.copy() for email in cached_result]
```

📝 Committable suggestion
+ return [email.copy() for email in cached_result]📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
# Check query cache
# Normalize search term to lower case for consistent caching
cache_key = f"search:{search_term.lower()}:{limit}"
cached_result = self.caching_manager.get_query_result(cache_key)
if cached_result is not None:
    logger.info(f"Query cache hit for term: '{search_term}'")
    return [email.copy() for email in cached_result]
```



💡 What: Added query result caching to `DatabaseManager.search_emails_with_limit` and a `clear_query_cache` method to `EnhancedCachingManager` for invalidation.

🎯 Why: Searching emails (with disk-based content fallback) was slow for repeated queries (warm search).

📊 Impact: Reduced warm search time from ~0.32s to ~0.000015s (99.9% reduction) in local benchmarks.

🔬 Measurement: Verified with a benchmark script and new unit test `tests/core/test_search_query_cache.py`. Existing tests were updated to accommodate the change.

PR created automatically by Jules for task 15688622797709846583 started by @MasumRab
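A cold-vs-warm measurement in the spirit of the benchmark quoted above can be sketched like this; the PR's actual benchmark script may differ, and absolute numbers will vary per machine.

```python
import time

def benchmark_search(search_fn, term, limit, runs=100):
    """Return (cold_seconds, avg_warm_seconds) for a cached search."""
    t0 = time.perf_counter()
    search_fn(term, limit)                 # cold call populates the cache
    cold = time.perf_counter() - t0
    t0 = time.perf_counter()
    for _ in range(runs):
        search_fn(term, limit)             # warm calls hit the cache
    warm = (time.perf_counter() - t0) / runs
    return cold, warm
```

With a cache in front of a slow search path, `warm` should come out orders of magnitude below `cold`, matching the shape of the reported ~0.32s to ~0.000015s improvement.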
Summary by Sourcery
Add caching for email search query results and ensure caches are invalidated when emails change.
New Features:
Enhancements:
Tests:
Summary by CodeRabbit
Performance Improvements
Tests
✏️ Tip: You can customize this high-level summary in your review settings.