⚡ Bolt: Implement query result caching for email search#421
Conversation
- Add `clear_query_cache` to `EnhancedCachingManager`
- Cache results in `DatabaseManager.search_emails_with_limit`
- Invalidate cache on email create/update/delete
- Add integration test for search caching
- Fix regression in `test_database_search_perf.py`

Co-authored-by: MasumRab <8943353+MasumRab@users.noreply.github.com>
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me. New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task.
Reviewer's Guide

Implements query result caching for email searches by adding a query cache layer around `DatabaseManager.search_emails_with_limit`, ensuring cache invalidation on email mutations, and extending the caching manager API plus tests to cover the new behavior and performance characteristics.

Sequence diagram for cached `search_emails_with_limit` query flow

```mermaid
sequenceDiagram
    participant Client
    participant DatabaseManager
    participant EnhancedCachingManager
    participant EmailStore
    Client->>DatabaseManager: search_emails_with_limit(search_term, limit)
    DatabaseManager->>DatabaseManager: build cache_key = search:lower(search_term):limit
    DatabaseManager->>EnhancedCachingManager: get_query_result(cache_key)
    alt cache hit
        EnhancedCachingManager-->>DatabaseManager: cached_result
        DatabaseManager-->>Client: cached_result
    else cache miss
        EnhancedCachingManager-->>DatabaseManager: None
        DatabaseManager->>EmailStore: get_emails(limit, offset) and apply search filtering
        EmailStore-->>DatabaseManager: filtered_emails
        DatabaseManager->>DatabaseManager: results = _add_category_details(filtered_emails)
        DatabaseManager->>EnhancedCachingManager: put_query_result(cache_key, results)
        DatabaseManager-->>Client: results
    end
```
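The cache-aside flow in the diagram above can be sketched in Python as follows. This is a simplified, synchronous sketch: the real methods are async, `QueryCache` stands in for `EnhancedCachingManager`'s query cache, and the filtering logic is an assumption.

```python
class QueryCache:
    """Minimal stand-in for EnhancedCachingManager's query cache."""
    def __init__(self):
        self._cache = {}
    def get_query_result(self, key):
        return self._cache.get(key)
    def put_query_result(self, key, result):
        self._cache[key] = result

def search_emails_with_limit(store, cache, search_term, limit):
    # Normalize the term so "Test" and "test" share one cache entry
    cache_key = f"search:{search_term.lower()}:{limit}"
    cached = cache.get_query_result(cache_key)
    if cached is not None:
        return cached                      # cache hit: skip the store
    term = search_term.lower()
    results = [email for email in store
               if term in email.get("subject", "").lower()][:limit]
    cache.put_query_result(cache_key, results)
    return results
```

A cold call populates the cache under the normalized key; a warm call with any casing of the same term returns the cached list without touching the store.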
Sequence diagram for query cache invalidation on email create and update

```mermaid
sequenceDiagram
    actor Client
    participant DatabaseManager
    participant EnhancedCachingManager
    participant EmailStore
    Client->>DatabaseManager: create_email(email_data)
    DatabaseManager->>EmailStore: insert email
    EmailStore-->>DatabaseManager: new_id, light_email_record, heavy_data
    DatabaseManager->>EnhancedCachingManager: put_email_content(new_id, heavy_data)
    DatabaseManager->>DatabaseManager: _sorted_emails_cache = None
    DatabaseManager->>EnhancedCachingManager: clear_query_cache()
    DatabaseManager-->>Client: _add_category_details(light_email_record)
    Client->>DatabaseManager: update_email_by_message_id(message_id, updates)
    DatabaseManager->>EmailStore: update email by message_id
    EmailStore-->>DatabaseManager: email_id, email_to_update
    DatabaseManager->>EnhancedCachingManager: invalidate_email_record(email_id)
    DatabaseManager->>DatabaseManager: _sorted_emails_cache = None
    DatabaseManager->>EnhancedCachingManager: clear_query_cache()
    DatabaseManager-->>Client: _add_category_details(email_to_update)
```
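Stripped to its essence, both mutation paths follow the same pattern: write first, then clear the whole query cache. A sketch, with an in-memory list standing in for `EmailStore`:

```python
class QueryCache:
    """Minimal stand-in for EnhancedCachingManager's query cache."""
    def __init__(self):
        self._cache = {}
    def get_query_result(self, key):
        return self._cache.get(key)
    def put_query_result(self, key, result):
        self._cache[key] = result
    def clear_query_cache(self):
        self._cache.clear()

def create_email(store, cache, email_data):
    store.append(email_data)        # persist first...
    cache.clear_query_cache()       # ...then drop every cached search result

def update_email(store, cache, index, updates):
    store[index].update(updates)
    cache.clear_query_cache()       # any cached search may now be stale
```

Clearing the entire query cache on every mutation is coarse but safe: it can never serve a stale result, at the cost of re-running searches after each write.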
Updated class diagram for DatabaseManager and EnhancedCachingManager caching behavior

```mermaid
classDiagram
    class DatabaseManager {
        caching_manager: EnhancedCachingManager
        _sorted_emails_cache
        create_email(email_data: Dict)
        update_email_by_message_id(message_id: str, updates: Dict)
        update_email(email_id: int, email_data: Dict)
        search_emails_with_limit(search_term: str, limit: int)
        get_emails(limit: int, offset: int)
        _add_category_details(email: Dict)
    }
    class EnhancedCachingManager {
        email_record_cache
        query_cache
        put_email_content(email_id: int, heavy_data)
        get_email_record(email_id: int)
        put_email_record(email_id: int, email_record: Dict)
        invalidate_email_record(email_id: int)
        get_query_result(query_key: str)
        put_query_result(query_key: str, result: List)
        invalidate_query_result(query_key: str)
        clear_query_cache()
        clear_all_caches()
    }
    DatabaseManager --> EnhancedCachingManager : uses for email and query caching
```
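The query-cache half of the `EnhancedCachingManager` surface in the class diagram can be sketched as below. Plain dicts stand in for whatever bounded or TTL caches the real class uses; that internal storage is an assumption.

```python
from typing import Dict, List, Optional

class EnhancedCachingManager:
    """Sketch of the query-cache surface from the class diagram."""

    def __init__(self) -> None:
        self.email_record_cache: Dict[int, Dict] = {}
        self.query_cache: Dict[str, List] = {}

    def put_email_record(self, email_id: int, email_record: Dict) -> None:
        self.email_record_cache[email_id] = email_record

    def get_email_record(self, email_id: int) -> Optional[Dict]:
        return self.email_record_cache.get(email_id)

    def invalidate_email_record(self, email_id: int) -> None:
        self.email_record_cache.pop(email_id, None)

    def get_query_result(self, query_key: str) -> Optional[List]:
        return self.query_cache.get(query_key)

    def put_query_result(self, query_key: str, result: List) -> None:
        self.query_cache[query_key] = result

    def invalidate_query_result(self, query_key: str) -> None:
        self.query_cache.pop(query_key, None)

    def clear_query_cache(self) -> None:
        self.query_cache.clear()
```

Note the split: `invalidate_query_result` drops a single key, while `clear_query_cache` wipes the whole query cache, which is what the mutation paths call.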
File-Level Changes
🤖 Hi @MasumRab, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.
Walkthrough

Query result caching is implemented for the `DatabaseManager.search_emails_with_limit` email search path, with cache invalidation on email mutations.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant DatabaseManager
    participant EnhancedCachingManager
    participant Database
    rect rgba(100, 200, 150, 0.5)
        Note over Client,Database: First Search (Cache Miss)
        Client->>DatabaseManager: search_emails_with_limit(term, limit)
        DatabaseManager->>EnhancedCachingManager: get_query_result(cache_key)
        EnhancedCachingManager-->>DatabaseManager: None (cache miss)
        DatabaseManager->>Database: Execute search query
        Database-->>DatabaseManager: Results
        DatabaseManager->>EnhancedCachingManager: put_query_result(cache_key, results)
        DatabaseManager-->>Client: Results
    end
    rect rgba(150, 150, 200, 0.5)
        Note over Client,Database: Second Search (Cache Hit)
        Client->>DatabaseManager: search_emails_with_limit(term, limit)
        DatabaseManager->>EnhancedCachingManager: get_query_result(cache_key)
        EnhancedCachingManager-->>DatabaseManager: Cached Results (hit)
        DatabaseManager-->>Client: Results
    end
    rect rgba(200, 150, 100, 0.5)
        Note over Client,Database: Update Email (Cache Invalidation)
        Client->>DatabaseManager: update_email(...)
        DatabaseManager->>Database: Update operation
        DatabaseManager->>EnhancedCachingManager: clear_query_cache()
        EnhancedCachingManager->>EnhancedCachingManager: Clear query cache
        DatabaseManager-->>Client: Confirmation
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Warning: There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Pylint (4.0.4)
- src/core/database.py
- src/core/enhanced_caching.py
- tests/core/test_database_search_perf.py
Warning: Review ran into problems.

🔥 Problems: Errors were encountered while retrieving linked issues. Errors (1)
🤖 I'm sorry @MasumRab, but I was unable to process your request. Please see the logs for more details.
Hey - I've left some high level feedback:

- The query cache key only includes the normalized search term and limit; if `search_emails_with_limit` ever adds filters (e.g., user/mailbox/category), consider including those in the key now to avoid subtle cache pollution later.
- Logging a cache hit at `info` level on every warm search may be noisy in production; consider switching this to `debug` or adding rate limiting if high-frequency queries are expected.
- `EnhancedCachingManager.clear_all_caches` does not currently clear the new `query_cache`; consider including `query_cache.clear()` there so the method semantics remain truly 'clear everything'.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The query cache key only includes the normalized search term and limit; if `search_emails_with_limit` ever adds filters (e.g., user/mailbox/category), consider including those in the key now to avoid subtle cache pollution later.
- Logging a cache hit at `info` level on every warm search may be noisy in production; consider switching this to `debug` or adding rate limiting if high-frequency queries are expected.
- `EnhancedCachingManager.clear_all_caches` does not currently clear the new `query_cache`; consider including `query_cache.clear()` there so the method semantics remain truly 'clear everything'.
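One way to apply the first two suggestions above, sketched under the assumption that future filters arrive as optional keyword arguments (`mailbox` and `category` are hypothetical parameter names, not part of the current signature):

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

def build_search_cache_key(search_term: str, limit: int,
                           mailbox: Optional[str] = None,
                           category: Optional[str] = None) -> str:
    # Fold every filter that affects the result set into the key, so a
    # future mailbox/category filter cannot pollute other entries.
    return f"search:{search_term.lower()}:{limit}:{mailbox}:{category}"

def log_cache_hit(search_term: str) -> None:
    # debug rather than info: warm searches may fire at high frequency
    logger.debug("Query cache hit for term: %r", search_term)
```

The third suggestion is a one-liner: add `self.query_cache.clear()` inside `clear_all_caches` alongside the existing cache clears.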
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/core/database.py`:
- Around line 735-741: The code currently returns cached_result directly from
caching_manager.get_query_result (keyed by cache_key built from search_term and
limit), which lets callers mutate the cached object; instead, return a shallow
copy before returning to callers — e.g., create a new list from cached_result
(and if the list contains mutable mapping items, shallow-copy each dict) so that
modifications by the caller do not mutate the cached data; update the branch
that handles the cache hit (the block that logs via logger.info and returns
cached_result) to return the copied data.
🧹 Nitpick comments (2)
src/core/database.py (1)
795-799: Caching implementation looks good, but shares mutable references.

The caching of search results is correctly implemented. However, the cached `results` list contains the same dictionary objects that are stored in `emails_data` and various indexes (via `_add_category_details`, which modifies in-place). This extends the mutable data concern raised above: mutations to the underlying email data could inadvertently affect cached search results. If immutability is desired, consider caching deep copies or storing only email IDs in the query cache and reconstructing results on cache hit.
tests/core/test_search_query_cache.py (1)
19-57: Good integration test coverage for the caching lifecycle.

The test effectively validates:
- Cache population on cold search
- Cache hit on warm search
- Cache invalidation after email update
- Re-caching with new search terms
The lowercase normalization is implicitly tested (searching "Test" creates key "search:test:10").
Consider adding additional test cases for:
- Different `limit` values creating separate cache entries (e.g., `"search:test:10"` vs `"search:test:20"`)
- Verifying that `create_email` also invalidates the cache (not just `update_email`)

```python
@pytest.mark.asyncio
async def test_search_cache_invalidated_on_create(db_manager):
    """Test that creating a new email invalidates the query cache."""
    await db_manager._ensure_initialized()

    # Populate cache
    await db_manager.create_email({
        "messageId": "msg-1",
        "subject": "First Email",
        "sender": "sender@example.com",
    })
    results1 = await db_manager.search_emails_with_limit("First", limit=10)
    cache_key = "search:first:10"
    assert db_manager.caching_manager.get_query_result(cache_key) is not None

    # Create another email - should invalidate cache
    await db_manager.create_email({
        "messageId": "msg-2",
        "subject": "Second Email",
        "sender": "sender@example.com",
    })

    # Cache should be cleared
    assert db_manager.caching_manager.get_query_result(cache_key) is None
```
```python
# Check query cache
# Normalize search term to lower case for consistent caching
cache_key = f"search:{search_term.lower()}:{limit}"
cached_result = self.caching_manager.get_query_result(cache_key)
if cached_result is not None:
    logger.info(f"Query cache hit for term: '{search_term}'")
    return cached_result
```
Cached results returned by reference may allow callers to corrupt the cache.
The cached result is returned directly without copying. If a caller modifies the returned list or its contained dictionaries, those modifications will persist in the cache and affect subsequent cache hits. Consider returning a shallow copy.
🛠️ Proposed fix
```diff
 # Check query cache
 # Normalize search term to lower case for consistent caching
 cache_key = f"search:{search_term.lower()}:{limit}"
 cached_result = self.caching_manager.get_query_result(cache_key)
 if cached_result is not None:
     logger.info(f"Query cache hit for term: '{search_term}'")
-    return cached_result
+    return [email.copy() for email in cached_result]
```

📝 Committable suggestion
+ return [email.copy() for email in cached_result]📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
# Check query cache
# Normalize search term to lower case for consistent caching
cache_key = f"search:{search_term.lower()}:{limit}"
cached_result = self.caching_manager.get_query_result(cache_key)
if cached_result is not None:
    logger.info(f"Query cache hit for term: '{search_term}'")
    return [email.copy() for email in cached_result]
```



💡 What: Added query result caching to `DatabaseManager.search_emails_with_limit` and a `clear_query_cache` method to `EnhancedCachingManager` for invalidation.

🎯 Why: Searching emails (with disk-based content fallback) was slow for repeated queries (warm search).

📊 Impact: Reduced warm search time from ~0.32s to ~0.000015s (99.9% reduction) in local benchmarks.

🔬 Measurement: Verified with a benchmark script and new unit test `tests/core/test_search_query_cache.py`. Existing tests were updated to accommodate the change.

PR created automatically by Jules for task 15688622797709846583 started by @MasumRab
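A cold-vs-warm measurement in the spirit of the benchmark quoted above can be sketched like this; the PR's actual benchmark script may differ, and absolute numbers will vary per machine.

```python
import time

def benchmark_search(search_fn, term, limit, runs=100):
    """Return (cold_seconds, avg_warm_seconds) for a cached search."""
    t0 = time.perf_counter()
    search_fn(term, limit)                 # cold call populates the cache
    cold = time.perf_counter() - t0
    t0 = time.perf_counter()
    for _ in range(runs):
        search_fn(term, limit)             # warm calls hit the cache
    warm = (time.perf_counter() - t0) / runs
    return cold, warm
```

With a cache in front of a slow search path, `warm` should come out orders of magnitude below `cold`, matching the shape of the reported ~0.32s to ~0.000015s improvement.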
Summary by Sourcery
Add caching for email search query results and ensure caches are invalidated when emails change.
New Features:
Enhancements:
Tests:
Summary by CodeRabbit
Performance Improvements
Tests
✏️ Tip: You can customize this high-level summary in your review settings.