fix: handle empty/whitespace source content without retry loop #576

Merged: lfnovo merged 1 commit into main from fix/empty-source-vectorization-loop on Feb 14, 2026

@lfnovo (Owner) commented Feb 14, 2026

Summary

  • Fix Source.vectorize() wrapping its own ValueError in DatabaseOperationError, which bypassed the stop_on=[ValueError] retry guard and caused up to 15 retries when processing files with no extractable text
  • Skip vectorization gracefully in save_source() graph node when content is empty/whitespace-only
  • Add .strip() validation to catch whitespace-only content
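
The second and third bullets can be illustrated with a minimal sketch. The names `has_vectorizable_content` and `save_source_sketch` are hypothetical stand-ins, not the project's actual code; the real `save_source()` graph node is not shown in this PR page, so the shape of its return value here is an assumption.

```python
from typing import Optional


def has_vectorizable_content(text: Optional[str]) -> bool:
    """Return True only if text contains something other than whitespace."""
    return bool(text and text.strip())


def save_source_sketch(full_text: Optional[str], embed: bool = True) -> dict:
    """Hypothetical stand-in for the save_source() graph node.

    Saving always succeeds; vectorization is attempted only when the
    content is non-empty after stripping whitespace.
    """
    result = {"saved": True, "vectorized": False}
    if embed and has_vectorizable_content(full_text):
        # In the real node this is where Source.vectorize() would be called.
        result["vectorized"] = True
    return result
```

With this guard, a source whose extracted text is `None`, `""`, or only whitespace is still saved, and the vectorization step is skipped rather than raised as an error.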

Root Cause

Source.vectorize() had a broad except Exception that caught its own ValueError and re-raised it as DatabaseOperationError. The process_source_command retry config uses stop_on=[ValueError] to prevent retrying validation errors, but because the exception now arrived as DatabaseOperationError, the guard never matched and the retry logic kicked in. The same failing operation was repeated up to 15 times with exponential backoff, blocking sync API requests indefinitely.
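
The interaction described above can be reproduced in a small, self-contained sketch. Everything here is an assumption made for illustration: `run_with_retry` is a hypothetical stand-in for the real retry configuration (backoff sleeps are omitted), the `DatabaseOperationError` class is redefined locally, both `vectorize_*` functions are simplified synchronous versions, and the attempt limit of 15 is illustrative.

```python
class DatabaseOperationError(Exception):
    """Local stand-in for the project's database error type."""


def vectorize_buggy(full_text):
    try:
        if not full_text:
            raise ValueError("source has no text to vectorize")
        return "submitted"
    except Exception as exc:
        # The broad handler also catches the ValueError raised just above
        # and hides it behind a different exception type.
        raise DatabaseOperationError(str(exc)) from exc


def vectorize_fixed(full_text):
    try:
        if not full_text or not full_text.strip():
            raise ValueError("source has no text to vectorize")
        return "submitted"
    except ValueError:
        raise  # validation errors pass through unchanged
    except Exception as exc:
        raise DatabaseOperationError(str(exc)) from exc


def run_with_retry(fn, max_attempts=15, stop_on=(ValueError,)):
    """Hypothetical retry loop; returns (attempts_made, last_exception)."""
    attempts = 0
    last_error = None
    while attempts < max_attempts:
        attempts += 1
        try:
            fn()
            return attempts, None
        except stop_on as exc:
            # Non-retryable error: give up immediately.
            return attempts, exc
        except Exception as exc:
            # Retryable error: remember it and try again.
            last_error = exc
    return attempts, last_error
```

Calling `run_with_retry(lambda: vectorize_buggy(""))` exhausts every attempt because the guard only ever sees `DatabaseOperationError`, while `run_with_retry(lambda: vectorize_fixed("   "))` stops after the first attempt with the original `ValueError`.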

Test plan

  • Added tests for vectorize() with None, empty, whitespace-only, and valid content
  • All 119 existing tests pass with zero regressions
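
The PR page does not include the test code itself. As a rough sketch of what tests for the four cases might look like, the example below uses `unittest` against a hypothetical, synchronous `vectorize` stand-in; the real method lives on `Source`, and its signature and return value are assumptions here.

```python
import unittest


def vectorize(full_text):
    """Hypothetical stand-in for Source.vectorize() validation."""
    if not full_text or not full_text.strip():
        raise ValueError("source has no text to vectorize")
    return "job-submitted"


class VectorizeContentTests(unittest.TestCase):
    def test_rejects_missing_or_blank_content(self):
        # None, empty, and whitespace-only content must raise ValueError
        # directly, not a wrapped exception type.
        for bad in (None, "", "   \n\t"):
            with self.subTest(content=bad):
                with self.assertRaises(ValueError):
                    vectorize(bad)

    def test_accepts_real_content(self):
        self.assertEqual(vectorize("some extracted text"), "job-submitted")
```

Asserting on `ValueError` specifically matters: a test that only checked for "some exception" would have passed against the wrapped `DatabaseOperationError` as well.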

Fixes #560

Source.vectorize() wrapped its own ValueError in DatabaseOperationError,
bypassing the stop_on=[ValueError] retry guard in process_source_command.
This caused up to 15 retries when processing files with no extractable
text, blocking sync API requests indefinitely.

- Re-raise ValueError directly in Source.vectorize() instead of wrapping
- Add .strip() check to catch whitespace-only content
- Skip vectorization gracefully in save_source() when content is empty
- Add unit tests for vectorize error handling

Fixes #560

@cubic-dev-ai (Bot, Contributor) left a comment

No issues found across 3 files

@lfnovo merged commit 26d5349 into main on Feb 14, 2026 (8 checks passed)
@lfnovo deleted the fix/empty-source-vectorization-loop branch on February 14, 2026 21:09
@lfnovo mentioned this pull request on Feb 15, 2026
demobdev pushed a commit to demobdev/open-notebook that referenced this pull request Feb 25, 2026
…o#576)
Development

Successfully merging this pull request may close these issues.

[Bug]: Problems handling a file that is empty, or has only space characters