Fix: Add missing delete_time index to FileTrash table#8694

Open
yaoge123 wants to merge 3 commits into haiwen:master from yaoge123:fix/add-filetrash-delete-time-index

Problem

The FileTrash table is missing the delete_time index that is defined in the Python model but never added to mysql.sql.

Python Model Definition

File: seafevents/events/models.py

class FileTrash(Base):
    __tablename__ = 'FileTrash'
    
    id = mapped_column(Integer, primary_key=True, autoincrement=True)
    delete_time = mapped_column(DateTime, nullable=False, index=True)  # <-- index=True
    # ...

Current mysql.sql (Missing Index)

CREATE TABLE `FileTrash` (
  -- ... columns ...
  PRIMARY KEY (`id`),
  KEY `ix_FileTrash_repo_id` (`repo_id`)
  -- Missing: KEY `ix_FileTrash_delete_time` (`delete_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;

Code Usage

Global Cleanup Query

File: seafevents/events/db.py

def clean_up_file_trash(session, repo_id=None):
    _timestamp = datetime.datetime.now() - timedelta(days=FILE_TRASH_CLEANING_DAYS)
    
    if repo_id is not None:
        # Uses repo_id index
        stmt = delete(FileTrash).where(FileTrash.repo_id == repo_id, FileTrash.delete_time < _timestamp)
    else:
        # GLOBAL CLEANUP - No repo_id filter!
        # Line ~304
        stmt = delete(FileTrash).where(FileTrash.delete_time < _timestamp)  # <-- Needs delete_time index

Query Pattern: WHERE delete_time < ?
Execution Context: Scheduled cleanup task (runs periodically)

Repository-level Cleanup

File: seafevents/events/handlers.py

# Line ~616
if trash_item and trash_item.delete_time > _timestamp:
    # ...

Why This Index is Critical

| Scenario | Query | Without Index | With Index |
| --- | --- | --- | --- |
| Global cleanup | WHERE delete_time < ? | Full table scan | Index range scan |
| Repository cleanup | WHERE repo_id = ? AND delete_time < ? | Uses repo_id index | Uses repo_id index (compound covers) |

The Global Cleanup Problem

The scheduled cleanup task runs without a repo_id filter:

DELETE FROM FileTrash WHERE delete_time < '2024-01-01'

Without the delete_time index:

  • Full table scan on potentially millions of rows
  • High CPU and I/O usage during cleanup
  • Risk of locking the table for extended periods
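
The scan-versus-index difference can be observed with a query planner. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. That substitution is an assumption made so the example runs anywhere, and the column types are simplified. SQLite's plan text differs from MySQL's EXPLAIN output, but the change from a table scan to an index search illustrates the same point.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's 'detail' strings for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the 4th column holds the plan text


# In-memory stand-in for FileTrash (assumption: simplified columns, SQLite not MySQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FileTrash ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " repo_id TEXT NOT NULL,"
    " delete_time TEXT NOT NULL)"
)
conn.execute("CREATE INDEX ix_FileTrash_repo_id ON FileTrash (repo_id)")

cleanup = "DELETE FROM FileTrash WHERE delete_time < ?"
cutoff = ("2024-01-01 00:00:00",)

# Before: only the repo_id index exists, so the planner has to scan the table.
plan_before = query_plan(conn, cleanup, cutoff)

# After: add the index this PR introduces and ask the planner again.
conn.execute("CREATE INDEX ix_FileTrash_delete_time ON FileTrash (delete_time)")
plan_after = query_plan(conn, cleanup, cutoff)

print("without index:", plan_before)
print("with index:   ", plan_after)
```

The exact wording of the plan lines depends on the SQLite version. On a real deployment, running EXPLAIN on the DELETE statement in MySQL before and after the schema change would be the authoritative check.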

Changes

Add the missing index to sql/mysql.sql:

CREATE TABLE `FileTrash` (
  -- ... existing columns ...
  PRIMARY KEY (`id`),
  KEY `ix_FileTrash_repo_id` (`repo_id`),
  KEY `ix_FileTrash_delete_time` (`delete_time`)  -- Added
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;

Note on Compound Index

PR #8680 (seahub) / PR #606 (seafevents) also add a compound index (repo_id, delete_time) for the repository-specific query pattern:

WHERE repo_id = ? AND delete_time > ? ORDER BY delete_time DESC

However, the single-column delete_time index is still needed:

  1. Global cleanup filters on delete_time alone (WHERE delete_time < ? without a repo_id filter)
  2. The compound index (repo_id, delete_time) cannot satisfy queries that don't include repo_id, because delete_time is not its leading column
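
The leading-column limitation can be illustrated with a small experiment. This is a sketch only: it again uses sqlite3 in place of MySQL, the index name ix_compound_example is hypothetical (the name used in the other PRs is not given here), and the path column is an invented placeholder for the elided columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FileTrash ("
    " id INTEGER PRIMARY KEY,"
    " repo_id TEXT NOT NULL,"
    " path TEXT,"  # placeholder column, not taken from the real schema
    " delete_time TEXT NOT NULL)"
)
# Only the compound index exists; the name is hypothetical.
conn.execute(
    "CREATE INDEX ix_compound_example ON FileTrash (repo_id, delete_time)"
)


def plan(sql, params):
    """Return the planner's 'detail' strings for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


cutoff = "2024-01-01 00:00:00"

# Global pattern: no repo_id, so the compound index cannot be used for a range search.
global_plan = plan(
    "SELECT path FROM FileTrash WHERE delete_time < ?", (cutoff,)
)

# Repository pattern: the equality on repo_id lets the planner search the compound index.
repo_plan = plan(
    "SELECT path FROM FileTrash WHERE repo_id = ? AND delete_time < ?",
    ("repo-1", cutoff),
)

print("global:    ", global_plan)
print("repository:", repo_plan)
```

In this sketch the repository-scoped query searches the compound index, while the global query falls back to a scan, which is the gap the single-column index is meant to close.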

Both indexes should coexist.

Verification

This index is already defined in the Python model (seafevents/events/models.py) but was simply forgotten in mysql.sql.
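
One way to check the schema file for this kind of drift is to list the columns named in its KEY clauses. The sketch below is a hypothetical helper, not part of the repository: it uses a regular expression rather than a real SQL parser, and the DDL strings reproduce only the lines shown in this description, with the elided columns left out.

```python
import re

# DDL fragments as shown in this PR; the real column definitions are elided.
DDL_BEFORE = """
CREATE TABLE `FileTrash` (
  PRIMARY KEY (`id`),
  KEY `ix_FileTrash_repo_id` (`repo_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
"""

DDL_AFTER = """
CREATE TABLE `FileTrash` (
  PRIMARY KEY (`id`),
  KEY `ix_FileTrash_repo_id` (`repo_id`),
  KEY `ix_FileTrash_delete_time` (`delete_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
"""

# Matches secondary index lines such as: KEY `name` (`col_a`, `col_b`)
# PRIMARY KEY lines are skipped because the line must start with KEY or UNIQUE KEY.
KEY_LINE = re.compile(
    r"^\s*(?:UNIQUE\s+)?KEY\s+`[^`]+`\s*\(([^)]*)\)", re.MULTILINE
)


def indexed_columns(ddl):
    """Return a set of column tuples, one tuple per secondary index in the DDL."""
    found = set()
    for match in KEY_LINE.finditer(ddl):
        columns = tuple(
            col.strip().strip("`") for col in match.group(1).split(",")
        )
        found.add(columns)
    return found


# Columns the Python model marks with index=True, per this description.
expected = {("delete_time",)}

missing_before = expected - indexed_columns(DDL_BEFORE)
missing_after = expected - indexed_columns(DDL_AFTER)
print("missing before:", missing_before)
print("missing after: ", missing_after)
```

A check along these lines could run against sql/mysql.sql in CI, with the expected set derived from the model definitions instead of being hard-coded as it is here.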

The delete_time index is defined in the Python model (seafevents)
with index=True but was missing from mysql.sql. This index is needed
for the global cleanup query: WHERE delete_time < ?

Without this index, the cleanup task performs a full table scan on
FileTrash which can have millions of rows.
…ries

Seadoc thumbnails use file_uuid as directory name, requiring a lookup
from repo_id+path to uuid. The composite index (md5, filename, is_dir)
eliminates table lookups by covering the full WHERE clause.
Avatar.objects.filter(emailuser=...) is used frequently throughout
the codebase but lacked an index, causing full table scans.