Skip to content

fix(file_service): add CSV extension and 50MB size validation to store_upload()#254

Open
dncoder14 wants to merge 1 commit intoc2siorg:mainfrom
dncoder14:fix/store-upload-validation
Open

fix(file_service): add CSV extension and 50MB size validation to store_upload()#254
dncoder14 wants to merge 1 commit intoc2siorg:mainfrom
dncoder14:fix/store-upload-validation

Conversation

@dncoder14
Copy link
Copy Markdown

@dncoder14 dncoder14 commented Mar 31, 2026

Description

Fixes two silent security and reliability gaps in store_upload() inside app/services/file_service.py:

  1. No file type validation — any file extension was accepted and written to disk (e.g. .exe, .zip, .xlsx).
  2. No file size limit — a multi-GB upload would silently consume all available disk space.

validate_upload_file() in security.py already enforces these at the HTTP layer, but store_upload() operates at the service layer and can be called directly, so it must enforce its own invariants.

Changes:

  • Added MAX_FILE_SIZE = 50 * 1024 * 1024 (50 MB) module-level constant.
  • Validates file extension (case-insensitive .csv only) before any disk I/O; raises ValueError("Only CSV files are supported. Got: {ext}").
  • Reads file content to measure size; raises ValueError("File size {n}MB exceeds maximum allowed size of 50MB") if exceeded.
  • Calls file.file.seek(0) after the size check so shutil.copyfileobj writes the complete file to disk.
  • New test file tests/test_file_service.py with 11 tests covering all new validations.

Fixes #(issue number)

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update

How Has This Been Tested?

All tests run with pytest against Python 3.12 using the project's own pyproject.toml dependencies. Disk I/O is mocked so tests are fast and hermetic.

  • Existing tests pass
  • New tests added
  • Manual testing

Test results:

11 passed, 1 warning in 0.13s

New test classes:

  • TestStoreUploadExtension (6 tests) — .csv and .CSV accepted; .xlsx, .exe, no-extension, .zip raise ValueError with the actual extension in the message.
  • TestStoreUploadSize (4 tests) — under limit accepted, exactly at limit accepted, 1 byte over raises ValueError with actual MB in message.
  • TestStoreUploadPointerReset (1 test) — verifies disk file bytes match original content byte-for-byte, proving seek(0) runs before the write.

Screenshots (if applicable)

N/A

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review
  • I have added/updated documentation as needed
  • My changes generate no new warnings
  • Tests pass locally

…e_upload()

Problems fixed:
- store_upload() accepted any file extension, not just .csv
- store_upload() placed no cap on file size; huge files could silently
  consume all available disk space

Changes to app/services/file_service.py:
- Add MAX_FILE_SIZE constant (50 * 1024 * 1024 bytes)
- Validate file.filename extension (case-insensitive) before any disk
  I/O; raise ValueError("Only CSV files are supported. Got: {ext}")
  if the extension is not .csv
- Read file content to measure size; raise
  ValueError("File size {n}MB exceeds maximum allowed size of 50MB")
  if size > MAX_FILE_SIZE
- Seek file pointer back to 0 after the size check so that
  shutil.copyfileobj writes the complete file content to disk

New file tests/test_file_service.py (11 tests, 3 classes):
- TestStoreUploadExtension: .csv accepted, .CSV (uppercase) accepted,
  .xlsx / .exe / no-extension / .zip all raise ValueError with the
  correct message
- TestStoreUploadSize: under limit accepted, exactly at limit accepted,
  one byte over raises ValueError with actual MB in message
- TestStoreUploadPointerReset: disk file contains full content after
  store_upload(), proving the seek(0) is in the right place
@github-actions
Copy link
Copy Markdown

PR Review

Backend

  • Ruff lint: Run uv run ruff check --fix . in dataloom-backend/
  • Ruff format: Run uv run ruff format . in dataloom-backend/

This comment updates automatically on each push.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant