fix(file_tracking): use raw content hash for consistent change detection#2381

Merged
tusharmath merged 5 commits into main from
bug/read-detected-as-change
Feb 11, 2026

@amitksingh1490 commented Feb 11, 2026

Summary

Fix false positives in file change detection by using the raw content hash (ReadOutput.content_hash) directly instead of re-hashing truncated/formatted content. This ensures both the stored hash and the comparison hash are always derived from the same unprocessed file content, eliminating spurious "externally modified" notifications for files with long lines, more than 2000 lines, or a trailing newline.

Problem

Files that were not modified were being reported as "externally changed." This affected both read-only and written files whenever the displayed content diverged from the raw file content.

Root Cause

The hash comparison in FileChangeDetector::detect() was inconsistent:

| When | What was hashed |
| --- | --- |
| File first accessed (stored hash) | Raw full file content (fs_read.rs:164, compute_hash(&full_content)) |
| Change detection re-check | Processed content after line truncation, range limiting to 2000 lines, and .lines().join("\n") reconstruction |

These two representations diverge whenever a file has:

  • Any line longer than max_line_length (truncated by truncate_line)
  • More than 2000 lines (range-limited by resolve_range)
  • A trailing newline (stripped by .lines().join("\n"))
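
The three divergence cases above can be reproduced in isolation. The following is a minimal sketch, not the project's code: compute_hash here is a stand-in built on the standard library's DefaultHasher (the real function may use a different algorithm), process_for_display is a hypothetical approximation of the truncation and rejoining steps, and the max_line_length value of 2000 is assumed for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for the project's compute_hash (assumption: the real one may
// use a different hashing algorithm).
fn compute_hash(content: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}

// Hypothetical approximation of the display processing: keep at most
// `max_lines` lines, cut each line to `max_line_length` characters,
// then rejoin with "\n" (which drops any trailing newline).
fn process_for_display(raw: &str, max_lines: usize, max_line_length: usize) -> String {
    raw.lines()
        .take(max_lines)
        .map(|line| line.chars().take(max_line_length).collect::<String>())
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    let max_lines = 2000;
    let max_line_length = 2000; // assumed value, for illustration only

    let trailing_newline = "short line\n".to_string();
    let long_line = "x".repeat(max_line_length + 1);
    let many_lines = vec!["line"; max_lines + 1].join("\n");

    // In each case the processed text differs from the raw text,
    // so hashing the processed text yields a different hash.
    for raw in [&trailing_newline, &long_line, &many_lines] {
        let shown = process_for_display(raw, max_lines, max_line_length);
        assert_ne!(compute_hash(raw), compute_hash(&shown));
    }

    // A short file without a trailing newline passes through unchanged,
    // which is why the bug did not affect every file.
    let plain = "short line";
    let shown = process_for_display(plain, max_lines, max_line_length);
    assert_eq!(compute_hash(plain), compute_hash(&shown));
}
```

None of these files was modified, yet comparing a raw hash against a processed hash would flag each of the first three as changed.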

Fix

Changed FileChangeDetector to use ReadOutput.content_hash directly — the hash already computed from the full raw file by the read service — instead of re-hashing the processed output.content.

Before (file_tracking.rs):

let current_hash = match self.read_file_content(&file_path).await {
    Ok(content) => Some(compute_hash(&content)),  // hash of truncated content
    Err(_) => None,
};

After:

let current_hash = match self.read_file_hash(&file_path).await {
    Ok(hash) => Some(hash),  // ReadOutput.content_hash — hash of raw full file
    Err(_) => None,
};
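
To illustrate why comparing raw hashes on both sides removes the false positive, here is a simplified, synchronous sketch. Everything in it is hypothetical scaffolding: the ReadOutput struct only mirrors the two fields discussed in this PR, read and detect are illustrative helpers over in-memory maps rather than the real FileChangeDetector, the hash function is a standard-library stand-in, and treating an unreadable file as changed is an assumption of the sketch.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Stand-in hash function (assumption: not the project's algorithm).
fn compute_hash(content: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}

// Simplified, hypothetical mirror of the read result.
struct ReadOutput {
    content: String,   // processed text shown to the user
    content_hash: u64, // hash of the full raw file
}

// Hypothetical read helper: processes the text for display but hashes
// the raw input, so the hash never depends on truncation.
fn read(raw: &str, max_lines: usize) -> ReadOutput {
    ReadOutput {
        content: raw.lines().take(max_lines).collect::<Vec<_>>().join("\n"),
        content_hash: compute_hash(raw),
    }
}

// Returns the tracked paths whose current raw hash differs from the
// stored one. A path missing from `disk` is treated as changed
// (an assumption of this sketch).
fn detect(tracked: &HashMap<String, u64>, disk: &HashMap<String, String>) -> Vec<String> {
    let mut changed: Vec<String> = tracked
        .iter()
        .filter(|(path, stored)| {
            let current = disk.get(*path).map(|raw| read(raw, 2000).content_hash);
            current != Some(**stored)
        })
        .map(|(path, _)| path.clone())
        .collect();
    changed.sort();
    changed
}

fn main() {
    let raw = "fn main() {}\n".to_string();
    let out = read(&raw, 2000);
    // The displayed text lost the trailing newline...
    assert_ne!(out.content, raw);

    let mut tracked = HashMap::new();
    tracked.insert("a.rs".to_string(), out.content_hash);
    let mut disk = HashMap::new();
    disk.insert("a.rs".to_string(), raw);

    // ...but the untouched file is not reported, because both hashes
    // come from the raw content.
    assert!(detect(&tracked, &disk).is_empty());

    // A real modification is still detected.
    disk.insert("a.rs".to_string(), "fn main() { println!(); }\n".to_string());
    assert_eq!(detect(&tracked, &disk), vec!["a.rs".to_string()]);
}
```

The design point is that the hash is computed once, next to the raw read, and every consumer reuses it instead of re-deriving a hash from whatever representation it happens to hold.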

Tests

| Test | Verifies |
| --- | --- |
| test_read_file_with_matching_hash_not_detected | Read-only file with consistent raw hash produces no false positive |
| test_truncated_content_does_not_cause_false_positive | File with long lines (truncated display) matches by raw hash |
| test_truncated_written_file_not_false_positive | Written file with >2000 lines (range-limited display) matches by raw hash |
| 4 existing tests | Unchanged: real modifications, unreadable files, and duplicate prevention still work |

Changed Files

  • crates/forge_app/src/file_tracking.rs — Use ReadOutput.content_hash directly; update mock to simulate raw vs truncated content divergence (+134 −17)

@github-actions bot added the "type: fix" label (Iterations on existing features or infrastructure.) Feb 11, 2026
@amitksingh1490 marked this pull request as draft February 11, 2026 06:41
Instead of re-hashing the truncated/formatted content returned by
FsReadService, use ReadOutput.content_hash directly which is always
computed from the full raw file content. This fixes false positives
where files with long lines or >2000 lines were incorrectly reported
as externally modified.

Co-Authored-By: ForgeCode <noreply@forgecode.dev>
@amitksingh1490 force-pushed the bug/read-detected-as-change branch from fc5545d to 03a559d February 11, 2026 06:45
@amitksingh1490 changed the title from "fix(file_tracking): skip read-only files in external change detection" to "fix(file_tracking): use raw content hash for consistent change detection" Feb 11, 2026
autofix-ci Bot and others added 3 commits February 11, 2026 06:47
Add 8 new tests covering real-world scenarios:
- Read then write same file (no change / externally modified)
- Write then read back (no false positive)
- Mixed read and write across multiple files
- Read-only file externally modified (correctly detected)
- Multiple patches then detect
- Write then undo then detect
- Truncated read then write

Co-Authored-By: ForgeCode <noreply@forgecode.dev>
@amitksingh1490 marked this pull request as ready for review February 11, 2026 06:59
@tusharmath merged commit d18cdd3 into main Feb 11, 2026
11 checks passed
@tusharmath deleted the bug/read-detected-as-change branch February 11, 2026 07:12