Skip to content

Add visual diff base HTML snapshot for stable PR comparisons#12928

Draft
ericholscher wants to merge 2 commits intomainfrom
claude/design-visual-diff-upgrade-ZhFB0
Draft

Add visual diff base HTML snapshot for stable PR comparisons#12928
ericholscher wants to merge 2 commits intomainfrom
claude/design-visual-diff-upgrade-ZhFB0

Conversation

@ericholscher
Copy link
Copy Markdown
Member

Summary

  • Extends Snapshot base manifest on first PR build to fix stale branch diffs #12922 (file tree diff manifest snapshot) to also snapshot the base version's rendered HTML so visual diff (doc_diff) remains stable when the base branch advances
  • On first PR build, copies *.html files from the base version into the PR's diff/ storage (external/diff/{project}/{pr_slug}/base_html/)
  • Adds a new proxito route (/_/diff/{version}/base/{path}) to serve these frozen snapshots
  • Updates doc_diff.base_url in the addons API to point to the snapshot URL for external versions, falling back to the live base URL when no snapshot exists

Design decisions

  • Only *.html files are copied (via rclone --include=*.html), skipping static assets — visual diff only needs the rendered HTML to extract <main> content
  • PR-colocated storage — the snapshot lives inside the PR version's existing diff/ storage tree, so cleanup is automatic when the PR version is deleted (no reference counting needed)
  • S3-to-S3 server-side copy — new rclone copy_remote() method avoids downloading data to the builder; uses S3 CopyObject API
  • Idempotent — if a snapshot already exists, subsequent PR builds skip the copy

Depends on

Test plan

  • Verify doc_diff.base_url points to snapshot path for PR versions with a base snapshot
  • Verify fallback to live base URL when no snapshot exists
  • Verify rclone copy_remote with --include=*.html only copies HTML files
  • Verify snapshot cleanup happens automatically when PR version is deleted
  • Verify rebase clears the snapshot (follow-up, aligns with Clear base manifest snapshot on PR synchronize events #12923)

https://claude.ai/code/session_01MczHXuiDZUFE8TLFhV1xXs

When a PR is first built, copy the base version's *.html files into the
PR's diff storage (external/diff/{project}/{pr_slug}/base_html/). This
frozen snapshot ensures visual diff compares against the base state at
PR creation time, not the current tip of the base branch.

Key changes:
- Add rclone copy_remote() for S3-to-S3 server-side copies
- Add snapshot_base_html() to copy base HTML during PR post-build
- Add proxito route /_/diff/{version}/base/{path} to serve snapshots
- Update doc_diff.base_url in addons API to use snapshot when available
- Cleanup is automatic via existing PR version deletion flow

https://claude.ai/code/session_01MczHXuiDZUFE8TLFhV1xXs
doc_diff = r.json()["addons"]["doc_diff"]
# Should fall back to the live latest version URL.
assert "/en/latest/index.html" in doc_diff["base_url"]
assert "project.dev.readthedocs.io" in doc_diff["base_url"]
- Fix import ordering in serve.py: move filetreediff import to
  alphabetical position among readthedocs.* imports
- Fix test_snapshot_base_html: mock get_base_snapshot and
  write_base_snapshot directly instead of blanket storage mock
  that broke the write path

https://claude.ai/code/session_01MczHXuiDZUFE8TLFhV1xXs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants