Unable to Reproduce Paper Results with Provided Pre-trained Models

Hi, thank you for releasing your code and pre-trained models. I've been trying to reproduce the results from your paper on FFHQ and CelebA-HQ but am observing significant discrepancies from the reported performance. I wanted to document my findings in case there's something I'm missing or a known issue.

## Environment & Setup

- **Repository commit:** `eb4e77c84d00678e409343b52d9804b8ff31f467` (retrieved 2026-01-14)
- **Pre-trained models:** Iteration 700,000 checkpoints downloaded from BaiduCloud (as linked in README)
- **Test data:** FFHQ and CelebA-HQ with mixed contamination types (scribbles + rectangular image patches)
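
To rule out a corrupted or mismatched download, the checkpoint files could be fingerprinted and compared against hashes from the authors' copies. Below is a minimal sketch using only the Python standard library. The checkpoint path is a hypothetical placeholder, not the actual file name from the README.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Hypothetical path: replace with wherever the downloaded checkpoint lives.
    checkpoint = Path("checkpoints/model_700000.pth")
    if checkpoint.exists():
        print(checkpoint.name, sha256_of_file(checkpoint))
    else:
        print(f"{checkpoint} not found; adjust the path")
```

If the maintainers can post the digests of the checkpoints used for the paper, I can compare them against my local files.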

## Observed Issues

I've attached a comparison image showing Input → GT → Predicted Mask → Output → Predicted Output columns across 4 test samples.

![Image](https://github.com/user-attachments/assets/e909c7a6-db36-445d-8944-2f9d4cba7bbe)

### 1. Complete Failure on Rectangular Occlusions

The model fails entirely on square/rectangular image patches (rows 2 & 4 in the attached image):
- The calendar text obstruction remains **fully visible** in the output
- The predicted mask localizes only a small central region rather than the full occlusion
- The Output column shows extreme noise (scattered white pixels across the entire image), suggesting the mask estimation has collapsed

This contrasts with the paper's Table 3 results showing strong performance on "Image Occlusion" contamination patterns.
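
To put a number on the mask under-coverage instead of relying on visual inspection, the predicted mask could be compared against the ground-truth contamination mask. The sketch below assumes NumPy is installed and that the ground-truth mask is available from the synthesis step as a 2-D array. The function name, the 0.5 threshold, and the toy arrays are my own illustrative choices, not values from the repository or from my actual test runs.

```python
import numpy as np


def mask_overlap(pred_mask, gt_mask, threshold=0.5):
    """Return (IoU, recall) between a predicted and a ground-truth contamination mask.

    Both inputs are 2-D arrays; values above `threshold` count as contaminated.
    """
    pred = np.asarray(pred_mask) > threshold
    gt = np.asarray(gt_mask) > threshold
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    gt_area = gt.sum()
    # Two empty masks are treated as a perfect match
    iou = intersection / union if union else 1.0
    recall = intersection / gt_area if gt_area else 1.0
    return float(iou), float(recall)


# Toy data only: a 64x64 rectangular occlusion where the prediction
# covers just a 16x16 region at its center.
gt = np.zeros((256, 256))
gt[96:160, 96:160] = 1
pred = np.zeros((256, 256))
pred[120:136, 120:136] = 1

iou, recall = mask_overlap(pred, gt)
print(f"IoU={iou:.4f}, recall={recall:.4f}")  # IoU=0.0625, recall=0.0625
```

A low recall with a prediction fully inside the occlusion would match the "small central region" behavior described above.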

### 2. Poor Texture Quality on Successful Detections

Even where the model correctly identifies contamination (scribble rows 1 & 3):
- Inpainted regions exhibit over-smoothing / "plastic skin" artifacts
- Loss of high-frequency detail (pores, skin texture) compared to GT
- Color blending issues leaving visible discoloration blobs (especially around mouth/chin areas)
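
One rough way to quantify the over-smoothing is to compare the variance of the Laplacian inside the inpainted region of the output against the same region of the GT. This is a generic sharpness proxy that I am suggesting here, not a metric from the paper. The sketch assumes NumPy and grayscale float arrays, and the random texture is toy data standing in for real crops.

```python
import numpy as np


def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over the image interior.

    `gray` is a 2-D array; lower values indicate less high-frequency detail.
    """
    g = np.asarray(gray, dtype=np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())


# Toy data only: random texture versus a crude 3x3 box blur of it,
# used as a stand-in for an over-smoothed output crop.
rng = np.random.default_rng(0)
textured = rng.random((64, 64))
smoothed = sum(
    textured[dy:dy + 62, dx:dx + 62] for dy in range(3) for dx in range(3)
) / 9.0

print(laplacian_variance(textured) > laplacian_variance(smoothed))  # True
```

Applied to matching crops from the GT and Output columns, a much lower value for the output would support the loss of detail described in the list above.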

### 3. Facial Feature Reconstruction

- Lips appear undefined and blurry when occluded
- Eye reconstruction shows asymmetry and lack of definition

## Questions

1. Are the BaiduCloud checkpoints the same ones used to generate the paper's quantitative results?
2. Is there specific preprocessing required for the test images beyond resizing to 256×256?
3. Were the paper results obtained with a different contamination synthesis pipeline than what's in the released code?
4. Are there any known issues with certain contamination types (solid rectangles vs. irregular masks)?

## Attached

- `comparison.jpg`: Side-by-side comparison showing the issues described above

I'd appreciate any guidance on reproducing the reported results. Happy to provide additional details or test specific configurations if helpful.

Thanks for your time!
