An end-to-end PyTorch implementation of U-Net for binary image segmentation: from data preparation through the encoder–decoder with skip connections to training, evaluation, and visualization. Applied to brain tumor segmentation, a challenging medical imaging task with subtle boundaries and class imbalance.
U‑Net‑PyTorch/
├── U‑Net/ # directory containing the notebook and checkpoint
│ ├── U‑Net.ipynb # end-to-end pipeline notebook (brain-tumor segmentation)
│ │ ├── Dataset loading & preprocessing
│ │ ├── U‑Net architecture definition
│ │ ├── Loss & optimizer setup
│ │ ├── Training loop with logging & checkpointing
│ │ ├── Dice coefficient evaluation
│ │ └── Visualization of predictions vs. ground truth
│ └── checkpoint.pth # best-epoch checkpoint containing:
│ ├── epoch
│ ├── model_state_dict
│ ├── optimizer_state_dict
│ ├── lr_scheduler_state_dict
│ ├── loss
│ └── dice_score
├── segmentation_comparison.png # overlay: image + ground truth mask | image + predicted mask
├── mask_comparison.png # side‑by‑side: image | ground truth | prediction
├── LICENSE # MIT License text
└── README.md # this document
Benchmark run (64 epochs):
- Train
  - Dice Score: 0.93
  - Loss: 0.0948
- Test
  - Dice Score: 0.80
  - Loss: 0.2422
Trained weights are included at U‑Net/checkpoint.pth.
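The checkpoint format described above can be sketched as follows. A tiny stand-in model replaces the full U-Net so the snippet is self-contained; the key names mirror those listed in the project layout, but the exact model, optimizer, and scheduler classes in the notebook may differ.

```python
import torch
import torch.nn as nn

# Stand-in components (the notebook uses the full U-Net, its optimizer, and scheduler).
model = nn.Conv2d(1, 1, kernel_size=3, padding=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)

# Save with the same keys the repository's checkpoint uses.
torch.save({
    "epoch": 64,
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
    "lr_scheduler_state_dict": scheduler.state_dict(),
    "loss": 0.0948,
    "dice_score": 0.93,
}, "checkpoint.pth")

# Restore for inference.
ckpt = torch.load("checkpoint.pth", map_location="cpu")
model.load_state_dict(ckpt["model_state_dict"])
model.eval()
```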
Below is an example output produced by the U-Net pipeline during evaluation:
• Python 3.8+
• pip
• (Optional) CUDA‑enabled GPU
Install packages:
pip install torch torchvision numpy matplotlib pillow tqdm
# optional
pip install opencv-python albumentations jupyterlab
- Clone the repository
git clone https://github.com/franciszekparma/U-Net-PyTorch.git
cd U-Net-PyTorch
- Launch Jupyter
jupyter lab # or: jupyter notebook
- Open and run
U‑Net/U‑Net.ipynb
Expected layout (customize paths in the notebook if needed):
DATASET_ROOT/
├── segmentation_task/
│ ├── train/
│ │ ├── images/
│ │ └── masks/
│ └── test/
│ ├── images/
│ └── masks/
...
• Masks are interpreted as binary; if stored as {0, 255}, they are normalized to {0, 1}.
• Images are resized/normalized in transforms; keep image size consistent across training/eval.
• For an imbalanced foreground, consider stronger augmentation, changing the loss, or weighting a particular term of the loss.
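The mask handling above can be sketched as a small helper; the threshold of 127 is an assumption for masks stored as {0, 255}:

```python
import numpy as np

def normalize_mask(mask: np.ndarray) -> np.ndarray:
    """Binarize a {0, 255} mask to {0, 1}: pixels above 127 become foreground."""
    return (mask > 127).astype(np.float32)

mask = np.array([[0, 255], [255, 0]], dtype=np.uint8)
binary = normalize_mask(mask)  # [[0. 1.] [1. 0.]]
```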
- Epochs: 64 (reference)
- Metric: Dice coefficient (reported on train/test)
- Loss: BCEWithLogitsLoss + Dice
- Checkpointing: best weights saved to U‑Net/checkpoint.pth
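A BCEWithLogitsLoss + Dice objective, as named above, can be sketched like this; the smoothing constant and the equal weighting of the two terms are assumptions, and the notebook's exact formulation may differ:

```python
import torch
import torch.nn as nn

class BCEDiceLoss(nn.Module):
    """Sum of binary cross-entropy (on logits) and soft Dice loss."""

    def __init__(self, smooth: float = 1.0):
        super().__init__()
        self.bce = nn.BCEWithLogitsLoss()
        self.smooth = smooth

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        bce = self.bce(logits, targets)
        probs = torch.sigmoid(logits)
        intersection = (probs * targets).sum()
        dice = (2 * intersection + self.smooth) / (
            probs.sum() + targets.sum() + self.smooth
        )
        return bce + (1 - dice)  # both terms decrease as predictions improve

loss_fn = BCEDiceLoss()
logits = torch.randn(2, 1, 64, 64)
targets = torch.randint(0, 2, (2, 1, 64, 64)).float()
loss = loss_fn(logits, targets)
```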
• Start with a moderate image size if GPU memory is limited.
• Use stronger data augmentation techniques (Horizontal Flip, ShiftScaleRotate, Blur, etc.)
• Use the Albumentations library for image augmentation (strongly recommended)
• Monitor Dice and loss together; verify thresholds used for binarization.
• Save state_dict for portability.
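Verifying the binarization threshold, as suggested above, can be done with an explicit Dice helper; the default threshold of 0.5 and the epsilon term are assumptions:

```python
import torch

def dice_coefficient(logits: torch.Tensor, targets: torch.Tensor,
                     threshold: float = 0.5, eps: float = 1e-7) -> float:
    """Dice score between thresholded sigmoid predictions and binary targets."""
    preds = (torch.sigmoid(logits) > threshold).float()
    intersection = (preds * targets).sum()
    return ((2 * intersection + eps) / (preds.sum() + targets.sum() + eps)).item()

# A confidently correct prediction scores ~1.0.
targets = torch.ones(1, 1, 4, 4)
logits = torch.full((1, 1, 4, 4), 10.0)  # sigmoid(10) ≈ 1 → all foreground
score = dice_coefficient(logits, targets)
```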
• CUDA out of memory → reduce batch size or image size; ensure tensors are moved off GPU when not needed.
• All‑black or all‑white outputs → check mask normalization and loss/thresholding.
• Tensor size mismatch on skip connections → verify resize/crop/stride consistency.
• Experiment with the code! Hands-on modification is the best way to internalize both the implementation and the underlying theory.
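One common fix for the skip-connection size mismatch mentioned above is to interpolate the decoder feature map to the encoder feature map's spatial size before concatenating. The function name here is illustrative, not taken from the notebook:

```python
import torch
import torch.nn.functional as F

def join_skip(decoder_feat: torch.Tensor, encoder_feat: torch.Tensor) -> torch.Tensor:
    """Concatenate a decoder feature map with its skip connection,
    resizing the decoder side if the spatial dimensions disagree."""
    if decoder_feat.shape[2:] != encoder_feat.shape[2:]:
        decoder_feat = F.interpolate(decoder_feat, size=encoder_feat.shape[2:],
                                     mode="bilinear", align_corners=False)
    return torch.cat([encoder_feat, decoder_feat], dim=1)

enc = torch.randn(1, 64, 57, 57)  # odd size from pooling a non-power-of-two input
dec = torch.randn(1, 64, 56, 56)
out = join_skip(dec, enc)  # shape: (1, 128, 57, 57)
```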
Issues and PRs are welcome: bug fixes, training tips, alternative losses (Focal/Tversky/...), multi‑class extensions, documentation improvements, and other improvements to the implementation.
• Ronneberger, Fischer, Brox — U‑Net: Convolutional Networks for Biomedical Image Segmentation (MICCAI 2015)
• Creators of BRISC 2025 dataset
This project is licensed under the MIT License.
© franciszekparma

