Hi everyone and @dpascualhe,
While I was looking through the codebase to prepare for the Cityscapes/SemanticKITTI dataset adapters, I noticed a couple of things in the core utilities (segmentation_metrics.py, io.py, and coco.py) that might cause confusing errors for researchers.
Asserts vs. Exceptions
Right now, the code uses assert statements to check file paths, directory existence, and array shapes. The issue is that if someone runs their evaluation with Python's optimize flag (python -O), these asserts are ignored. This means a missing file or a mismatched mask shape will bypass the check and cause a really cryptic PyTorch/NumPy crash later on.
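As a hypothetical illustration (the function names below are placeholders of mine, not the actual io.py code), compare the two styles:

```python
import os

def load_labels_assert(path):
    # Stripped entirely under `python -O`; a missing file then surfaces
    # later as an opaque error deep inside NumPy/PyTorch.
    assert os.path.isfile(path), f"label file not found: {path}"
    ...

def load_labels_explicit(path):
    # Survives -O and fails fast with an actionable message.
    if not os.path.isfile(path):
        raise FileNotFoundError(f"label file not found: {path}")
    ...
```

The explicit check also lets callers catch a specific exception type instead of a bare AssertionError.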
Silent NaN bias in Metrics
While looking at segmentation_metrics.py, I also noticed that if a class is completely missing from an image (which happens a lot in urban datasets), the denominator becomes zero, resulting in a NaN. Because we use np.nanmean, these missing classes are silently dropped from the average, which might accidentally inflate the final Mean IoU score without the user knowing.
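A tiny sketch of the effect (the class IDs and arrays are made up for illustration, not taken from get_iou):

```python
import numpy as np

gt   = np.array([0, 0, 1, 1])   # class 2 never appears in this image
pred = np.array([0, 1, 1, 1])

ious = []
for c in range(3):
    inter = np.sum((gt == c) & (pred == c))
    union = np.sum((gt == c) | (pred == c))
    ious.append(inter / union if union else np.nan)  # class 2 -> NaN

# np.nanmean silently averages over 2 classes instead of 3, so the
# reported mIoU only reflects the classes that happened to be present.
miou = np.nanmean(ious)
```

Whether that is the right behaviour depends on the benchmark, which is why I think the policy should be explicit rather than implicit.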
What I'd like to do in a PR:
Replace the assert checks with explicit FileNotFoundError, NotADirectoryError, and ValueError so the code fails fast with clear messages.
Add a safe division helper or explicit NaN policy in get_iou so missing classes are handled properly (and renormalize the weighted averages).
Add a couple of simple pytest cases to make sure the error messages trigger and the macro-IoU handles missing classes correctly.
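A rough sketch of what the safe-division helper and an explicit missing-class policy could look like (function and parameter names here are placeholders, not the current get_iou API):

```python
import numpy as np

def safe_div(num, den, empty=np.nan):
    """Elementwise num/den, returning `empty` wherever den == 0."""
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    out = np.full_like(num, empty, dtype=float)
    np.divide(num, den, out=out, where=den != 0)
    return out

def macro_iou(ious, missing="drop"):
    """Average per-class IoUs under an explicit missing-class policy:
    'drop' ignores NaN classes (today's implicit behaviour, made explicit);
    'zero' counts them as 0, penalising classes the model never predicts."""
    ious = np.asarray(ious, dtype=float)
    if missing == "zero":
        return float(np.nan_to_num(ious, nan=0.0).mean())
    return float(np.nanmean(ious))
```

With something like this in place, the pytest cases reduce to asserting that `safe_div` never raises on a zero denominator and that the two policies produce the expected averages on a small fixture.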
I'd love to implement this minimal patch to make the foundation a bit more solid before we start bringing in the massive new datasets. Let me know if this sounds like a good plan and I'll open a PR!