[Enhancement + Bug] Better error handling for data ingestion and missing classes #546

@karttikjangid

Hi everyone and @dpascualhe,

While I was looking through the codebase to prepare for the Cityscapes/SemanticKITTI dataset adapters, I noticed a couple of things in the core utilities (segmentation_metrics.py, io.py, and coco.py) that might cause confusing errors for researchers.

  1. Asserts vs. Exceptions
    Right now, the code uses assert statements to check file paths, directory existence, and array shapes. The problem is that if someone runs their evaluation with Python's optimize flag (python -O), assert statements are stripped out and never run. A missing file or a mismatched mask shape then slips past the check and only surfaces later as a really cryptic PyTorch/NumPy crash.

  2. Silent NaN bias in Metrics
    While looking at segmentation_metrics.py, I also noticed that if a class is completely missing from an image (which happens a lot in urban datasets), the IoU denominator becomes zero and the result is NaN. Because we use np.nanmean, these missing classes are silently dropped from the average, which can inflate the final Mean IoU score without the user knowing.
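To make the second point concrete, here is a tiny standalone illustration. The counts are made up and this is not the actual code in segmentation_metrics.py; it only shows how np.nanmean changes the number of classes being averaged.

```python
import numpy as np

# Made-up per-class pixel counts for a 3-class image.
# Class 2 does not appear in the ground truth or the prediction.
intersection = np.array([50.0, 10.0, 0.0])
union = np.array([100.0, 40.0, 0.0])

with np.errstate(divide="ignore", invalid="ignore"):
    iou = intersection / union  # 0/0 gives NaN for the absent class

# nanmean averages over the 2 classes that are present ...
mean_skipping_nan = np.nanmean(iou)
# ... while counting the absent class as 0 averages over all 3.
mean_counting_zero = np.nan_to_num(iou).mean()

print(iou, mean_skipping_nan, mean_counting_zero)
```

Neither number is wrong on its own; the point is that the choice is currently made implicitly and never reported to the user.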

What I'd like to do in a PR:

Replace the assert checks with explicit FileNotFoundError, NotADirectoryError, and ValueError so the code fails fast with a clear message, even under python -O.
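Roughly what I have in mind is below. This is only a sketch: the helper names (require_file, require_dir, require_same_shape) are placeholders I made up, not existing functions in io.py or coco.py.

```python
from pathlib import Path


def require_file(path):
    """Return `path` as a Path, or raise if it is not an existing file."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"Not an existing file: {p}")
    return p


def require_dir(path):
    """Return `path` as a Path, or raise if it is missing or not a directory."""
    p = Path(path)
    if not p.exists():
        raise FileNotFoundError(f"Directory does not exist: {p}")
    if not p.is_dir():
        raise NotADirectoryError(f"Path exists but is not a directory: {p}")
    return p


def require_same_shape(pred, gt):
    """Raise if two array-like objects (anything with .shape) differ in shape."""
    if pred.shape != gt.shape:
        raise ValueError(
            f"Prediction shape {pred.shape} does not match "
            f"ground-truth shape {gt.shape}"
        )
```

Unlike assert, these checks still run under python -O, and the exception type tells the user what kind of problem they hit.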

Add a safe division helper or an explicit NaN policy in get_iou so missing classes are handled deliberately rather than silently, and renormalize the weighted averages accordingly.

Add a couple of simple pytest cases to make sure the new exceptions are raised and that the macro-IoU handles missing classes correctly.
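For the metrics part, a minimal sketch of the idea could look like this. safe_iou, macro_iou, and the empty_value parameter are hypothetical names for illustration; the real change would go into get_iou and follow whatever NaN policy we agree on.

```python
import numpy as np


def safe_iou(intersection, union, empty_value=np.nan):
    """Per-class IoU; classes with union == 0 get `empty_value`, not 0/0."""
    intersection = np.asarray(intersection, dtype=float)
    union = np.asarray(union, dtype=float)
    iou = np.full(union.shape, empty_value, dtype=float)
    # Only divide where the class is present; other entries keep empty_value.
    np.divide(intersection, union, out=iou, where=union > 0)
    return iou


def macro_iou(intersection, union):
    """Mean IoU over present classes, plus the number of classes averaged."""
    iou = safe_iou(intersection, union)
    present = ~np.isnan(iou)
    if not present.any():
        raise ValueError("No class is present; mean IoU is undefined.")
    return float(iou[present].mean()), int(present.sum())


def test_macro_iou_reports_missing_classes():
    # Class 2 is absent, so only 2 of the 3 classes enter the mean.
    mean, n_present = macro_iou([50, 10, 0], [100, 40, 0])
    assert n_present == 2
    assert abs(mean - 0.375) < 1e-12
```

Returning the number of classes that went into the mean keeps the current nanmean-style value but makes the dropped classes visible to the caller.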

I'd love to implement this minimal patch to make the foundation a bit more solid before we start bringing in the massive new datasets. Let me know if this sounds like a good plan and I'll open a PR!
