
DetectionMetricsFactory AP/mAP computation evaluates predictions unsorted, causing precision degradation #544

@shardulmahamulkar

Description


When performing standard detection evaluation, the established protocols (such as COCO and PASCAL VOC) require predicted bounding boxes to be matched to ground truth in descending order of their confidence scores.

Currently, DetectionMetricsFactory._match_predictions() processes predictions in whatever arbitrary order they appear in the input arrays. This causes a critical evaluation error: a lower-confidence bounding box that happens to appear earlier in the list can claim the "True Positive" match for a ground-truth object. A subsequent, more accurate high-confidence box targeting the same object then finds no available ground truth and is logged as a "False Positive".

This distorts precision across the recall range and results in an unjustly penalized AP/mAP score.

Steps to reproduce the behavior:

  • Construct a scenario where a ground-truth object has two overlapping predicted bounding boxes.
  • Box A (Score: 0.5) overlaps the object and appears at index 0.
  • Box B (Score: 0.99) overlaps the object very tightly and appears at index 1.
  • Run metrics_factory.update().
  • Measure AP. Because Box A claims the True Positive, Box B is counted as a False Positive, penalizing precision at high confidence thresholds.
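The steps above can be reproduced with a standalone sketch of greedy IoU matching, independent of the library (all names below are illustrative, not the actual perceptionmetrics API):

```python
import numpy as np

def iou(a, b):
    # Intersection-over-union of two [x1, y1, x2, y2] boxes.
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def greedy_match(pred_boxes, pred_scores, gt_boxes, iou_thr=0.5, sort=True):
    """Return (score, is_tp) per prediction, in evaluation order."""
    order = np.argsort(pred_scores)[::-1] if sort else np.arange(len(pred_scores))
    matched = set()
    results = []
    for i in order:
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gt_boxes):
            if j in matched:
                continue  # each GT may be claimed only once
            v = iou(pred_boxes[i], gt)
            if v > best_iou:
                best_iou, best_j = v, j
        if best_iou >= iou_thr and best_j >= 0:
            matched.add(best_j)
            results.append((pred_scores[i], True))
        else:
            results.append((pred_scores[i], False))
    return results

gt = [[0, 0, 10, 10]]
preds = [[0, 0, 8, 10],    # Box A: loose overlap, index 0
         [0, 0, 10, 10]]   # Box B: perfect overlap, index 1
scores = [0.5, 0.99]

unsorted_res = greedy_match(preds, scores, gt, sort=False)
sorted_res = greedy_match(preds, scores, gt, sort=True)
```

With `sort=False`, Box A (score 0.5) claims the True Positive and Box B (score 0.99) becomes a False Positive; with `sort=True`, the assignment is reversed as the protocols require.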

Expected behavior

  • Predictions should be matched strictly in score-descending order, so the highest-confidence outputs claim ground-truth matches first.
  • This keeps the matching consistent with the score-ordered precision/recall curve and yields the correct AP calculation.
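To quantify the impact, here is a minimal VOC-style all-points AP computation (illustrative, not the library's implementation) applied to the scenario above. Since the PR curve always orders detections by descending score, the buggy matching yields the flag sequence [FP, TP] and the fixed matching yields [TP, FP]:

```python
import numpy as np

def average_precision(is_tp, n_gt):
    """All-points (VOC-style) AP from TP/FP flags sorted by descending score."""
    is_tp = np.asarray(is_tp, dtype=bool)
    tp = np.cumsum(is_tp)
    fp = np.cumsum(~is_tp)
    recall = tp / n_gt
    precision = tp / (tp + fp)
    # Monotone (right-to-left maximum) precision envelope.
    precision = np.maximum.accumulate(precision[::-1])[::-1]
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recall, precision):
        ap += (r - prev_r) * p  # area under the envelope
        prev_r = r
    return float(ap)

# One GT object; flags are in score-descending order (0.99 first, then 0.5).
ap_buggy = average_precision([False, True], n_gt=1)  # unsorted matching: 0.99 -> FP
ap_fixed = average_precision([True, False], n_gt=1)  # sorted matching:   0.99 -> TP
```

Here `ap_buggy` evaluates to 0.5 while `ap_fixed` evaluates to 1.0: the arbitrary matching order halves the AP for an otherwise perfect top detection.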

Proposed Fix

Modify _match_predictions() in perceptionmetrics/utils/detection_metrics.py to compute score-descending sort indices via np.argsort() and iterate over them:

# perceptionmetrics/utils/detection_metrics.py: ~L130
ious = compute_iou_matrix(pred_boxes, gt_boxes)
# FIX: iterate over predictions in descending score order
sorted_indices = np.argsort(-np.asarray(pred_scores))
for i in sorted_indices:
    p_box = pred_boxes[i]
    p_label = pred_labels[i]
    score = pred_scores[i]

    max_iou = 0.0
    max_j = -1
    ...

Environment Details

OS: all (platform-independent)
File: perceptionmetrics/utils/detection_metrics.py
