fix objdet confusion matrix OOM #797

Merged
merged 9 commits into from
Oct 11, 2024

@czaloom czaloom commented Oct 11, 2024

Changes

Tweaked how boolean masks are created within the confusion matrix computation.

  • Confusion matrix computation takes about 60% less time.
    • 0.2s vs. 0.52s for the previous best.
  • No more OOM for reasonably large and dense datasets.
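
For reference, the two timings quoted above work out as follows. This is plain arithmetic on the reported numbers, not a new benchmark, and the variable names are only illustrative.

```python
# Timings reported in this PR, in seconds.
previous_best = 0.52
new_time = 0.2

# Fraction of wall-clock time saved relative to the previous best.
time_saved = (previous_best - new_time) / previous_best

# Speedup factor: how many times faster the new computation runs.
speedup = previous_best / new_time

print(f"time saved: {time_saved:.1%}")  # time saved: 61.5%
print(f"speedup: {speedup:.1f}x")       # speedup: 2.6x
```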

Testing

Tested with up to 10,000 datums, each with 300 ground-truth and 300 predicted bounding boxes, across 5 labels.

{
    'n_datums': 10_000,
    'n_groundtruths': 3_000_000,
    'n_predictions': 3_000_000,
    'n_labels': 5,
}
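
A rough sense of why dense datasets are expensive: if every ground truth is paired with every prediction in the same datum (an assumption about how pairs are formed, not something this PR states), the number of pairs grows with the square of the boxes per datum. The `pair_stats` helper below is hypothetical and only does the arithmetic.

```python
def pair_stats(n_datums: int, boxes_per_datum: int) -> dict[str, int]:
    """Count boxes and within-datum ground-truth/prediction pairs.

    Assumes each datum has `boxes_per_datum` ground truths and the same
    number of predictions, and that pairs are only formed within a datum.
    """
    n_groundtruths = n_datums * boxes_per_datum
    n_predictions = n_datums * boxes_per_datum
    n_pairs = n_datums * boxes_per_datum * boxes_per_datum
    return {
        "n_groundtruths": n_groundtruths,
        "n_predictions": n_predictions,
        "n_pairs": n_pairs,
        # A dense NumPy boolean array stores one byte per element.
        "bytes_per_boolean_mask": n_pairs,
    }


tested = pair_stats(n_datums=10_000, boxes_per_datum=300)
print(tested["n_groundtruths"])  # 3000000
print(tested["n_pairs"])         # 900000000
```

Under that assumption, a single boolean mask over all pairs in the tested configuration would take about 0.9 GB, so every extra full-size mask held in memory at the same time matters.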

Original Issue

The confusion matrix computation runs out of memory (OOM) once the dataset grows past a certain size.

Reproducible example created by @rsbowman-striveworks

from random import choice, uniform

from valor_lite.object_detection import BoundingBox, DataLoader, Detection

def _generate_random_detections(
    n_detections: int, n_boxes: int, labels: str
) -> list[Detection]:
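    def bbox(is_prediction: bool) -> BoundingBox:
        # Random box with xmin <= xmax and ymin <= ymax.
        xmin, ymin = uniform(0, 10), uniform(0, 10)
        xmax, ymax = uniform(xmin, 15), uniform(ymin, 15)
        # Only predictions carry a confidence score.
        kw = {"scores": [uniform(0, 1)]} if is_prediction else {}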
        return BoundingBox(
            xmin,
            xmax,
            ymin,
            ymax,
            [choice(labels)],
            **kw,
        )

    return [
        Detection(
            uid=f"uid{i}",
            groundtruths=[bbox(is_prediction=False) for _ in range(n_boxes)],
            predictions=[bbox(is_prediction=True) for _ in range(n_boxes)],
        )
        for i in range(n_detections)
    ]

def test_fuzz_confusion_matrix():
    # 1,000 datums, each with 30 ground truths and 30 predictions over labels "a" to "e".
    dets = _generate_random_detections(1000, 30, "abcde")
    loader = DataLoader()
    loader.add_bounding_boxes(dets)
    evaluator = loader.finalize()
    evaluator.evaluate(
        iou_thresholds=[0.25, 0.75],
        score_thresholds=[0.5],
    )
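
To check a run like the one above for time and memory growth, the standard library's `tracemalloc` and `time.perf_counter` can wrap any callable. The `measure` helper and the stand-in workload below are hypothetical; in practice the callable would be the test function above. `tracemalloc` only sees allocations routed through Python's memory allocators, so the reported peak is best read as a lower bound.

```python
import time
import tracemalloc
from typing import Any, Callable


def measure(fn: Callable[[], Any]) -> tuple[float, int]:
    """Return (elapsed seconds, peak traced bytes) for a single call to fn."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        fn()
    finally:
        # Always stop tracing, even if fn raises.
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return elapsed, peak


def stand_in_workload() -> None:
    # Placeholder allocation; swap in test_fuzz_confusion_matrix to profile it.
    _ = [0] * 1_000_000


elapsed, peak = measure(stand_in_workload)
print(f"elapsed: {elapsed:.3f}s, peak: {peak / 1e6:.1f} MB")
```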

@czaloom added the bug and improvement labels Oct 11, 2024
@czaloom self-assigned this Oct 11, 2024
@czaloom linked an issue Oct 11, 2024 that may be closed by this pull request
@czaloom marked this pull request as ready for review October 11, 2024 21:37
@czaloom merged commit 667a31a into main Oct 11, 2024 (16 checks passed)
@czaloom deleted the czaloom-objdet-confusion-matrix-oom branch October 11, 2024 21:56
Labels: bug, improvement
Linked issue: BUG: Obj Det confusion matrix OOM
2 participants