-
Notifications
You must be signed in to change notification settings - Fork 590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up evaluation with r-trees to find overlapping detections #4758
Speed up evaluation with r-trees to find overlapping detections #4758
Conversation
WalkthroughThe changes involve enhancements to the Intersection over Union (IoU) calculations in the FiftyOne library, focusing on improving code efficiency and clarity. Key modifications include streamlining IoU computation logic, refining return types in the IoU module, and adding new testing functionalities. Additionally, the library now includes a new dependency, "rtree," for spatial indexing, and adjustments in the 3D utilities for better instance variable management. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant IoU_Calculator
participant RTree_Index
User->>IoU_Calculator: Request IoU Calculation
IoU_Calculator->>RTree_Index: Build R-tree for Ground Truths
RTree_Index-->>IoU_Calculator: R-tree Ready
IoU_Calculator->>IoU_Calculator: Compute IoUs
IoU_Calculator-->>User: Return IoU Results
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Files selected for processing (7)
- fiftyone/utils/eval/coco.py (1 hunks)
- fiftyone/utils/eval/openimages.py (1 hunks)
- fiftyone/utils/iou.py (7 hunks)
- fiftyone/utils/utils3d.py (1 hunks)
- setup.py (1 hunks)
- tests/unittests/evaluation_tests.py (1 hunks)
- tests/unittests/utils3d_tests.py (2 hunks)
Additional context used
Ruff
fiftyone/utils/iou.py
553-553: Loop control variable
i
not used within loop body(B007)
650-653: Use ternary operator
union = pred_area if gt_crowd else pred_area + gt_area - inter
instead ofif
-else
-blockReplace
if
-else
-block withunion = pred_area if gt_crowd else pred_area + gt_area - inter
(SIM108)
666-666: Pointless comparison. Did you mean to assign a value? Otherwise, prepend
assert
or remove it.(B015)
Additional comments not posted (13)
setup.py (1)
63-63
: Approve addition of "rtree" dependency.The addition of "rtree" is aligned with the PR's objectives to enhance spatial indexing capabilities. Ensure compatibility with existing dependencies to prevent conflicts.
tests/unittests/utils3d_tests.py (1)
13-13
: Approve import ofnumpy.testing
asnptest
.The use of
numpy.testing
for assertions is appropriate for testing numerical computations in this context.fiftyone/utils/eval/coco.py (1)
530-532
: Approve optimization in_coco_evaluation_setup
.The changes to compute IoUs only for overlapping detections and directly update the
pred_ious
dictionary are likely to improve performance significantly. Ensure thorough testing to verify the accuracy of the IoU calculations.Verification successful
Approve optimization in
_coco_evaluation_setup
.The changes to compute IoUs only for overlapping detections and directly update the
pred_ious
dictionary are likely to improve performance significantly. Ensure thorough testing to verify the accuracy of the IoU calculations, as the function is used internally within the module.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the accuracy of IoU calculations after optimization. # Test: Search for the function usage. Expect: Only occurrences of the new signature. rg --type python -A 5 $'_coco_evaluation_setup'Length of output: 77
Script:
#!/bin/bash # Description: Verify the accuracy of IoU calculations after optimization. # Test: Search for the function usage. Expect: Only occurrences of the new signature. rg --type py -A 5 '_coco_evaluation_setup'Length of output: 1077
fiftyone/utils/iou.py (6)
88-92
: Approved: Changes tocompute_ious
function.The modifications to return a dictionary instead of an array when there are no predictions or ground truths enhance clarity and error handling. Ensure that unit tests cover these new return conditions.
Line range hint
522-568
: Approved: Enhancements to_compute_bbox_ious
with r-tree indexing.The integration of r-tree indexing for bounding box IoUs is a significant improvement that should enhance performance. Confirm these performance gains with benchmark tests.
Tools
Ruff
553-553: Loop control variable
i
not used within loop body(B007)
Line range hint
568-660
: Approved: Use of r-tree in_compute_polygon_ious
.Applying r-tree indexing to polygon IoUs is expected to enhance performance significantly. Consider adding more robust error handling for geometric operations to prevent potential issues with complex polygon interactions.
Tools
Ruff
650-653: Use ternary operator
union = pred_area if gt_crowd else pred_area + gt_area - inter
instead ofif
-else
-blockReplace
if
-else
-block withunion = pred_area if gt_crowd else pred_area + gt_area - inter
(SIM108)
666-666: Pointless comparison. Did you mean to assign a value? Otherwise, prepend
assert
or remove it.(B015)
747-757
: Approved: Changes to_compute_keypoint_similarities
.The update to return a dictionary instead of an array is consistent with other changes in the module and improves the handling of output data.
492-515
: Approved: Addition of_get_detection_box
.This new function effectively abstracts the extraction of bounding box dimensions, supporting both 2D and 3D detections. It enhances modularity and reusability within the module.
518-520
: Approved: Addition of_get_poly_box
.The function simplifies the process of obtaining bounding boxes for polygons by leveraging the newly added
_get_detection_box
. This is a practical enhancement that supports the use of r-trees in polygon IoU calculations.fiftyone/utils/utils3d.py (1)
170-170
: Approved change to rotation handling in_Box
class.The modification to store the
rotation
parameter as an instance variable (self.rotation
) enhances code clarity and maintainability by ensuring consistent use of the rotation matrix across the class's methods.tests/unittests/evaluation_tests.py (1)
1939-1947
: Refine IoU checking logic with edge case handling.The modifications to the
_check_iou
method enhance its robustness by explicitly handling the scenario when no overlap exists (expected IoU is zero). This is a critical improvement for accuracy in edge cases.Consider adding a comment explaining why the result should be an empty list when
expected_iou
is zero, to aid future maintainability.fiftyone/utils/eval/openimages.py (2)
Line range hint
25-93
: Configuration class is well-structured and documented.The
OpenImagesEvaluationConfig
class is comprehensive and includes detailed documentation for each parameter, which enhances maintainability and usability.
Line range hint
95-567
: Verify changes in evaluation logic.The
evaluate
method in theOpenImagesEvaluation
class has been modified to incorporate optimized IoU computation using r-trees. It is crucial to ensure that these changes are correctly implemented and align with the PR objectives to speed up the evaluation process.
fiftyone/utils/iou.py
Outdated
sims = {} | ||
for pred in preds: | ||
sims[pred.id] == [] | ||
for gt in gts: | ||
if classwise and pred.label != gt.label: | ||
continue | ||
|
||
gtp = list(itertools.chain.from_iterable(gt.points)) | ||
predp = list(itertools.chain.from_iterable(pred.points)) | ||
sims[i, j] = _compute_object_keypoint_similarity(gtp, predp) | ||
sim = _compute_object_keypoint_similarity(gtp, predp) | ||
sims[pred.id].append((gt.id, sim)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix required: Bug in _compute_polyline_similarities
.
The function should assign an empty list to sims[pred.id]
instead of using a comparison. Correct this to ensure proper functionality.
Apply this diff to fix the bug:
- sims[pred.id] == []
+ sims[pred.id] = []
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
sims = {} | |
for pred in preds: | |
sims[pred.id] == [] | |
for gt in gts: | |
if classwise and pred.label != gt.label: | |
continue | |
gtp = list(itertools.chain.from_iterable(gt.points)) | |
predp = list(itertools.chain.from_iterable(pred.points)) | |
sims[i, j] = _compute_object_keypoint_similarity(gtp, predp) | |
sim = _compute_object_keypoint_similarity(gtp, predp) | |
sims[pred.id].append((gt.id, sim)) | |
sims = {} | |
for pred in preds: | |
sims[pred.id] = [] | |
for gt in gts: | |
if classwise and pred.label != gt.label: | |
continue | |
gtp = list(itertools.chain.from_iterable(gt.points)) | |
predp = list(itertools.chain.from_iterable(pred.points)) | |
sim = _compute_object_keypoint_similarity(gtp, predp) | |
sims[pred.id].append((gt.id, sim)) |
Tools
Ruff
666-666: Pointless comparison. Did you mean to assign a value? Otherwise, prepend
assert
or remove it.(B015)
a656eee
to
02d7fde
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Files selected for processing (1)
- fiftyone/utils/iou.py (7 hunks)
Additional context used
Ruff
fiftyone/utils/iou.py
553-553: Loop control variable
i
not used within loop body(B007)
650-653: Use ternary operator
union = pred_area if gt_crowd else pred_area + gt_area - inter
instead ofif
-else
-blockReplace
if
-else
-block withunion = pred_area if gt_crowd else pred_area + gt_area - inter
(SIM108)
Additional comments not posted (9)
fiftyone/utils/iou.py (9)
14-14
: Addition ofrtree
import.The import of
rtree.index
asrti
is crucial for the new spatial indexing feature. Ensure that thertree
package is included in the project dependencies.
23-23
: Import fromutils3d
.The import of
compute_cuboid_iou
and_Box
fromutils3d
is essential for supporting 3D bounding box calculations. This aligns with the PR's objective to handle 3D detections.
88-92
: Handling empty inputs incompute_ious
.The function now returns an empty dictionary if
preds
orgts
are empty. This change improves clarity and prevents unnecessary computations.
492-515
: New function_get_detection_box
.This function is a significant addition, supporting both 2D and 3D bounding box calculations. It uses the
_Box
class for 3D calculations, which is consistent with the updated handling of 3D detections.
518-520
: New function_get_poly_box
.This helper function converts a polygon to a detection and then uses
_get_detection_box
to get its bounding box. It's a good example of code reuse and modularity.
539-565
: Integration ofrtree
for bounding box IoUs.The use of
rtree
to create spatial indexes for ground truths is a key optimization. This change allows the system to only compute IoUs for overlapping detections, significantly improving performance.Tools
Ruff
553-553: Loop control variable
i
not used within loop body(B007)
610-658
: Use ofrtree
for polygon IoUs.Similar to bounding boxes, polygons now also benefit from spatial indexing. This optimization is expected to reduce computational overhead significantly when dealing with complex geometries.
Tools
Ruff
650-653: Use ternary operator
union = pred_area if gt_crowd else pred_area + gt_area - inter
instead ofif
-else
-blockReplace
if
-else
-block withunion = pred_area if gt_crowd else pred_area + gt_area - inter
(SIM108)
664-674
: Updated return type for_compute_polyline_similarities
.The function now returns a dictionary instead of an array, aligning with changes in other similar functions. This consistency in return types across functions improves the API's predictability.
747-757
: Updated return type for_compute_keypoint_similarities
.Changing the return type to a dictionary enhances consistency with other similarity and IoU functions, facilitating easier integration and usage within the broader system.
rtree_index.insert(i, box) | ||
|
||
pred_ious = {} | ||
for i, pred in enumerate(preds): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Loop variable i
not used.
The loop variable i
in the intersection check is not used within the loop body. Consider removing it if it's truly unnecessary, or clarify its purpose if it's intended for future use.
Tools
Ruff
553-553: Loop control variable
i
not used within loop body(B007)
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## optimize_evaluation #4758 +/- ##
======================================================
Coverage ? 99.12%
======================================================
Files ? 49
Lines ? 16690
Branches ? 0
======================================================
Hits ? 16544
Misses ? 146
Partials ? 0
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Thanks for your contribution @kemaleren ! Someone will review your proposal. Re: saving results, we have been looking into that separately. Stay tuned. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @kemaleren, sorry it took this long for us to get to your PR!
(@benjaminpkane : @mwoodson1, @kaixi-wang and I each tested this out and we're seeing no red flags.)
Tests
- Supplied tests pass on the PR branch
- Supplied tests pass post-merge
=============================================================================================== test session starts ================================================================================================
platform darwin -- Python 3.11.9, pytest-7.3.1, pluggy-1.5.0
rootdir: /Users/dwiref/Voxel51/dev/oss/fiftyone
configfile: pytest.ini
plugins: asyncio-0.23.8, mock-3.10.0, dash-2.18.0, anyio-3.7.1, cov-4.0.0
asyncio: mode=Mode.STRICT
collected 33 items
unittests/evaluation_tests.py ................................. [100%]
================================================================================================= warnings summary =================================================================================================
tests/unittests/evaluation_tests.py::RegressionTests::test_custom_regression_evaluation
tests/unittests/evaluation_tests.py::RegressionTests::test_custom_regression_evaluation
/Users/dwiref/Voxel51/dev/oss/fiftyone/fiftyone/service/util.py:114: DeprecationWarning: connections() is deprecated and will be removed; use net_connections() instead
for conn in process.connections(kind="tcp"):
tests/unittests/evaluation_tests.py::RegressionTests::test_custom_regression_evaluation
/Users/dwiref/Voxel51/dev/oss/51o/lib/python3.11/site-packages/mongoengine/connection.py:173: DeprecationWarning: No uuidRepresentation is specified! Falling back to 'pythonLegacy' which is the default for pymongo 3.x. For compatibility with other MongoDB drivers this should be specified as 'standard' or '{java,csharp}Legacy' to work with older drivers in those languages. This will be changed to 'standard' in a future release.
warnings.warn(
tests/unittests/evaluation_tests.py::SegmentationTests::test_custom_segmentation_evaluation
tests/unittests/evaluation_tests.py::SegmentationTests::test_custom_segmentation_evaluation
/Users/dwiref/Voxel51/dev/oss/fiftyone/fiftyone/utils/eval/segmentation.py:377: UserWarning: Skipping sample with missing ground truth mask
warnings.warn(msg)
tests/unittests/evaluation_tests.py::SegmentationTests::test_custom_segmentation_evaluation
/Users/dwiref/Voxel51/dev/oss/fiftyone/fiftyone/utils/eval/segmentation.py:383: UserWarning: Skipping sample with missing prediction mask
warnings.warn(msg)
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
========================================================================================= 33 passed, 6 warnings in 11.71s ==========================================================================================
Observations
- Proposed changes work as advertised
- Docs notebook
./docs/source/tutorials/evaluate_detections.ipynb
runs (and faster).
Comments/Conclusion
This one's a go from me - thank you for your efforts!
What changes are proposed in this pull request?
The current evaluation methods generate IOUs for all pairs of detections, which is quadratic in both time and space. It's taking hours even though most detections don't overlap at all. This PR uses a spatial index (implemented in the
rtree
package) to only compute IOUs for detections that overlap. It takes the run time from hours to minutes on images with tens of thousands of detections.I ran a simple test with 10,000 detections. This change improved the run time from 945 seconds to 9.5 seconds.
It works for 2D and 3D bounding boxes, masked detections, and polygons. Keypoints and polylines are not changed at this time, except to update the return value of
compute_ious()
from an array to a dictionary, to match the others.This PR also fixes a bug in
utils3d._Box
, which was using the wrong rotation to compute the box's vertices. This also allows the cuboid IOU tests to pass.Note that providing an
eval_key
to save the results to the database is unreasonably slow, likely because of the overhead of saving the entire sample. That problem is unrelated to the detection matching issues solved here, but it should be addressed in another PR. For instance, in a test with 1,000 detections, the runtime witheval_key=None
goes from 9.3 seconds to 0.8 seconds. However, with aneval_key
, it is about 65 seconds in both cases.How is this patch tested? If it is not, please explain why.
evaluation_tests.py
confirms that the new results match the old ones. TheCuboidTests
class had to be updated because now the computed ious do not include results for non-overlapping detections.Release Notes
Is this a user-facing change that should be mentioned in the release notes?
notes for FiftyOne users.
It does speed up evaluations, so users with larger numbers of objects in their samples may be interested.
What areas of FiftyOne does this PR affect?
fiftyone
Python library changesSummary by CodeRabbit
New Features
Bug Fixes
Tests
Chores