Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updated evaluate_detections with bulk get and set values #5418

Open
wants to merge 9 commits into
base: develop
Choose a base branch
from

Conversation

minhtuev
Copy link
Contributor

@minhtuev minhtuev commented Jan 22, 2025

What changes are proposed in this pull request?

Updated evaluate_detections with bulk get and set values

  • Coco-style evaluation (✅ )
  • ActivityNet evaluation (✅ )

How is this patch tested? If it is not, please explain why.

  • Updated unit test for bulk update (✅ )
  • Ran intensive tests locally (✅ ) and verified that the evaluation time is reduced from ~ 6s to 1s

Release Notes

Is this a user-facing change that should be mentioned in the release notes?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release
    notes for FiftyOne users.

(Details in 1-2 sentences. You can just refer to another PR with a description
if this PR is part of a larger change.)

What areas of FiftyOne does this PR affect?

  • App: FiftyOne application changes
  • Build: Build and test infrastructure changes
  • Core: Core fiftyone Python library changes
  • Documentation: FiftyOne documentation changes
  • Other

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for bulk evaluation in detection tasks.
    • Introduced new internal evaluation methods for ActivityNet, COCO, and OpenImages evaluations.
  • Improvements

    • Refactored evaluation logic to improve code modularity and readability.
    • Enhanced flexibility in handling different evaluation scenarios.
  • Testing

    • Added new unit tests for detection evaluations, including ActivityNet dataset scenarios.

The changes focus on improving the internal evaluation framework with more robust and flexible processing methods.

Copy link
Contributor

coderabbitai bot commented Jan 22, 2025

Walkthrough

The pull request introduces refactoring across several evaluation utility files in the FiftyOne library. The changes primarily focus on creating new private _evaluate methods in different evaluation classes like ActivityNet, COCO, and OpenImages. These methods encapsulate evaluation logic, improving code organization and readability. Additionally, the detection evaluation module receives significant updates, including new functions for bulk and standard evaluation processing. The modifications aim to streamline the evaluation workflow while maintaining existing functionality.

Changes

File Change Summary
fiftyone/utils/eval/activitynet.py Added private _evaluate method in ActivityNetEvaluation class to centralize evaluation logic.
fiftyone/utils/eval/coco.py Added private _evaluate method in COCOEvaluation class to streamline evaluation process.
fiftyone/utils/eval/detection.py Added _evaluate_detections_bulk function and updated evaluate_detections with new bulk parameter.
fiftyone/utils/eval/openimages.py Added placeholder _evaluate method that raises NotImplementedError.
tests/unittests/evaluation_tests.py Added new test methods for ActivityNet detection evaluation and updated existing evaluation test methods.

Possibly related PRs

Suggested labels

enhancement, bug

Suggested reviewers

  • swheaton
  • mwoodson1
  • brimoor

Poem

🐰 Evaluation's dance, a rabbit's delight,
Code refactored with algorithmic might.
Private methods hop, logic clean and bright,
Bulk or standard, metrics shine just right.
FiftyOne's framework leaps to new height! 🚀

✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@minhtuev minhtuev requested a review from swheaton January 22, 2025 21:10
@minhtuev minhtuev marked this pull request as ready for review January 22, 2025 21:10
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (6)
fiftyone/utils/eval/detection.py (3)

43-46: Simplify if-else block using a ternary operator

You can replace the if-else block with a ternary operator for conciseness.

Apply this diff to simplify the code:

-if processing_frames:
-    docs = sample.frames.values()
-else:
-    docs = [sample]
+docs = sample.frames.values() if processing_frames else [sample]
🧰 Tools
🪛 Ruff (0.8.2)

43-46: Use ternary operator docs = sample.frames.values() if processing_frames else [sample] instead of if-else-block

Replace if-else-block with docs = sample.frames.values() if processing_frames else [sample]

(SIM108)


77-77: Replace print statements with logging

Using print statements is not recommended for production code. Please replace them with appropriate logging calls to maintain consistency and control over logging levels.

Apply this diff:

-    print("Retrieving values from collection")
+    logger.info("Retrieving values from collection")
-    print("Finished retrieving values from collection")
+    logger.info("Finished retrieving values from collection")

Also applies to: 89-89


339-342: Simplify if-else block using a ternary operator

You can replace the if-else block with a ternary operator for conciseness.

Apply this diff to simplify the code:

-if processing_frames:
-    docs = sample.frames.values()
-else:
-    docs = [sample]
+docs = sample.frames.values() if processing_frames else [sample]
🧰 Tools
🪛 Ruff (0.8.2)

339-342: Use ternary operator docs = sample.frames.values() if processing_frames else [sample] instead of if-else-block

Replace if-else-block with docs = sample.frames.values() if processing_frames else [sample]

(SIM108)

tests/unittests/evaluation_tests.py (1)

9-9: Remove unused import pytest

The pytest module is imported but not used in the file. You can remove it to clean up the code.

Apply this diff:

-import pytest
🧰 Tools
🪛 Ruff (0.8.2)

9-9: pytest imported but unused

Remove unused import: pytest

(F401)

fiftyone/utils/eval/openimages.py (1)

158-159: Add docstring and type hints to improve code maintainability.

Consider adding a docstring and type hints to document the purpose of this placeholder method and improve code maintainability.

Apply this diff to add docstring and type hints:

-    def _evaluate(self, gts, preds, eval_key=None):
-        raise NotImplementedError("Use `evaluate` method instead")
+    def _evaluate(
+        self,
+        gts: "fiftyone.core.labels.Label",
+        preds: "fiftyone.core.labels.Label",
+        eval_key: str | None = None,
+    ) -> None:
+        """Private evaluation method placeholder.
+
+        This method is not implemented for OpenImages evaluation. Use the public
+        `evaluate` method instead.
+
+        Args:
+            gts: ground truth labels
+            preds: predicted labels
+            eval_key: optional evaluation key
+
+        Raises:
+            NotImplementedError: This method is not implemented
+        """
+        raise NotImplementedError("Use `evaluate` method instead")
fiftyone/utils/eval/activitynet.py (1)

92-101: Add docstring and type hints to improve code maintainability.

Consider adding a docstring and type hints to document the purpose of this method and improve code maintainability.

Apply this diff to add docstring and type hints:

-    def _evaluate(self, gts, preds, eval_key=None):
+    def _evaluate(
+        self,
+        gts: "fiftyone.core.labels.Label",
+        preds: "fiftyone.core.labels.Label",
+        eval_key: str | None = None,
+    ) -> list[tuple]:
+        """Performs ActivityNet-style evaluation on the given labels.
+
+        Args:
+            gts: ground truth labels
+            preds: predicted labels
+            eval_key: optional evaluation key
+
+        Returns:
+            a list of matched (gt_label, pred_label, iou, pred_confidence,
+            gt_id, pred_id) tuples
+        """
         if eval_key is None:
             # Don't save results on user's data
             eval_key = "eval"
             gts = _copy_labels(gts)
             preds = _copy_labels(preds)

         return _activitynet_evaluation_single_iou(
             gts, preds, eval_key, self.config
         )
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 073693d and 6b88241.

📒 Files selected for processing (5)
  • fiftyone/utils/eval/activitynet.py (2 hunks)
  • fiftyone/utils/eval/coco.py (2 hunks)
  • fiftyone/utils/eval/detection.py (4 hunks)
  • fiftyone/utils/eval/openimages.py (1 hunks)
  • tests/unittests/evaluation_tests.py (8 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
fiftyone/utils/eval/detection.py

43-46: Use ternary operator docs = sample.frames.values() if processing_frames else [sample] instead of if-else-block

Replace if-else-block with docs = sample.frames.values() if processing_frames else [sample]

(SIM108)


339-342: Use ternary operator docs = sample.frames.values() if processing_frames else [sample] instead of if-else-block

Replace if-else-block with docs = sample.frames.values() if processing_frames else [sample]

(SIM108)

tests/unittests/evaluation_tests.py

9-9: pytest imported but unused

Remove unused import: pytest

(F401)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: test / test-app
  • GitHub Check: lint / eslint
  • GitHub Check: build / build
🔇 Additional comments (2)
fiftyone/utils/eval/coco.py (1)

135-143: Well-structured encapsulation of evaluation logic

The new _evaluate method effectively encapsulates the evaluation logic, improving code organization and readability.

fiftyone/utils/eval/activitynet.py (1)

126-126: LGTM! Clean refactoring of the evaluate method.

The evaluation logic has been cleanly encapsulated in the _evaluate method, making the code more maintainable.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
fiftyone/utils/eval/detection.py (2)

30-32: Add progress callback support for bulk operations.

The bulk evaluation function should support progress callbacks like the standard evaluation path to maintain consistency and provide feedback for large datasets.

 def _evaluate_detections_bulk(
-    _samples, gt_field, pred_field, processing_frames, eval_method, eval_key
+    _samples, gt_field, pred_field, processing_frames, eval_method, eval_key, progress=None
 ):

298-301: Consider using a ternary operator for conciseness.

The if-else block could be simplified using a ternary operator, though the current form is also clear and readable.

-            if processing_frames:
-                docs = sample.frames.values()
-            else:
-                docs = [sample]
+            docs = sample.frames.values() if processing_frames else [sample]
🧰 Tools
🪛 Ruff (0.8.2)

298-301: Use ternary operator docs = sample.frames.values() if processing_frames else [sample] instead of if-else-block

Replace if-else-block with docs = sample.frames.values() if processing_frames else [sample]

(SIM108)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6b88241 and 1ad2352.

📒 Files selected for processing (1)
  • fiftyone/utils/eval/detection.py (4 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
fiftyone/utils/eval/detection.py

298-301: Use ternary operator docs = sample.frames.values() if processing_frames else [sample] instead of if-else-block

Replace if-else-block with docs = sample.frames.values() if processing_frames else [sample]

(SIM108)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: test / test-app
  • GitHub Check: build / build
🔇 Additional comments (2)
fiftyone/utils/eval/detection.py (2)

106-106: LGTM! Clean parameter addition.

The new bulk parameter is well-documented and follows the function's existing parameter pattern.


286-286: Review commented out field addition code.

The commented out line samples._dataset.add_field(field_name, fof.IntField) might be important for field initialization. Please clarify if this line should be removed or uncommented.

fiftyone/utils/eval/detection.py Outdated Show resolved Hide resolved
fiftyone/utils/eval/detection.py Outdated Show resolved Hide resolved
fiftyone/utils/eval/detection.py Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
fiftyone/utils/eval/detection.py (3)

33-64: Enhance thread safety and error handling in the spinner decorator.

The spinner implementation could be improved in the following areas:

  1. Add error handling in the spinner thread
  2. Use threading.Event for thread control
  3. Ensure proper cleanup in case of exceptions
 def spinner_decorator(enabled=True):
     def decorator(func):
         def wrapper(*args, **kwargs):
             if not enabled:
                 return func(*args, **kwargs)
 
-            spinner = itertools.cycle(
-                ["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"]
-            )
-            stop_spinner = False
+            spinner_event = threading.Event()
+            spinner_chars = ["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"]
 
             def spin():
-                while not stop_spinner:
-                    sys.stdout.write(next(spinner))
-                    sys.stdout.flush()
-                    time.sleep(0.1)
-                    sys.stdout.write("\b")
+                try:
+                    while not spinner_event.is_set():
+                        for char in spinner_chars:
+                            if spinner_event.is_set():
+                                break
+                            sys.stdout.write(char)
+                            sys.stdout.flush()
+                            time.sleep(0.1)
+                            sys.stdout.write("\b")
+                except Exception as e:
+                    logger.error(f"Spinner thread error: {e}")
+                finally:
+                    sys.stdout.write(" \b")
+                    sys.stdout.flush()
 
             spinner_thread = threading.Thread(target=spin)
             spinner_thread.daemon = True
             spinner_thread.start()
 
-            result = func(*args, **kwargs)
-            stop_spinner = True
-            spinner_thread.join()
-
-            return result
+            try:
+                return func(*args, **kwargs)
+            finally:
+                spinner_event.set()
+                spinner_thread.join(timeout=1.0)

66-108: Consider batch processing for memory optimization.

The bulk evaluation function loads all data into memory at once, which could be problematic for large datasets. Consider implementing batch processing to manage memory usage more efficiently.

Would you like me to help implement a batched version of the bulk evaluation function that processes data in chunks?


325-328: Simplify code using ternary operator.

Replace the if-else block with a more concise ternary expression.

-            if processing_frames:
-                docs = sample.frames.values()
-            else:
-                docs = [sample]
+            docs = sample.frames.values() if processing_frames else [sample]
🧰 Tools
🪛 Ruff (0.8.2)

325-328: Use ternary operator docs = sample.frames.values() if processing_frames else [sample] instead of if-else-block

Replace if-else-block with docs = sample.frames.values() if processing_frames else [sample]

(SIM108)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1ad2352 and c4ba0a9.

📒 Files selected for processing (1)
  • fiftyone/utils/eval/detection.py (5 hunks)
🧰 Additional context used
📓 Learnings (1)
fiftyone/utils/eval/detection.py (1)
Learnt from: minhtuev
PR: voxel51/fiftyone#5418
File: fiftyone/utils/eval/detection.py:263-265
Timestamp: 2025-01-23T05:30:24.943Z
Learning: FiftyOne automatically creates fields if they don't exist when using set_values(), making explicit field existence validation unnecessary.
🪛 Ruff (0.8.2)
fiftyone/utils/eval/detection.py

325-328: Use ternary operator docs = sample.frames.values() if processing_frames else [sample] instead of if-else-block

Replace if-else-block with docs = sample.frames.values() if processing_frames else [sample]

(SIM108)

⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: test / test-app
  • GitHub Check: lint / eslint
  • GitHub Check: build / build
  • GitHub Check: build
🔇 Additional comments (1)
fiftyone/utils/eval/detection.py (1)

219-220: LGTM! Well-documented bulk parameter addition.

The bulk parameter is properly documented with clear explanation of its purpose and default value.

@swheaton
Copy link
Contributor

Added spinner animation

Sorry what's this about?

Also please post the results for larger/varying image datasets (no video or group yet right?) that you got.

@minhtuev
Copy link
Contributor Author

@swheaton : haha I got a little bit restless looking at the terminal without seeing anything since fetching from collections can take a while, so I added some animation :) we can remove it from the final PR.

I ran the evaluate_detections against the bdd-100K dataset; the result is 6 minutes vs. 36 minutes without bulk evaluation. I am seeing some issues with saving the results back into the collection and still debugging it.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
fiftyone/utils/eval/detection.py (3)

312-312: Remove commented code.

The commented line # samples._dataset.add_field(field_name, fof.IntField) appears to be unnecessary as per the retrieved learning that FiftyOne automatically creates fields.

-                    # samples._dataset.add_field(field_name, fof.IntField)

324-327: Simplify conditional assignment using ternary operator.

The if-else block can be simplified using a ternary operator.

-            if processing_frames:
-                docs = sample.frames.values()
-            else:
-                docs = [sample]
+            docs = sample.frames.values() if processing_frames else [sample]
🧰 Tools
🪛 Ruff (0.8.2)

324-327: Use ternary operator docs = sample.frames.values() if processing_frames else [sample] instead of if-else-block

Replace if-else-block with docs = sample.frames.values() if processing_frames else [sample]

(SIM108)


297-302: Consolidate progress parameter handling.

The progress parameter handling is redundant with the spinner decorator's enabled parameter.

-            if progress is None:
-                progress = fo.config.show_progress_bars
-
-            decorated_set_values = spinner_decorator(enabled=progress)(
+            decorated_set_values = spinner_decorator(enabled=progress if progress is not None else fo.config.show_progress_bars)(
                 samples.set_values
             )
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c4ba0a9 and 637e7fe.

📒 Files selected for processing (1)
  • fiftyone/utils/eval/detection.py (5 hunks)
🧰 Additional context used
📓 Learnings (1)
fiftyone/utils/eval/detection.py (1)
Learnt from: minhtuev
PR: voxel51/fiftyone#5418
File: fiftyone/utils/eval/detection.py:263-265
Timestamp: 2025-01-23T05:30:24.943Z
Learning: FiftyOne automatically creates fields if they don't exist when using set_values(), making explicit field existence validation unnecessary.
🪛 Ruff (0.8.2)
fiftyone/utils/eval/detection.py

324-327: Use ternary operator docs = sample.frames.values() if processing_frames else [sample] instead of if-else-block

Replace if-else-block with docs = sample.frames.values() if processing_frames else [sample]

(SIM108)

⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
  • GitHub Check: build / build
  • GitHub Check: lint / eslint
  • GitHub Check: e2e / test-e2e
  • GitHub Check: test / test-app
  • GitHub Check: build
🔇 Additional comments (1)
fiftyone/utils/eval/detection.py (1)

66-106: Consider memory optimization and error handling in bulk evaluation.

The function loads all samples into memory at once, which could be problematic for large datasets. Additionally, error handling could be improved.

Consider implementing batched processing:

 def _evaluate_detections_bulk(
     _samples, gt_field, pred_field, eval_method, eval_key, progress=True
 ):
+    BATCH_SIZE = 1000  # Configurable batch size
     matches = []
     id_field = "id"
+    total_samples = _samples.count()
+    sample_updates = {"sample_tp": {}, "sample_fp": {}, "sample_fn": {}}
 
-    logger.info(
-        f"Retrieving values for {id_field}, {gt_field}, {pred_field} from collection"
-    )
+    for offset in range(0, total_samples, BATCH_SIZE):
+        batch = _samples.skip(offset).limit(BATCH_SIZE)
+        logger.info(
+            f"Processing batch {offset//BATCH_SIZE + 1}/{(total_samples + BATCH_SIZE - 1)//BATCH_SIZE}"
+        )

Comment on lines +33 to +63
# Probably move this into the util library
def spinner_decorator(enabled=True):
def decorator(func):
def wrapper(*args, **kwargs):
if not enabled:
return func(*args, **kwargs)

spinner = itertools.cycle(
["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"]
)
stop_spinner = False

def spin():
while not stop_spinner:
sys.stdout.write(next(spinner))
sys.stdout.flush()
time.sleep(0.1)
sys.stdout.write("\b")

spinner_thread = threading.Thread(target=spin)
spinner_thread.start()

result = func(*args, **kwargs)
stop_spinner = True
spinner_thread.join()

return result

return wrapper

return decorator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider error handling and cleanup in the spinner decorator.

The spinner thread should be properly cleaned up even when the decorated function raises an exception.

 def wrapper(*args, **kwargs):
     if not enabled:
         return func(*args, **kwargs)

     spinner = itertools.cycle(
         ["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"]
     )
     stop_spinner = False

     def spin():
         while not stop_spinner:
             sys.stdout.write(next(spinner))
             sys.stdout.flush()
             time.sleep(0.1)
             sys.stdout.write("\b")

     spinner_thread = threading.Thread(target=spin)
     spinner_thread.start()

-    result = func(*args, **kwargs)
-    stop_spinner = True
-    spinner_thread.join()
-
-    return result
+    try:
+        result = func(*args, **kwargs)
+        return result
+    finally:
+        stop_spinner = True
+        spinner_thread.join()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Probably move this into the util library
def spinner_decorator(enabled=True):
def decorator(func):
def wrapper(*args, **kwargs):
if not enabled:
return func(*args, **kwargs)
spinner = itertools.cycle(
["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"]
)
stop_spinner = False
def spin():
while not stop_spinner:
sys.stdout.write(next(spinner))
sys.stdout.flush()
time.sleep(0.1)
sys.stdout.write("\b")
spinner_thread = threading.Thread(target=spin)
spinner_thread.start()
result = func(*args, **kwargs)
stop_spinner = True
spinner_thread.join()
return result
return wrapper
return decorator
# Probably move this into the util library
def spinner_decorator(enabled=True):
def decorator(func):
def wrapper(*args, **kwargs):
if not enabled:
return func(*args, **kwargs)
spinner = itertools.cycle(
["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"]
)
stop_spinner = False
def spin():
while not stop_spinner:
sys.stdout.write(next(spinner))
sys.stdout.flush()
time.sleep(0.1)
sys.stdout.write("\b")
spinner_thread = threading.Thread(target=spin)
spinner_thread.start()
try:
result = func(*args, **kwargs)
return result
finally:
stop_spinner = True
spinner_thread.join()
return wrapper
return decorator

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
fiftyone/utils/eval/detection.py (2)

66-114: Consider implementing batched processing for large datasets.

Loading the entire collection at once could lead to memory issues with large datasets. The comment on line 84 correctly suggests implementing batched processing.

I can help implement a batched version of this function that processes the data in chunks. Would you like me to open an issue to track this enhancement?


333-336: Simplify conditional assignment using ternary operator.

The code can be more concise by using a ternary operator.

Apply this diff to simplify the code:

-            if processing_frames:
-                docs = sample.frames.values()
-            else:
-                docs = [sample]
+            docs = sample.frames.values() if processing_frames else [sample]
🧰 Tools
🪛 Ruff (0.8.2)

333-336: Use ternary operator docs = sample.frames.values() if processing_frames else [sample] instead of if-else-block

Replace if-else-block with docs = sample.frames.values() if processing_frames else [sample]

(SIM108)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 637e7fe and 8d58b84.

📒 Files selected for processing (1)
  • fiftyone/utils/eval/detection.py (5 hunks)
🧰 Additional context used
📓 Learnings (1)
fiftyone/utils/eval/detection.py (1)
Learnt from: minhtuev
PR: voxel51/fiftyone#5418
File: fiftyone/utils/eval/detection.py:263-265
Timestamp: 2025-01-23T05:30:24.943Z
Learning: FiftyOne automatically creates fields if they don't exist when using set_values(), making explicit field existence validation unnecessary.
🪛 Ruff (0.8.2)
fiftyone/utils/eval/detection.py

333-336: Use ternary operator docs = sample.frames.values() if processing_frames else [sample] instead of if-else-block

Replace if-else-block with docs = sample.frames.values() if processing_frames else [sample]

(SIM108)

⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.11)
  • GitHub Check: build / build
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
  • GitHub Check: e2e / test-e2e
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.9)
  • GitHub Check: build
🔇 Additional comments (1)
fiftyone/utils/eval/detection.py (1)

33-63: ⚠️ Potential issue

Add proper cleanup for spinner thread.

The spinner thread is not properly cleaned up when the decorated function raises an exception. This could lead to zombie threads and incorrect console output.

Apply this diff to ensure proper cleanup:

 def wrapper(*args, **kwargs):
     if not enabled:
         return func(*args, **kwargs)

     spinner = itertools.cycle(
         ["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"]
     )
     stop_spinner = False

     def spin():
         while not stop_spinner:
             sys.stdout.write(next(spinner))
             sys.stdout.flush()
             time.sleep(0.1)
             sys.stdout.write("\b")

     spinner_thread = threading.Thread(target=spin)
     spinner_thread.start()

-    result = func(*args, **kwargs)
-    stop_spinner = True
-    spinner_thread.join()
-
-    return result
+    try:
+        return func(*args, **kwargs)
+    finally:
+        stop_spinner = True
+        spinner_thread.join()

Likely invalid or redundant comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants