
Add support for custom evaluation metrics #5279

Merged: 16 commits merged into release/v1.3.0 on Jan 23, 2025
Conversation

@brimoor (Contributor) commented on Dec 16, 2024

See voxel51/fiftyone-plugins#190

Summary by CodeRabbit

  • New Features

    • Added support for custom metrics across various evaluation methods in FiftyOne.
    • Enhanced evaluation configurations to include custom metric specifications.
    • Introduced new base classes for more flexible evaluation methods.
    • Integrated custom metrics into the evaluation panel for better display of results.
  • Improvements

    • Updated evaluation result classes to handle custom metrics.
    • Expanded evaluation framework to support more dynamic metric computation.
    • Added utility methods for managing custom metrics in different evaluation contexts.
  • Technical Enhancements

    • Refactored evaluation method classes to improve modularity.
    • Added type safety and error handling for custom metric processing.
    • Improved logging and reporting of evaluation results.

@brimoor brimoor added the feature (Work on a feature request) label on Dec 16, 2024
@coderabbitai (bot) commented on Dec 16, 2024

Walkthrough

This pull request introduces a comprehensive enhancement to the evaluation metrics framework across multiple files in the FiftyOne project. The changes primarily focus on adding support for custom metrics in various evaluation contexts, including classification, detection, segmentation, and regression. A new base infrastructure is established to handle custom metrics, with modifications to configuration, evaluation, and results classes to accommodate this functionality. The changes provide more flexibility for users to define and compute their own metrics during model evaluation.
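For orientation, here is a minimal sketch of what defining such a metric might look like, based on the EvaluationMetric/EvaluationMetricConfig names and the compute(samples, results, **kwargs) signature that appear in the diffs below. The metric name, the config-as-property pattern, and the returned value are illustrative assumptions, and plugin registration is not shown.

import fiftyone.operators as foo


class MyAbsoluteErrorMetric(foo.EvaluationMetric):
    """Hypothetical custom metric; illustrates the new interface only."""

    @property
    def config(self):
        # Fields mirror the EvaluationMetricConfig parameters added in this PR
        return foo.EvaluationMetricConfig(
            name="my_absolute_error",   # assumed metric name
            label="My absolute error",
            eval_types=["regression"],  # assumed eval type string
            lower_is_better=True,
        )

    def compute(self, samples, results, **kwargs):
        # Compute and return a scalar to surface alongside the standard metrics
        ...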

Changes

File | Change Summary
app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx | Added formatCustomMetricRows function and new type definitions for CustomMetric, CustomMetrics, and SummaryRow
fiftyone/operators/__init__.py | Added imports for EvaluationMetricConfig and EvaluationMetric
fiftyone/operators/evaluation_metric.py | Introduced new EvaluationMetricConfig and EvaluationMetric classes with comprehensive metric-handling methods
fiftyone/utils/eval/* | Added a custom_metrics parameter to multiple evaluation classes and methods across classification, detection, regression, segmentation, and other evaluation utilities
plugins/panels/model_evaluation/__init__.py | Added get_custom_metrics method to EvaluationPanel
tests/unittests/operators/registry_tests.py | Removed test_list_invalid_type_raises_error method and made formatting adjustments

Sequence Diagram

sequenceDiagram
    participant User
    participant EvaluationConfig
    participant EvaluationMethod
    participant Results
    
    User->>EvaluationConfig: Specify custom metrics
    EvaluationConfig->>EvaluationMethod: Pass custom metrics
    EvaluationMethod->>EvaluationMethod: Compute standard metrics
    EvaluationMethod->>EvaluationMethod: Compute custom metrics
    EvaluationMethod->>Results: Generate results with custom metrics
    Results-->>User: Return evaluation results
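In SDK terms, the flow above corresponds roughly to the following usage sketch. The dataset name, field names, and metric URI are hypothetical; the custom_metrics parameter and the inclusion of custom metrics in results.print_metrics() come from this PR.

import fiftyone as fo

dataset = fo.load_dataset("my-dataset")  # assumed existing dataset with predictions

results = dataset.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval",
    custom_metrics=["@my-plugin/my-metric"],  # hypothetical metric operator URI
)

# Custom metric values are reported alongside the standard metrics
results.print_metrics()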


Suggested labels

enhancement

Suggested reviewers

  • imanjra
  • ritch
  • minhtuev

Poem

🐰 Metrics dance, custom and bright,
Evaluation's new flexible might!
Code leaps with metrics galore,
Revealing insights never seen before.
Hop, hop, hooray for metric delight! 🎉



@brimoor brimoor changed the title Add support for custom evaluation metrics [WIP] Add support for custom evaluation metrics Dec 18, 2024
Base automatically changed from release/v1.2.0 to main December 20, 2024 22:41
@brimoor brimoor changed the base branch from main to develop December 21, 2024 14:41
@brimoor brimoor changed the title [WIP] Add support for custom evaluation metrics Add support for custom evaluation metrics Dec 21, 2024
@brimoor brimoor force-pushed the custom-metrics branch 2 times, most recently from ea26a26 to 4fa72a0 on December 21, 2024 17:21
@brimoor brimoor changed the base branch from develop to release/v1.3.0 January 21, 2025 03:17
@brimoor brimoor requested a review from ritch January 21, 2025 05:10
@brimoor brimoor marked this pull request as ready for review January 21, 2025 05:10
@coderabbitai (bot) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (9)
fiftyone/operators/evaluation_metric.py (1)

43-43: Fix the typo in the docstring.

There's a typo in the word "evaluation" in the docstring of the get_parameters method.

Apply this diff to correct the typo:

-        """Defines any necessary properties to collect this evaluaiton metric's
+        """Defines any necessary properties to collect this evaluation metric's
fiftyone/utils/eval/regression.py (1)

587-587: Simplify the usage of dict.get.

The default value None in kwargs.get("custom_metrics", None) is unnecessary, as None is the default return value of dict.get when the key is missing. You can simplify the code by removing it.

Apply this diff:

     def _parse_config(pred_field, gt_field, method, **kwargs):
         if method is None:
             method = fo.evaluation_config.default_regression_backend

-        custom_metrics = kwargs.get("custom_metrics", None)
+        custom_metrics = kwargs.get("custom_metrics")
         if etau.is_str(custom_metrics):
             kwargs["custom_metrics"] = [custom_metrics]

fiftyone/utils/eval/base.py (2)

66-68: Access lower_is_better directly from the config.

Instead of accessing lower_is_better through operator.config.kwargs.get(...), you should access it directly as an attribute of operator.config since it's already defined in the EvaluationMetricConfig.

Apply this diff:

                         results.custom_metrics[operator.uri] = {
                             "value": value,
                             "label": operator.config.label,
-                            "lower_is_better": operator.config.kwargs.get(
-                                "lower_is_better", True
-                            ),
+                            "lower_is_better": operator.config.lower_is_better,
                         }

80-80: Simplify iteration over dictionary keys.

When iterating over a dictionary, you can iterate directly without calling .keys(), as this is the default behavior in Python. This makes the code more concise and Pythonic.

Apply these diffs:

At line 80:

         def get_custom_metric_fields(self, samples, eval_key):
             fields = []

-            for metric in self._get_custom_metrics().keys():
+            for metric in self._get_custom_metrics():

At line 96:

         def rename_custom_metrics(self, samples, eval_key, new_eval_key):
-            for metric in self._get_custom_metrics().keys():
+            for metric in self._get_custom_metrics():

At line 108:

         def cleanup_custom_metrics(self, samples, eval_key):
-            for metric in self._get_custom_metrics().keys():
+            for metric in self._get_custom_metrics():

Also applies to: 96-96, 108-108


fiftyone/utils/eval/detection.py (1)

641-641: Simplify usage of kwargs.get()

You can remove the default value None since kwargs.get("custom_metrics") returns None by default.

Apply this diff to simplify the code:

-    custom_metrics = kwargs.get("custom_metrics", None)
+    custom_metrics = kwargs.get("custom_metrics")

fiftyone/utils/eval/classification.py (1)

926-926: Simplify usage of kwargs.get()

You can remove the default value None since kwargs.get("custom_metrics") returns None by default.

Apply this diff to simplify the code:

-    custom_metrics = kwargs.get("custom_metrics", None)
+    custom_metrics = kwargs.get("custom_metrics")

app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (1)

1-1: Use optional chaining instead of lodash get to simplify code

Using optional chaining and nullish coalescing operators can replace _.get, reducing dependencies and simplifying the code.

Apply this diff to refactor the code:

- import _ from "lodash";
+ // lodash import is no longer needed

...

- const customMetrics = _.get(
-   evaluationMetrics,
-   "custom_metrics",
-   {}
- ) as CustomMetrics;
+ const customMetrics = (evaluationMetrics.custom_metrics ?? {}) as CustomMetrics;

...

-     const compareValue = _.get(
-       comparisonMetrics,
-       `custom_metrics.${operatorUri}.value`,
-       null
-     );
+     const compareValue = comparisonMetrics?.custom_metrics?.[operatorUri]?.value ?? null;

Also applies to: 1785-1795

fiftyone/operators/__init__.py (1)

15-15: Add imported classes to __all__ to resolve unused import warnings.

The imported classes EvaluationMetricConfig and EvaluationMetric should be added to __all__ since they are intended to be part of the public API.

-from .evaluation_metric import EvaluationMetricConfig, EvaluationMetric
+from .evaluation_metric import EvaluationMetricConfig, EvaluationMetric
+
+__all__ = [k for k, v in globals().items() if not k.startswith("_")] + [
+    "EvaluationMetricConfig",
+    "EvaluationMetric",
+]

fiftyone/utils/eval/segmentation.py (1)

564-567: Optimize the kwargs.get() call.

The None default value in kwargs.get("custom_metrics", None) is redundant as it's the default return value of dict.get().

-    custom_metrics = kwargs.get("custom_metrics", None)
+    custom_metrics = kwargs.get("custom_metrics")

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d228b88 and e98cea0.

📒 Files selected for processing (13)
  • app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (3 hunks)
  • fiftyone/operators/__init__.py (1 hunks)
  • fiftyone/operators/evaluation_metric.py (1 hunks)
  • fiftyone/operators/registry.py (4 hunks)
  • fiftyone/utils/eval/activitynet.py (5 hunks)
  • fiftyone/utils/eval/base.py (6 hunks)
  • fiftyone/utils/eval/classification.py (20 hunks)
  • fiftyone/utils/eval/coco.py (5 hunks)
  • fiftyone/utils/eval/detection.py (16 hunks)
  • fiftyone/utils/eval/openimages.py (5 hunks)
  • fiftyone/utils/eval/regression.py (18 hunks)
  • fiftyone/utils/eval/segmentation.py (17 hunks)
  • plugins/panels/model_evaluation/__init__.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

🪛 Ruff (0.8.2)
fiftyone/operators/__init__.py

15-15: .evaluation_metric.EvaluationMetricConfig imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


15-15: .evaluation_metric.EvaluationMetric imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

fiftyone/utils/eval/detection.py

641-641: Use kwargs.get("custom_metrics") instead of kwargs.get("custom_metrics", None)

Replace kwargs.get("custom_metrics", None) with kwargs.get("custom_metrics")

(SIM910)

fiftyone/utils/eval/segmentation.py

564-564: Use kwargs.get("custom_metrics") instead of kwargs.get("custom_metrics", None)

Replace kwargs.get("custom_metrics", None) with kwargs.get("custom_metrics")

(SIM910)

fiftyone/utils/eval/base.py

80-80: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


96-96: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


108-108: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)

fiftyone/utils/eval/classification.py

926-926: Use kwargs.get("custom_metrics") instead of kwargs.get("custom_metrics", None)

Replace kwargs.get("custom_metrics", None) with kwargs.get("custom_metrics")

(SIM910)

fiftyone/utils/eval/regression.py

587-587: Use kwargs.get("custom_metrics") instead of kwargs.get("custom_metrics", None)

Replace kwargs.get("custom_metrics", None) with kwargs.get("custom_metrics")

(SIM910)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: build
🔇 Additional comments (11)
fiftyone/operators/registry.py (1)

9-10: LGTM! The changes enhance operator filtering capabilities.

The implementation correctly handles both string and class type parameters, with proper type conversion and error handling.

Also applies to: 46-47, 91-92, 111-114

plugins/panels/model_evaluation/__init__.py (1)

149-154: LGTM! The custom metrics integration is well-implemented.

The implementation includes proper error handling and seamlessly integrates with the existing evaluation data structure.

Also applies to: 373-373

fiftyone/utils/eval/activitynet.py (1)

43-44: LGTM! Custom metrics support is properly integrated.

The implementation includes clear documentation and correctly propagates custom metrics to the parent class.

Also applies to: 55-55, 59-64

fiftyone/utils/eval/segmentation.py (4)

41-42: LGTM!

The custom_metrics parameter is well-placed and properly documented.


145-146: LGTM!

The documentation for custom_metrics parameter is clear and follows the established format.


149-161: LGTM!

The custom_metrics parameter is properly integrated into the constructor and correctly initialized through the superclass.


514-520: LGTM!

The addition of custom_metrics to the attributes list is appropriate and maintains consistency.

fiftyone/utils/eval/coco.py (2)

70-71: LGTM!

The documentation for custom_metrics parameter maintains consistency with other evaluation configs.


92-97: LGTM!

The custom_metrics parameter is correctly passed to the superclass constructor.

fiftyone/utils/eval/openimages.py (2)

72-73: LGTM!

The documentation for custom_metrics parameter maintains consistency with other evaluation configs.


97-102: LGTM!

The custom_metrics parameter is correctly passed to the superclass constructor.

@coderabbitai (bot) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
fiftyone/operators/evaluation_metric.py (3)

25-37: Add type hints to improve code maintainability.

Consider adding type hints to the configuration parameters for better code maintainability and IDE support.

     def __init__(
         self,
-        name,
-        label=None,
-        description=None,
-        eval_types=None,
-        lower_is_better=True,
+        name: str,
+        label: str | None = None,
+        description: str | None = None,
+        eval_types: list[str] | None = None,
+        lower_is_better: bool = True,
         **kwargs,
     ):

43-43: Fix typo in docstring.

There's a typo in "evaluaiton" in the docstring.

-        """Defines any necessary properties to collect this evaluaiton metric's
+        """Defines any necessary properties to collect this evaluation metric's

62-73: Consider using @abstractmethod for required implementation.

Since compute must be implemented by subclasses, consider using the @abstractmethod decorator to enforce this at the type level.

+    from abc import abstractmethod
+
+    @abstractmethod
     def compute(self, samples, results, **kwargs):
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e98cea0 and c656853.

📒 Files selected for processing (8)
  • app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (3 hunks)
  • fiftyone/operators/__init__.py (1 hunks)
  • fiftyone/operators/evaluation_metric.py (1 hunks)
  • fiftyone/operators/registry.py (4 hunks)
  • fiftyone/utils/eval/base.py (6 hunks)
  • fiftyone/utils/eval/regression.py (18 hunks)
  • plugins/panels/model_evaluation/__init__.py (2 hunks)
  • tests/unittests/operators/registry_tests.py (2 hunks)
👮 Files not reviewed due to content moderation or server errors (4)
  • fiftyone/utils/eval/regression.py
  • plugins/panels/model_evaluation/__init__.py
  • fiftyone/utils/eval/base.py
  • app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx
🧰 Additional context used
📓 Path-based instructions (1)
app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

🪛 Ruff (0.8.2)
fiftyone/utils/eval/regression.py

587-587: Use kwargs.get("custom_metrics") instead of kwargs.get("custom_metrics", None)

Replace kwargs.get("custom_metrics", None) with kwargs.get("custom_metrics")

(SIM910)

fiftyone/utils/eval/base.py

80-80: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


96-96: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


108-108: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)

fiftyone/operators/__init__.py

15-15: .evaluation_metric.EvaluationMetricConfig imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


15-15: .evaluation_metric.EvaluationMetric imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
  • GitHub Check: test / test-app
  • GitHub Check: build / build
  • GitHub Check: e2e / test-e2e
  • GitHub Check: build
🔇 Additional comments (5)
tests/unittests/operators/registry_tests.py (2)

12-23: LGTM! Improved readability with better indentation.

The restructured mock setup is now more readable and maintainable.


55-56: LGTM! Consistent string quotes.

The change to double quotes maintains consistency with the project's style.

fiftyone/operators/evaluation_metric.py (1)

89-109: LGTM! Robust field renaming implementation.

The rename method handles both sample and frame fields correctly, with proper error handling.

fiftyone/operators/registry.py (2)

111-114: LGTM! Robust type handling implementation.

The implementation properly handles both string-based type names and actual class types, making the API more flexible.


46-47: LGTM! Improved documentation clarity.

The docstring updates clearly explain the enhanced type filtering capabilities.

Also applies to: 91-92

fiftyone/operators/__init__.py (review thread resolved)
@@ -146,6 +146,12 @@ def get_mar(self, results):
        except Exception as e:
            return None

    def get_custom_metrics(self, results):
        try:
            return results.custom_metrics
@manushreegangwar (Contributor) commented on Jan 22, 2025

We are still missing metric_property that @ritch suggested: #5389 (comment)

results.custom_metrics should return a list of CustomMetric objects. Working on adding those on top of these changes.

@brimoor (Contributor, Author) commented on Jan 22, 2025

I don't see value in further modifications to this PR. results.custom_metrics is currently JSON in roughly the same shape as what @ritch suggested in #5389 (comment), so it achieves the same thing, just without the class interface. That seems fine to me, as results.custom_metrics is not intended for direct usage by SDK users. It's just an internal data structure to support the Model Evaluation panel.

Note that I refactored so that custom metrics are included in results.metrics() and results.print_metrics(). So I would argue those are the only things that SDK users should need to care about.

Also note that introducing a class might be complex as results need to be serializable.
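For reference, a sketch of that internal structure, as suggested by the diffs in this PR (the URI and values are illustrative only):

# results.custom_metrics maps operator URIs to per-metric entries, roughly:
{
    "@my-plugin/my-metric": {   # hypothetical operator URI
        "value": 0.42,          # scalar returned by the metric's compute()
        "label": "My metric",
        "lower_is_better": True,
    }
}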

Contributor commented:

Deferring to @ritch for the app side changes.

Looks good to me on the SDK side. I am okay with merging this without the modifications proposed for the app.

@ritch (Contributor) commented on Jan 22, 2025

> I don't see value in further modifications to this PR. results.custom_metrics is currently JSON in roughly the same shape as what @ritch suggested in #5389 (comment), so it achieves the same thing, just without the class interface.

If devs end up using this feature, we can add the ability to define a fiftyone.operator.types.Property and use that to render the value. -1 for an ad hoc way to define labels, tooltips, type info, min/max values, max number of digits to display, etc. That should all be defined via an operator property. That type system already supports a lot of this and is where we should continue to implement this sort of thing (metadata for the UI).

@manushreegangwar (Contributor) left a comment

LGTM.

Ran evaluate_detections (app + SDK) and evaluate_regression (SDK) with and without custom metrics.

fiftyone/operators/evaluation_metric.py (review thread outdated, resolved)
ritch previously approved these changes Jan 22, 2025
@brimoor brimoor dismissed stale reviews from ritch and manushreegangwar via cab21ff January 22, 2025 18:09
@coderabbitai (bot) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
fiftyone/utils/eval/base.py (3)

54-81: Consider moving the initialization of custom_metrics outside the loop.

While the implementation is solid, initializing results.custom_metrics inside the loop for each successful computation is inefficient. Consider initializing it once before the loop if any custom metrics are configured.

 def compute_custom_metrics(self, samples, eval_key, results):
+    if self._get_custom_metrics():
+        results.custom_metrics = {}
     for metric, kwargs in self._get_custom_metrics().items():
         try:
             operator = foo.get_operator(metric)
             value = operator.compute(samples, results, **kwargs or {})
             if value is not None:
-                if results.custom_metrics is None:
-                    results.custom_metrics = {}
                 key = operator.config.aggregate_key

85-85: Optimize dictionary key lookup operations.

Replace .keys() with direct dictionary membership testing for better performance.

-for metric in self._get_custom_metrics().keys():
+for metric in self._get_custom_metrics():

Also applies to: 101-101, 113-113



179-180: Rename unused loop variable for better code clarity.

The loop variable uri is not used within the loop body.

-            for uri, metric in self.custom_metrics.items():
+            for _, metric in self.custom_metrics.items():

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cab21ff and e05cc17.

📒 Files selected for processing (3)
  • app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (3 hunks)
  • fiftyone/operators/evaluation_metric.py (1 hunks)
  • fiftyone/utils/eval/base.py (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • fiftyone/operators/evaluation_metric.py
🧰 Additional context used
📓 Path-based instructions (1)
app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

🪛 Ruff (0.8.2)
fiftyone/utils/eval/base.py

85-85: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


101-101: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


113-113: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


179-179: Loop control variable uri not used within loop body

Rename unused uri to _uri

(B007)

🔇 Additional comments (5)
fiftyone/utils/eval/base.py (3)

45-53: LGTM! Clean implementation of custom metrics retrieval.

The method handles both list and dict formats elegantly, with safe access to config attributes.
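For example, both of the following forms would be accepted (the URIs and kwargs are hypothetical):

custom_metrics = ["@my-plugin/my-metric"]  # list form: default kwargs per metric

custom_metrics = {
    "@my-plugin/my-metric": {"threshold": 0.5},  # dict form: per-metric kwargs
}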


82-99: LGTM! Well-structured field collection with proper error handling.

The method efficiently collects fields from all operators while gracefully handling any errors that occur during field retrieval.



100-111: LGTM! Consistent error handling in rename operation.

The implementation follows the established pattern of safe operator access and error handling.


app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (2)

1762-1782: LGTM! Well-structured TypeScript type definitions.

The types are clearly defined with appropriate properties and good use of TypeScript features.


1784-1811: LGTM! Robust implementation of custom metrics formatting.

The function safely handles null values, uses proper type annotations, and follows best practices for object property access using lodash.

@brimoor brimoor merged commit e7f6d33 into release/v1.3.0 Jan 23, 2025
12 of 14 checks passed
@brimoor brimoor deleted the custom-metrics branch January 23, 2025 17:21