Add support for custom evaluation metrics #5279
Conversation
Walkthrough

This pull request introduces a comprehensive enhancement to the evaluation metrics framework across multiple files in the FiftyOne project. The changes primarily focus on adding support for custom metrics in various evaluation contexts, including classification, detection, segmentation, and regression. A new base infrastructure is established to handle custom metrics, with modifications to configuration, evaluation, and results classes to accommodate this functionality. The changes provide more flexibility for users to define and compute their own metrics during model evaluation.
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant EvaluationConfig
    participant EvaluationMethod
    participant Results
    User->>EvaluationConfig: Specify custom metrics
    EvaluationConfig->>EvaluationMethod: Pass custom metrics
    EvaluationMethod->>EvaluationMethod: Compute standard metrics
    EvaluationMethod->>EvaluationMethod: Compute custom metrics
    EvaluationMethod->>Results: Generate results with custom metrics
    Results-->>User: Return evaluation results
```
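The flow above can be sketched with a minimal stand-in for the interface discussed in this PR. The class and field names (`EvaluationMetric`, `compute()`, `value`/`label`/`lower_is_better`) mirror the review comments below, but the simplified base classes here are an assumption for illustration, not FiftyOne's actual implementation:

```python
class EvaluationMetricConfig:
    """Minimal stand-in for the config carried by a custom metric operator."""

    def __init__(self, name, label=None, lower_is_better=True, **kwargs):
        self.name = name
        self.label = label or name
        self.lower_is_better = lower_is_better
        self.kwargs = kwargs


class EvaluationMetric:
    """Minimal stand-in: subclasses implement compute()."""

    def __init__(self, config):
        self.config = config

    def compute(self, samples, results, **kwargs):
        raise NotImplementedError("subclass must implement compute()")


class MaxErrorMetric(EvaluationMetric):
    """Example custom metric: the largest absolute error over the samples."""

    def compute(self, samples, results, **kwargs):
        return max(abs(p - g) for p, g in samples)


# Standard metrics are computed first; the custom metric's output is then
# attached to the results in the dict shape described later in this thread
samples = [(0.9, 1.0), (0.4, 0.0), (0.5, 0.5)]  # (prediction, ground truth)
metric = MaxErrorMetric(EvaluationMetricConfig("max_error", label="Max error"))

custom_metrics = {
    "max_error": {
        "value": metric.compute(samples, results=None),
        "label": metric.config.label,
        "lower_is_better": metric.config.lower_is_better,
    }
}
print(custom_metrics["max_error"]["value"])  # 0.4
```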
Force-pushed: c4883ea → 34030d4, 34030d4 → 6513247, ea26a26 → 4fa72a0, 8a3a97b → 9047d07
Actionable comments posted: 0
🧹 Nitpick comments (9)
fiftyone/operators/evaluation_metric.py (1)

43-43: Fix the typo in the docstring.

There's a typo in the word "evaluation" in the docstring of the `get_parameters` method. Apply this diff to correct the typo:

```diff
-    """Defines any necessary properties to collect this evaluaiton metric's
+    """Defines any necessary properties to collect this evaluation metric's
```

fiftyone/utils/eval/regression.py (1)

587-587: Simplify the usage of `dict.get`.

The default value `None` in `kwargs.get("custom_metrics", None)` is unnecessary, as `None` is the default return value of `dict.get` when the key is missing. You can simplify the code by removing it. Apply this diff:

```diff
 def _parse_config(pred_field, gt_field, method, **kwargs):
     if method is None:
         method = fo.evaluation_config.default_regression_backend

-    custom_metrics = kwargs.get("custom_metrics", None)
+    custom_metrics = kwargs.get("custom_metrics")
     if etau.is_str(custom_metrics):
         kwargs["custom_metrics"] = [custom_metrics]
```

🧰 Tools
🪛 Ruff (0.8.2)
587-587: Use `kwargs.get("custom_metrics")` instead of `kwargs.get("custom_metrics", None)` (SIM910)
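The SIM910 point generalizes beyond this PR: `dict.get(key)` already returns `None` for a missing key, so the explicit `None` default is redundant. A quick self-contained illustration (generic dict, not the PR's actual `kwargs`):

```python
kwargs = {"eval_key": "eval"}

# The two forms are equivalent when the default is None
assert kwargs.get("custom_metrics") is None
assert kwargs.get("custom_metrics") == kwargs.get("custom_metrics", None)

# The second argument only matters for a non-None default
assert kwargs.get("custom_metrics", []) == []

# Present keys are unaffected either way
assert kwargs.get("eval_key") == "eval"
```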
fiftyone/utils/eval/base.py (2)

66-68: Access `lower_is_better` directly from the config.

Instead of accessing `lower_is_better` through `operator.config.kwargs.get(...)`, you should access it directly as an attribute of `operator.config`, since it's already defined in the `EvaluationMetricConfig`. Apply this diff:

```diff
 results.custom_metrics[operator.uri] = {
     "value": value,
     "label": operator.config.label,
-    "lower_is_better": operator.config.kwargs.get(
-        "lower_is_better", True
-    ),
+    "lower_is_better": operator.config.lower_is_better,
 }
```

80-80: Simplify iteration over dictionary keys.

When iterating over a dictionary, you can iterate directly without calling `.keys()`, as this is the default behavior in Python. This makes the code more concise and Pythonic. Apply these diffs:

At line 80:
```diff
 def get_custom_metric_fields(self, samples, eval_key):
     fields = []
-    for metric in self._get_custom_metrics().keys():
+    for metric in self._get_custom_metrics():
```

At line 96:
```diff
 def rename_custom_metrics(self, samples, eval_key, new_eval_key):
-    for metric in self._get_custom_metrics().keys():
+    for metric in self._get_custom_metrics():
```

At line 108:
```diff
 def cleanup_custom_metrics(self, samples, eval_key):
-    for metric in self._get_custom_metrics().keys():
+    for metric in self._get_custom_metrics():
```

Also applies to: 96-96, 108-108

🧰 Tools
🪛 Ruff (0.8.2)
80-80: Use `key in dict` instead of `key in dict.keys()` — remove `.keys()` (SIM118)
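For reference, iterating a dict yields its keys directly, so `.keys()` adds nothing, and `.items()` is the form to reach when values are also needed. A generic illustration (the dict contents here are made up, not the PR's data):

```python
metrics = {"metric_a": {"value": 1}, "metric_b": {"value": 2}}

# Iterating the dict itself yields keys, same as .keys()
assert list(metrics) == list(metrics.keys()) == ["metric_a", "metric_b"]

# Membership tests likewise work on the dict directly
assert "metric_a" in metrics

# Use .items() when both key and value are needed
pairs = [(k, v["value"]) for k, v in metrics.items()]
print(pairs)  # [('metric_a', 1), ('metric_b', 2)]
```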
fiftyone/utils/eval/detection.py (1)

641-641: Simplify usage of `kwargs.get()`.

You can remove the default value `None`, since `kwargs.get("custom_metrics")` returns `None` by default. Apply this diff to simplify the code:

```diff
-    custom_metrics = kwargs.get("custom_metrics", None)
+    custom_metrics = kwargs.get("custom_metrics")
```

🧰 Tools
🪛 Ruff (0.8.2)
641-641: Use `kwargs.get("custom_metrics")` instead of `kwargs.get("custom_metrics", None)` (SIM910)
fiftyone/utils/eval/classification.py (1)

926-926: Simplify usage of `kwargs.get()`.

You can remove the default value `None`, since `kwargs.get("custom_metrics")` returns `None` by default. Apply this diff to simplify the code:

```diff
-    custom_metrics = kwargs.get("custom_metrics", None)
+    custom_metrics = kwargs.get("custom_metrics")
```

🧰 Tools
🪛 Ruff (0.8.2)
926-926: Use `kwargs.get("custom_metrics")` instead of `kwargs.get("custom_metrics", None)` (SIM910)
app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (1)

1-1: Use optional chaining instead of lodash `get` to simplify code.

Optional chaining and nullish coalescing operators can replace `_.get`, reducing dependencies and simplifying the code. Apply this diff to refactor the code:

```diff
-import _ from "lodash";
+// lodash import is no longer needed
...
-const customMetrics = _.get(
-  evaluationMetrics,
-  "custom_metrics",
-  {}
-) as CustomMetrics;
+const customMetrics = (evaluationMetrics.custom_metrics ?? {}) as CustomMetrics;
...
-const compareValue = _.get(
-  comparisonMetrics,
-  `custom_metrics.${operatorUri}.value`,
-  null
-);
+const compareValue = comparisonMetrics?.custom_metrics?.[operatorUri]?.value ?? null;
```

Also applies to: 1785-1795
fiftyone/operators/__init__.py (1)

15-15: Add imported classes to `__all__` to resolve unused-import warnings.

The imported classes `EvaluationMetricConfig` and `EvaluationMetric` should be added to `__all__`, since they are intended to be part of the public API.

```diff
-from .evaluation_metric import EvaluationMetricConfig, EvaluationMetric
+from .evaluation_metric import EvaluationMetricConfig, EvaluationMetric
+
+__all__ = [k for k, v in globals().items() if not k.startswith("_")] + [
+    "EvaluationMetricConfig",
+    "EvaluationMetric",
+]
```

🧰 Tools
🪛 Ruff (0.8.2)
15-15: `.evaluation_metric.EvaluationMetricConfig` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
15-15: `.evaluation_metric.EvaluationMetric` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
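For background on the F401 fix: `__all__` controls what `from package import *` exposes, which is why adding the names there both documents the public API and silences the unused-import warning. A self-contained sketch using a synthetic module (the module and function names here are hypothetical):

```python
import sys
import types

# Build a throwaway module that defines three names but exports only one
mod = types.ModuleType("fake_pkg")
exec(
    "def public(): return 1\n"
    "def _private(): return 2\n"
    "helper = 3\n"
    "__all__ = ['public']\n",
    mod.__dict__,
)
sys.modules["fake_pkg"] = mod

# `import *` honors __all__: only 'public' lands in the namespace
ns = {}
exec("from fake_pkg import *", ns)
assert "public" in ns
assert "helper" not in ns and "_private" not in ns
```

Without `__all__`, star-import would also pull in `helper` (any name not starting with an underscore), so declaring `__all__` makes the exported surface explicit.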
fiftyone/utils/eval/segmentation.py (1)

564-567: Optimize the `kwargs.get()` call.

The `None` default value in `kwargs.get("custom_metrics", None)` is redundant, as it's the default return value of `dict.get()`.

```diff
-    custom_metrics = kwargs.get("custom_metrics", None)
+    custom_metrics = kwargs.get("custom_metrics")
```

🧰 Tools
🪛 Ruff (0.8.2)
564-564: Use `kwargs.get("custom_metrics")` instead of `kwargs.get("custom_metrics", None)` (SIM910)
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📒 Files selected for processing (13)
- app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (3 hunks)
- fiftyone/operators/__init__.py (1 hunks)
- fiftyone/operators/evaluation_metric.py (1 hunks)
- fiftyone/operators/registry.py (4 hunks)
- fiftyone/utils/eval/activitynet.py (5 hunks)
- fiftyone/utils/eval/base.py (6 hunks)
- fiftyone/utils/eval/classification.py (20 hunks)
- fiftyone/utils/eval/coco.py (5 hunks)
- fiftyone/utils/eval/detection.py (16 hunks)
- fiftyone/utils/eval/openimages.py (5 hunks)
- fiftyone/utils/eval/regression.py (18 hunks)
- fiftyone/utils/eval/segmentation.py (17 hunks)
- plugins/panels/model_evaluation/__init__.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (1)
Pattern `**/*.{ts,tsx}`: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

🪛 Ruff (0.8.2)
fiftyone/operators/__init__.py
- 15-15: `.evaluation_metric.EvaluationMetricConfig` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
- 15-15: `.evaluation_metric.EvaluationMetric` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)

fiftyone/utils/eval/detection.py
- 641-641: Use `kwargs.get("custom_metrics")` instead of `kwargs.get("custom_metrics", None)` (SIM910)

fiftyone/utils/eval/segmentation.py
- 564-564: Use `kwargs.get("custom_metrics")` instead of `kwargs.get("custom_metrics", None)` (SIM910)

fiftyone/utils/eval/base.py
- 80-80: Use `key in dict` instead of `key in dict.keys()` (SIM118)
- 96-96: Use `key in dict` instead of `key in dict.keys()` (SIM118)
- 108-108: Use `key in dict` instead of `key in dict.keys()` (SIM118)

fiftyone/utils/eval/classification.py
- 926-926: Use `kwargs.get("custom_metrics")` instead of `kwargs.get("custom_metrics", None)` (SIM910)

fiftyone/utils/eval/regression.py
- 587-587: Use `kwargs.get("custom_metrics")` instead of `kwargs.get("custom_metrics", None)` (SIM910)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build
🔇 Additional comments (11)

fiftyone/operators/registry.py (1)
9-10: LGTM! The changes enhance operator filtering capabilities. The implementation correctly handles both string and class type parameters, with proper type conversion and error handling. Also applies to: 46-47, 91-92, 111-114

plugins/panels/model_evaluation/__init__.py (1)
149-154: LGTM! The custom metrics integration is well implemented. The implementation includes proper error handling and integrates seamlessly with the existing evaluation data structure. Also applies to: 373-373

fiftyone/utils/eval/activitynet.py (1)
43-44: LGTM! Custom metrics support is properly integrated. The implementation includes clear documentation and correctly propagates custom metrics to the parent class. Also applies to: 55-55, 59-64

fiftyone/utils/eval/segmentation.py (4)
41-42: LGTM! The `custom_metrics` parameter is well placed and properly documented.
145-146: LGTM! The documentation for the `custom_metrics` parameter is clear and follows the established format.
149-161: LGTM! The `custom_metrics` parameter is properly integrated into the constructor and correctly initialized through the superclass.
514-520: LGTM! The addition of `custom_metrics` to the attributes list is appropriate and maintains consistency.

fiftyone/utils/eval/coco.py (2)
70-71: LGTM! The documentation for the `custom_metrics` parameter maintains consistency with other evaluation configs.
92-97: LGTM! The `custom_metrics` parameter is correctly passed to the superclass constructor.

fiftyone/utils/eval/openimages.py (2)
72-73: LGTM! The documentation for the `custom_metrics` parameter maintains consistency with other evaluation configs.
97-102: LGTM! The `custom_metrics` parameter is correctly passed to the superclass constructor.
Force-pushed: e98cea0 → c656853
Actionable comments posted: 1
🧹 Nitpick comments (3)
fiftyone/operators/evaluation_metric.py (3)
25-37: Add type hints to improve code maintainability.

Consider adding type hints to the configuration parameters for better code maintainability and IDE support.

```diff
 def __init__(
     self,
-    name,
-    label=None,
-    description=None,
-    eval_types=None,
-    lower_is_better=True,
+    name: str,
+    label: str | None = None,
+    description: str | None = None,
+    eval_types: list[str] | None = None,
+    lower_is_better: bool = True,
     **kwargs,
 ):
```

43-43: Fix typo in docstring.

There's a typo, "evaluaiton", in the docstring.

```diff
-    """Defines any necessary properties to collect this evaluaiton metric's
+    """Defines any necessary properties to collect this evaluation metric's
```

62-73: Consider using `@abstractmethod` for required implementations.

Since `compute` must be implemented by subclasses, consider using the `@abstractmethod` decorator to enforce this at the type level.

```diff
+from abc import abstractmethod

+@abstractmethod
 def compute(self, samples, results, **kwargs):
```
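For context on the suggestion above: `@abstractmethod` only enforces implementation when the class's metaclass is `ABCMeta` (e.g. by inheriting `abc.ABC`), and the failure then happens at construction time rather than on the first call. A generic sketch, not the PR's actual class:

```python
from abc import ABC, abstractmethod


class Metric(ABC):
    @abstractmethod
    def compute(self, samples, results, **kwargs):
        """Subclasses must implement this."""


class MeanError(Metric):
    def compute(self, samples, results, **kwargs):
        # Mean absolute error over (prediction, ground truth) pairs
        return sum(abs(p - g) for p, g in samples) / len(samples)


# Instantiating the abstract base fails immediately...
try:
    Metric()
except TypeError as e:
    print("TypeError:", e)

# ...while a concrete subclass works as usual
assert MeanError().compute([(1.0, 0.0), (3.0, 2.0)], None) == 1.0
```

Without `ABC`, the decorator is purely advisory and the subclass would only fail when `compute()` is actually invoked.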
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📒 Files selected for processing (8)
- app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (3 hunks)
- fiftyone/operators/__init__.py (1 hunks)
- fiftyone/operators/evaluation_metric.py (1 hunks)
- fiftyone/operators/registry.py (4 hunks)
- fiftyone/utils/eval/base.py (6 hunks)
- fiftyone/utils/eval/regression.py (18 hunks)
- plugins/panels/model_evaluation/__init__.py (2 hunks)
- tests/unittests/operators/registry_tests.py (2 hunks)
👮 Files not reviewed due to content moderation or server errors (4)
- fiftyone/utils/eval/regression.py
- plugins/panels/model_evaluation/__init__.py
- fiftyone/utils/eval/base.py
- app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx
🧰 Additional context used
📓 Path-based instructions (1)
app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (1)
Pattern `**/*.{ts,tsx}`: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

🪛 Ruff (0.8.2)
fiftyone/utils/eval/regression.py
- 587-587: Use `kwargs.get("custom_metrics")` instead of `kwargs.get("custom_metrics", None)` (SIM910)

fiftyone/utils/eval/base.py
- 80-80: Use `key in dict` instead of `key in dict.keys()` (SIM118)
- 96-96: Use `key in dict` instead of `key in dict.keys()` (SIM118)
- 108-108: Use `key in dict` instead of `key in dict.keys()` (SIM118)

fiftyone/operators/__init__.py
- 15-15: `.evaluation_metric.EvaluationMetricConfig` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
- 15-15: `.evaluation_metric.EvaluationMetric` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
- GitHub Check: test / test-app
- GitHub Check: build / build
- GitHub Check: e2e / test-e2e
- GitHub Check: build
🔇 Additional comments (5)

tests/unittests/operators/registry_tests.py (2)
12-23: LGTM! Improved readability with better indentation. The restructured mock setup is now more readable and maintainable.
55-56: LGTM! Consistent string quotes. The change to double quotes maintains consistency with the project's style.

fiftyone/operators/evaluation_metric.py (1)
89-109: LGTM! Robust field-renaming implementation. The `rename` method handles both sample and frame fields correctly, with proper error handling.

fiftyone/operators/registry.py (2)
111-114: LGTM! Robust type handling implementation. The implementation properly handles both string-based type names and actual class types, making the API more flexible.
46-47: LGTM! Improved documentation clarity. The docstring updates clearly explain the enhanced type-filtering capabilities. Also applies to: 91-92
```diff
@@ -146,6 +146,12 @@ def get_mar(self, results):
         except Exception as e:
             return None

+    def get_custom_metrics(self, results):
+        try:
+            return results.custom_metrics
```
We are still missing metric_property that @ritch suggested: #5389 (comment)
results.custom_metrics should return a list of CustomMetric objects. Working on adding those on top of these changes.
I don't see value in further modifications to this PR. `results.custom_metrics` is currently JSON in roughly the same shape as what @ritch suggested in #5389 (comment), so it achieves the same thing, just without the class interface. That seems fine to me, as `results.custom_metrics` is not intended for direct usage by SDK users. It's just an internal data structure to support the Model Evaluation panel.

Note that I refactored so that custom metrics are included in `results.metrics()` and `results.print_metrics()`. So I would argue those are the only things that SDK users should need to care about.

Also note that introducing a class might be complex, as `results` need to be serializable.
Deferring to @ritch for the app side changes.
Looks good to me on the SDK side. I am okay with merging this without the modifications proposed for the app.
> I don't see value in further modifications to this PR. results.custom_metrics is currently JSON in roughly the same shape as what @ritch suggested in #5389 (comment), so it achieves the same thing, just without the class interface.

If devs end up using this feature, we can add the ability to define a `fiftyone.operator.types.Property` and use that to render the value. -1 for an ad hoc way to define labels, tooltips, type info, min/max values, max number of digits to display, etc. That should all be defined via an operator property. That type system already supports a lot of these and is where we should continue to implement this sort of thing (metadata for UI).
LGTM. Ran `evaluate_detections` (app + SDK) and `evaluate_regression` (SDK) with and without custom metrics.
Actionable comments posted: 0
🧹 Nitpick comments (3)
fiftyone/utils/eval/base.py (3)

54-81: Consider moving the initialization of `custom_metrics` outside the loop.

While the implementation is solid, initializing `results.custom_metrics` inside the loop for each successful computation is inefficient. Consider initializing it once before the loop if any custom metrics are configured.

```diff
 def compute_custom_metrics(self, samples, eval_key, results):
+    if self._get_custom_metrics():
+        results.custom_metrics = {}
     for metric, kwargs in self._get_custom_metrics().items():
         try:
             operator = foo.get_operator(metric)
             value = operator.compute(samples, results, **kwargs or {})
             if value is not None:
-                if results.custom_metrics is None:
-                    results.custom_metrics = {}
                 key = operator.config.aggregate_key
```

85-85: Optimize dictionary key lookup operations.

Replace `.keys()` with direct iteration over the dictionary.

```diff
-for metric in self._get_custom_metrics().keys():
+for metric in self._get_custom_metrics():
```

Also applies to: 101-101, 113-113

🧰 Tools
🪛 Ruff (0.8.2)
85-85: Use `key in dict` instead of `key in dict.keys()` (SIM118)

179-180: Rename unused loop variable for better code clarity.

The loop variable `uri` is not used within the loop body.

```diff
-for uri, metric in self.custom_metrics.items():
+for _, metric in self.custom_metrics.items():
```

🧰 Tools
🪛 Ruff (0.8.2)
179-179: Loop control variable `uri` not used within loop body; rename unused `uri` to `_uri` (B007)
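A side note on the B007 fix: when the key truly isn't needed, iterating `dict.values()` avoids the unused loop variable altogether. A generic sketch with made-up metric entries, not the PR's code:

```python
custom_metrics = {
    "@voxel51/metric-a": {"value": 0.9, "label": "Metric A"},
    "@voxel51/metric-b": {"value": 0.4, "label": "Metric B"},
}

# Iterate values directly instead of unpacking a key that is never used
labels = [metric["label"] for metric in custom_metrics.values()]
print(labels)  # ['Metric A', 'Metric B']
```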
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📒 Files selected for processing (3)
- app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (3 hunks)
- fiftyone/operators/evaluation_metric.py (1 hunks)
- fiftyone/utils/eval/base.py (6 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)
- fiftyone/operators/evaluation_metric.py

🧰 Additional context used
📓 Path-based instructions (1)
app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (1)
Pattern `**/*.{ts,tsx}`: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

🪛 Ruff (0.8.2)
fiftyone/utils/eval/base.py
- 85-85: Use `key in dict` instead of `key in dict.keys()` (SIM118)
- 101-101: Use `key in dict` instead of `key in dict.keys()` (SIM118)
- 113-113: Use `key in dict` instead of `key in dict.keys()` (SIM118)
- 179-179: Loop control variable `uri` not used within loop body; rename unused `uri` to `_uri` (B007)
🔇 Additional comments (5)

fiftyone/utils/eval/base.py (3)
45-53: LGTM! Clean implementation of custom metrics retrieval. The method handles both list and dict formats elegantly, with safe access to config attributes.
82-99: LGTM! Well-structured field collection with proper error handling. The method efficiently collects fields from all operators while gracefully handling any errors that occur during field retrieval.
100-111: LGTM! Consistent error handling in the rename operation. The implementation follows the established pattern of safe operator access and error handling.

app/packages/core/src/plugins/SchemaIO/components/NativeModelEvaluationView/Evaluation.tsx (2)
1762-1782: LGTM! Well-structured TypeScript type definitions. The types are clearly defined with appropriate properties and good use of TypeScript features.
1784-1811: LGTM! Robust implementation of custom metrics formatting. The function safely handles null values, uses proper type annotations, and follows best practices for object property access using lodash.
See voxel51/fiftyone-plugins#190