-
Notifications
You must be signed in to change notification settings - Fork 589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for selecting/excluding group slices #5198
base: develop
Are you sure you want to change the base?
Conversation
WalkthroughThis pull request introduces enhancements to the FiftyOne framework's grouped datasets functionality. The changes span multiple files, focusing on improving group slice management, documentation, and dataset view handling. Key modifications include adding methods to exclude group slices, updating the selection of group slices, and enhancing the documentation to provide clearer guidance on working with grouped datasets. The updates aim to provide more flexible and intuitive ways of manipulating and interacting with grouped data collections. Changes
Possibly related PRs
Suggested Labels
Suggested Reviewers
Poem
Tip CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
24359cd
to
73a775b
Compare
73a775b
to
9b8c693
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
fiftyone/core/stages.py (1)
5037-5040
: Consider simplifying the if/else
Use a ternary operator for concise readability.Proposed change:
-if etau.is_str(self._slices): - slices = {self._slices} -else: - slices = set(self._slices) +slices = {self._slices} if etau.is_str(self._slices) else set(self._slices)🧰 Tools
🪛 Ruff (0.8.2)
5037-5040: Use ternary operator
slices = {self._slices} if etau.is_str(self._slices) else set(self._slices)
instead ofif
-else
-blockReplace
if
-else
-block withslices = {self._slices} if etau.is_str(self._slices) else set(self._slices)
(SIM108)
fiftyone/__public__.py (1)
201-201
: Remove or utilize the unused import "ExcludeGroupSlices"
Static analysis indicates that "ExcludeGroupSlices" is imported on line 201 but never used. If this import is genuinely unnecessary, you can safely remove it to keep the codebase clean. Otherwise, consider referencing the symbol to avoid the unused-import warning.🧰 Tools
🪛 Ruff (0.8.2)
201-201:
.core.stages.ExcludeGroupSlices
imported but unusedRemove unused import
(F401)
tests/unittests/group_tests.py (1)
645-764
: Consider extracting common assertion patterns into helper methods.While the test is comprehensive and well-structured, there are repeated assertion patterns that could be extracted into helper methods to improve maintainability and reduce code duplication. For example:
+ def _assert_group_view_state(self, view, expected_len, expected_slices, expected_media_types): + self.assertEqual(len(view), expected_len) + self.assertEqual(view.media_type, "group") + self.assertSetEqual(set(view.group_slices), set(expected_slices)) + self.assertDictEqual(view.group_media_types, expected_media_types) + self.assertIn(view.group_slice, expected_slices) + self.assertIn(view.default_group_slice, expected_slices) def test_select_exclude_slices(self): dataset = _make_group_dataset() # Select slices by name view = dataset.select_group_slices(["left", "right"], flat=False) - self.assertEqual(len(view), 2) - self.assertEqual(view.media_type, "group") - self.assertSetEqual(set(view.group_slices), {"left", "right"}) - self.assertDictEqual( - view.group_media_types, {"left": "image", "right": "image"} - ) - self.assertIn(view.group_slice, ["left", "right"]) - self.assertIn(view.default_group_slice, ["left", "right"]) + self._assert_group_view_state( + view, + expected_len=2, + expected_slices=["left", "right"], + expected_media_types={"left": "image", "right": "image"} + )This would make the test more maintainable and easier to update if the assertions need to change.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
docs/source/user_guide/groups.rst
(3 hunks)fiftyone/__public__.py
(1 hunks)fiftyone/core/collections.py
(6 hunks)fiftyone/core/dataset.py
(2 hunks)fiftyone/core/stages.py
(16 hunks)fiftyone/core/view.py
(13 hunks)tests/unittests/group_tests.py
(2 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
fiftyone/__public__.py
201-201: .core.stages.ExcludeGroupSlices
imported but unused
Remove unused import
(F401)
fiftyone/core/stages.py
5037-5040: Use ternary operator slices = {self._slices} if etau.is_str(self._slices) else set(self._slices)
instead of if
-else
-block
Replace if
-else
-block with slices = {self._slices} if etau.is_str(self._slices) else set(self._slices)
(SIM108)
🔇 Additional comments (41)
fiftyone/core/view.py (15)
165-165
: Looks good!
This logic correctly returns False when the stage flattens groups.
247-247
: No issues found.
This fallback expression is appropriate.
274-274
: Straightforward return statement
This line upholds clarity by returning the group media types’ keys.
284-284
: Minor note
The code cleanly returns the group media types without issues.
294-296
: No issues found
Returning the slice from the grouped dataset if the default slice doesn’t exist is logical.
1563-1563
: No concerns
Context variable "_found_flattened_videos" is introduced without any apparent problems.
1579-1584
: Looks correct
The code checks whether groups are flattened, ensuring correct frame reattachment logic.
1612-1612
: Comment block only
No review comment needed.
1617-1617
: Comment only
No direct code to review.
1651-1652
: Logic is appropriate
Manually attaching frames after group lookup is valid.
1704-1704
: No issues
Correctly setting the group slice.
1817-1825
: Code reads well
This handles updated group media types and adjusts the slice if needed.
1833-1835
: Simple setter
No problems with the group slice assignment.
1939-1949
: All good
The function iterates over stages, updating group media types. No issues.
1976-1976
: Trivial single-line snippet
No further remarks.
fiftyone/core/stages.py (8)
101-112
: Clear docstring
Property explains when the stage does or does not flatten groups.
225-240
: Method is well-defined
Provides group media types or returns None. Looks consistent.
Line range hint 4581-4654
: Documentation only
Explains SelectGroupSlices with examples. No code issues found.
4666-4668
: Docstring lines
No issues with these doc comments.
4932-5082
: Implementation looks coherent
This new class ExcludeGroupSlices is consistent with the codebase.
🧰 Tools
🪛 Ruff (0.8.2)
5037-5040: Use ternary operator slices = {self._slices} if etau.is_str(self._slices) else set(self._slices)
instead of if
-else
-block
Replace if
-else
-block with slices = {self._slices} if etau.is_str(self._slices) else set(self._slices)
(SIM108)
8847-8847
: No actionable code
Just a list containing ExcludeGroupSlices.
8895-8895
: No actionable code
Simple registry addition for ExcludeGroupSlices.
8912-8912
: No actionable code
Again, references the new class in the registry.
fiftyone/core/dataset.py (1)
8494-8496
: Enforce restriction on dynamic grouped views
This check effectively prevents cloning datasets that rely on dynamic grouping logic. By raising an exception here, the code avoids deeper inconsistencies in the pipeline. The implementation is straightforward and properly guards against a feature that isn't supported.
docs/source/user_guide/groups.rst (14)
818-819
: Clarify the usage of flat vs. grouped views
These lines efficiently explain how users can select individual slices from their grouped datasets and flatten them into ungrouped collections. The documentation is concise and appears accurate.
846-846
: Highlighting multi-slice selection
This line clarifies that you can select multiple slices (e.g., "left" and "right"). It is clear and helpful.
885-891
: Emphasize that prior filters don't affect unseen slices
These lines accurately warn that filtering a single slice doesn't automatically carry over to other slices. This is a crucial note for new users.
892-901
: Encourage filtering after flattening
This explanation neatly instructs users to apply filters post-flattening if their goal is to filter slices collectively. It's well-structured.
902-909
: Good demonstration of layering multiple filters
The code snippet clarifies how to refine the flattened collection further with additional filters. No issues noted.
910-911
: Reiterating usage of select_group_slices with flat=False
This reference is consistent with earlier docs. Clear and consistent.
912-912
: Small segue to next section
This line is functioning as a transition. No issues.
917-917
: Loading a grouped view while excluding slices
Providing an example that merges well with the new features. Good clarity.
919-920
: Maintaining grouped media type
Line references demonstrate the dataset remains “group” after excluding a slice. Looks accurate.
922-923
: New subheading for excluding slices
The heading is appropriately placed and clearly states the feature’s purpose.
924-928
: Intro to exclude_group_slices
The explanation is direct and succinct. Perfect for users who need to exclude certain slices.
929-929
: Transition line
No issues found with this bridging text.
934-935
: Excluding a single slice
Well-structured code snippet showcasing removing "center." The example is consistent with the new method.
936-937
: Validation of new slices post exclusion
The pair of assertions is straightforward, verifying the reduced set of slices.
tests/unittests/group_tests.py (1)
591-602
: LGTM! Well-structured schema validation test.
The test effectively validates that schema changes are maintained when selecting active slices, with thorough checks for field removal from both sample and frame schemas.
fiftyone/core/collections.py (2)
4621-4694
: LGTM! Well-implemented group slice exclusion functionality
The new exclude_group_slices()
method provides a clean and intuitive way to exclude specific group slices or media types from a collection. The implementation is robust and the documentation includes clear examples.
Line range hint 6671-6773
: LGTM! Good enhancement to select_group_slices with flat parameter
The addition of the flat
parameter to select_group_slices()
provides more flexibility by allowing users to choose between flattened and grouped views. The implementation maintains backward compatibility by defaulting to flat=True
.
Change log
select_group_slices(..., flat=False)
syntax for selecting group slices w/o flatteningexclude_group_slices()
method for excluding group slices (also w/o flattening, as this is the only behavior that makes sense when excluding group slices in general)TODO
view.group_slices
/view.group_media_types
rather thandataset.group_slices
/dataset.group_media_types
, as the former may no longer always match the latterExample usage
Summary by CodeRabbit
New Features
ExcludeGroupSlices
functionality for filtering group slices.exclude_group_slices
method to theSampleCollection
class for excluding specified group slices.ViewStage
class to control group flattening and media type retrieval.Bug Fixes
Dataset
class.Tests