
ExecutionStore + MongoDB integration tests and fixes #5095

Merged
merged 13 commits
Nov 14, 2024

Conversation

Contributor

@ritch ritch commented Nov 12, 2024

Summary by CodeRabbit

  • New Features

    • Introduced dynamic collection name support in the execution store repository.
    • Added methods for checking key existence and deleting stores globally.
    • Enhanced store creation with optional metadata.
    • New property for accessing metadata in store documents.
  • Bug Fixes

    • Improved error handling in TTL retrieval logic.
  • Tests

    • Comprehensive unit tests added for ExecutionStoreService functionalities.
    • Updated existing tests for consistency and clarity.


coderabbitai bot commented Nov 12, 2024

Caution

Review failed

The pull request is closed.

Walkthrough

This pull request modifies several classes and methods in the FiftyOne library, primarily to enhance execution store functionality. Key updates include new method parameters, renamed methods, and new methods for managing stores and keys. Unit tests have also been updated for clarity and consistency, ensuring comprehensive coverage of the new functionality.

Changes

File Path Change Summary
fiftyone/factory/repo_factory.py Updated execution_store_repo method to add collection_name parameter and modified collection retrieval logic.
fiftyone/factory/repos/execution_store.py Enhanced ExecutionStoreRepo with new methods and updated existing ones, including create_store and get_store.
fiftyone/operators/store/models.py Modified KeyDocument and StoreDocument classes; added metadata property to StoreDocument.
fiftyone/operators/store/service.py Updated ExecutionStoreService to include new methods and an instance variable for collection_name.
fiftyone/operators/store/store.py Changed logic in get_ttl method to check for key existence before accessing TTL.
tests/unittests/decorators.py Introduced drop_collection decorator for test management.
tests/unittests/execution_store_service_tests.py Added comprehensive unit tests for ExecutionStoreService, validating various functionalities.
tests/unittests/execution_store_unit_tests.py Removed test_has_store method and updated assertions for consistency with dataset_id.

Possibly related PRs

  • Execution Store #4827: enhances the ExecutionStoreRepo class, which this PR's changes to the execution_store_repo method and its handling of dataset IDs build upon.
  • add dataset link for estores #5062: introduces a cleanup function for execution stores associated with datasets, aligning with this PR's dataset-scoped store management.
  • Exection store interface cleanup #5086: cleans up the execution store interface, including the ExecutionStoreRepo class and its methods, directly related to this PR's changes to execution_store_repo.

Suggested labels

feature, bug

Suggested reviewers

  • brimoor
  • imanjra

🐰 In the fields where data grows,
New stores and keys, how the logic flows!
With tests that check and ensure we’re right,
Our code shines bright, a delightful sight!
So hop along, let’s celebrate,
For changes made, we elevate! 🌼


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@ritch ritch changed the base branch from develop to release/v1.1.0 November 12, 2024 17:35
@ritch ritch marked this pull request as ready for review November 12, 2024 19:18

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (11)
fiftyone/operators/store/models.py (1)

54-58: Consider adding type assertion for defensive programming.

The implementation looks good and provides a safe way to access the store's metadata. However, for additional safety, consider adding a type assertion to ensure self.value is a dictionary when it's not None.

     @property
     def metadata(self) -> dict[str, Any]:
         """The metadata associated with the store."""
-        return self.value or {}
+        if self.value is not None:
+            assert isinstance(self.value, dict), "Store value must be a dictionary"
+            return self.value
+        return {}
fiftyone/factory/repo_factory.py (1)

66-72: Consider implementing cache management

The repository caching logic works correctly, but the repos dictionary could grow unbounded as new dataset_ids are added. Consider implementing a cache eviction strategy or size limit.

Example implementation:

class RepositoryFactory(object):
    MAX_CACHE_SIZE = 1000  # Adjust based on your needs
    repos = {}

    @staticmethod
    def _evict_cache_if_needed():
        if len(RepositoryFactory.repos) >= RepositoryFactory.MAX_CACHE_SIZE:
            # LRU, FIFO, or other eviction strategy
            RepositoryFactory.repos.pop(next(iter(RepositoryFactory.repos)))
tests/unittests/decorators.py (1)

48-49: Enhance docstring with pattern parameter details

The docstring should document the pattern parameter and its usage.

-    """Decorator that drops all stores from the database before running a test."""
+    """Decorator that drops stores from the database before running a test.
+
+    Args:
+        func: The test function to wrap
+        pattern (str): Unix shell-style wildcard pattern to filter store names.
+            Defaults to "*" which matches all stores.
+    """
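For reference, the pattern argument described above uses Python's standard `fnmatch` semantics; a quick, self-contained sketch of what such patterns match (the store names here are illustrative, not from the test suite):

```python
import fnmatch

# Illustrative store names; the pattern arg uses Unix shell-style wildcards
store_names = ["test_store", "test_cache", "prod_store"]

matched = [name for name in store_names if fnmatch.fnmatch(name, "test_*")]
assert matched == ["test_store", "test_cache"]

# The default pattern "*" matches every store
assert all(fnmatch.fnmatch(name, "*") for name in store_names)
```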
tests/unittests/execution_store_service_tests.py (5)

1-7: Enhance module docstring with more details

Consider expanding the docstring to include:

  • Purpose and scope of these tests
  • Brief description of ExecutionStoreService
  • Example usage if applicable
 """
-FiftyOne execution store related unit tests.
+Unit tests for FiftyOne's ExecutionStoreService.
+
+These tests verify the functionality of execution stores, including:
+- Store creation and management
+- Key-value operations
+- Dataset-scoped stores
+- Metadata handling
 
 | Copyright 2017-2024, Voxel51, Inc.
 | `voxel51.com <https://voxel51.com/>`_

17-27: Add docstrings to fixtures

The fixtures would benefit from docstrings explaining their purpose and return values.

 @pytest.fixture
 def svc():
+    """
+    Returns a clean ExecutionStoreService instance for global store testing.
+    
+    Returns:
+        ExecutionStoreService: A service instance without dataset context
+    """
     return ExecutionStoreService()

 @pytest.fixture
 def svc_with_dataset():
+    """
+    Returns an ExecutionStoreService instance with a test dataset context.
+    
+    Yields:
+        ExecutionStoreService: A service instance with dataset context
+    """
     dataset = fo.Dataset(name="test_dataset")
     dataset.save()
     yield ExecutionStoreService(dataset_id=dataset._doc.id)

94-113: Split complex test into smaller, focused tests

The test_list_global_stores function tests multiple scenarios. Consider splitting it into separate test functions for better clarity and maintenance.

-def test_list_global_stores(svc, svc_with_dataset):
+def test_list_global_stores_with_dataset_less_store(svc):
     NO_DATASET_STORE_NAME = "dataset_less_store"
-    DATASET_STORE_NAME = "dataset_store"
-    KEY_ONLY_STORE_NAME = "key_only_store"
 
     svc.create_store(NO_DATASET_STORE_NAME)
+    global_list = svc.list_stores_global()
+    assert len(global_list) == 1
+    assert global_list[0].store_name == NO_DATASET_STORE_NAME
+    assert global_list[0].dataset_id is None
+
+def test_list_global_stores_with_dataset_store(svc_with_dataset):
+    DATASET_STORE_NAME = "dataset_store"
     svc_with_dataset.create_store(DATASET_STORE_NAME)
     svc_with_dataset.set_key(DATASET_STORE_NAME, "key", "value")
-    svc_with_dataset.set_key(KEY_ONLY_STORE_NAME, "key", "value")
 
     global_list = svc.list_stores_global()
-    store_names = [store.store_name for store in global_list]
-    dataset_ids = [store.dataset_id for store in global_list]
-    assert len(global_list) == 3
-    assert NO_DATASET_STORE_NAME in store_names
-    assert DATASET_STORE_NAME in store_names
-    assert KEY_ONLY_STORE_NAME in store_names
-    assert None in dataset_ids
-    assert svc_with_dataset._dataset_id in dataset_ids
+    assert len(global_list) == 1
+    assert global_list[0].store_name == DATASET_STORE_NAME
+    assert global_list[0].dataset_id == svc_with_dataset._dataset_id

31-42: Define test constants at module level

Common test values like store names and metadata could be defined as module-level constants to improve maintainability.

+# Test constants
+TEST_STORE_NAME = "test_store"
+TEST_METADATA = {"test": "value"}
+
 @drop_stores
 @drop_datasets
 def test_store_creation(svc):
-    NAME = "test_store"
-    created_store = svc.create_store(NAME)
+    created_store = svc.create_store(TEST_STORE_NAME)
 
     assert (
-        created_store.store_name == NAME
+        created_store.store_name == TEST_STORE_NAME
     ), "Store name should match the given name"

Also applies to: 61-74


84-89: Improve assertion messages

Some assertions could benefit from more descriptive error messages that explain the expected behavior in more detail.

     assert (
         svc.count_keys(NAME) == 1
-    ), "Store should have 1 key after setting it"
+    ), f"Store '{NAME}' should have exactly 1 key after setting '{KEY}'"
     assert (
         svc.get_key(NAME, KEY).value == VALUE
-    ), "Retrieved value should match the set value"
+    ), f"Retrieved value for key '{KEY}' should be '{VALUE}' but got '{svc.get_key(NAME, KEY).value}'"

Also applies to: 132-134

fiftyone/operators/store/service.py (2)

Line range hint 53-64: Update docstring to include the metadata parameter

The docstring should be updated to document the new metadata parameter.

Apply this diff to update the documentation:

     def create_store(
         self, store_name: str, metadata: Optional[dict[str, Any]] = None
     ) -> StoreDocument:
         """Creates a new store with the specified name.
 
         Args:
             store_name: the name of the store
+            metadata: optional metadata to associate with the store
 
         Returns:
             a :class:`fiftyone.store.models.StoreDocument`
         """

132-139: Add return type to the docstring

The docstring should document the return type of the method.

Apply this diff to complete the documentation:

     def has_key(self, store_name: str, key: str) -> bool:
         """Determines whether the specified key exists in the specified store.
 
         Args:
             store_name: the name of the store
             key: the key to check
+
+        Returns:
+            True if the key exists, False otherwise
         """
fiftyone/factory/repos/execution_store.py (1)

228-249: Ensure aggregation pipeline handles large datasets efficiently

The aggregation pipeline in list_stores_global could potentially process a large number of documents. Consider adding indexes on store_name and dataset_id to optimize the grouping operation.

If not already indexed, apply this diff to create indexes:

 def _create_indexes(self):
     indices = [idx["name"] for idx in self._collection.list_indexes()]
     ...
+    # Ensure indexes for efficient aggregation
+    self._collection.create_index([("store_name", 1), ("dataset_id", 1)], name="store_dataset_index")
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ff1e70e and 99ab227.

📒 Files selected for processing (8)
  • fiftyone/factory/repo_factory.py (1 hunks)
  • fiftyone/factory/repos/execution_store.py (4 hunks)
  • fiftyone/operators/store/models.py (2 hunks)
  • fiftyone/operators/store/service.py (4 hunks)
  • fiftyone/operators/store/store.py (1 hunks)
  • tests/unittests/decorators.py (2 hunks)
  • tests/unittests/execution_store_service_tests.py (1 hunks)
  • tests/unittests/execution_store_unit_tests.py (0 hunks)
💤 Files with no reviewable changes (1)
  • tests/unittests/execution_store_unit_tests.py
🧰 Additional context used
📓 Learnings (1)
fiftyone/factory/repo_factory.py (1)
Learnt from: brimoor
PR: voxel51/fiftyone#5086
File: fiftyone/factory/repos/execution_store.py:141-143
Timestamp: 2024-11-12T12:36:39.324Z
Learning: In methods like `get_key` in `ExecutionStoreRepo` (file `fiftyone/factory/repos/execution_store.py`), when `self._dataset_id` is `None`, it's important to include `{"dataset_id": None}` in the MongoDB query, as omitting `dataset_id` is not equivalent to including it with `None`.
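This learning hinges on MongoDB's equality semantics for null: a query containing `{"dataset_id": None}` matches only documents whose `dataset_id` is null (or missing), whereas omitting the field matches documents with any `dataset_id`. A pure-Python sketch of that distinction (the in-memory matcher below is illustrative, not FiftyOne or pymongo code):

```python
def matches(doc, query):
    # Mimics MongoDB equality semantics for None: {"field": None} matches
    # docs where the field is null OR missing; omitting the field from the
    # query places no constraint on it at all
    for field, expected in query.items():
        if expected is None:
            if doc.get(field) is not None:
                return False
        elif doc.get(field) != expected:
            return False
    return True

docs = [
    {"store_name": "s", "key": "k", "dataset_id": None},      # global store
    {"store_name": "s", "key": "k", "dataset_id": "abc123"},  # dataset-scoped
]

# Omitting dataset_id matches both documents
both = sum(matches(d, {"store_name": "s", "key": "k"}) for d in docs)
# Including dataset_id=None restricts the match to the global document
global_only = sum(
    matches(d, {"store_name": "s", "key": "k", "dataset_id": None}) for d in docs
)
assert both == 2
assert global_only == 1
```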
🔇 Additional comments (10)
fiftyone/operators/store/models.py (1)

44-44: LGTM!

The added empty line improves code readability by properly separating methods, following PEP 8 guidelines.

fiftyone/factory/repo_factory.py (1)

60-72: Implementation looks good!

The changes effectively fix the bug with dataset_ids in cached repos while maintaining clean and maintainable code. The dynamic key generation and repository caching logic are well-implemented.

tests/unittests/decorators.py (2)

11-11: LGTM: Clean and purposeful imports

The new imports are well-chosen and properly used in the new decorator implementation.

Also applies to: 14-14


48-62: Verify decorator usage patterns

Let's verify how this decorator integrates with existing test patterns.

✅ Verification successful

Decorator usage patterns verified

No decorator order conflicts detected. The @drop_* decorators are consistently integrated with the test suite.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check usage patterns of test decorators and potential conflicts

# Find similar decorators in test files
echo "=== Similar decorators ==="
rg -l "def.*drop_.*\(.*\):" tests/

# Find potential usage patterns
echo -e "\n=== Potential usage patterns ==="
rg "@drop_" tests/

# Check for potential decorator order conflicts
echo -e "\n=== Potential decorator conflicts ==="
rg "^@.*\n@drop_|^@drop_\n@.*" tests/

Length of output: 23626


Script:

#!/bin/bash
# Description: Check for potential decorator order conflicts with multiline support

# Check for potential decorator order conflicts with multiline regex
echo "=== Potential decorator conflicts ==="
rg -U "^@.*\n@drop_|^@drop_\n@.*" tests/

Length of output: 1387

fiftyone/operators/store/store.py (1)

113-116: LGTM! Improved error handling and code reuse

The updated implementation is more robust and consistent with the rest of the class by:

  1. Reusing the get_key method, following DRY principles
  2. Providing better error handling by explicitly checking for non-existent keys
  3. Maintaining consistency with other methods like get()
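The pattern the points above describe can be sketched minimally as follows; the class and attribute names here are illustrative stand-ins, not the actual FiftyOne implementation:

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class KeyDoc:
    """Stand-in for a stored key document."""
    value: Any
    ttl: Optional[int] = None


class StoreSketch:
    def __init__(self):
        self._docs = {}  # key -> KeyDoc

    def set(self, key, value, ttl=None):
        self._docs[key] = KeyDoc(value, ttl)

    def get_key(self, key) -> Optional[KeyDoc]:
        return self._docs.get(key)  # None when the key does not exist

    def get_ttl(self, key) -> Optional[int]:
        # Reuse get_key and explicitly handle the missing-key case,
        # rather than assuming the document exists
        key_doc = self.get_key(key)
        if key_doc is None:
            return None
        return key_doc.ttl


store = StoreSketch()
store.set("a", 1, ttl=60)
assert store.get_ttl("a") == 60
assert store.get_ttl("missing") is None
```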
tests/unittests/execution_store_service_tests.py (1)

29-181: LGTM! Comprehensive test coverage

The test suite provides excellent coverage of ExecutionStoreService functionality, including:

  • Store creation and management
  • Key-value operations
  • Dataset scoping
  • Metadata handling
  • Error cases

The tests are well-structured and use appropriate fixtures and cleanup decorators.

fiftyone/operators/store/service.py (3)

50-50: LGTM: Proper initialization of dataset context

The addition of _dataset_id instance variable correctly maintains the dataset context as described in the class documentation.


66-75: LGTM: Well-documented store retrieval method

The new get_store method is properly implemented with complete documentation and correct typing.


234-244: LGTM: Well-implemented global store deletion

The new delete_store_global method provides a clean way to manage stores across all contexts with proper documentation and typing.

fiftyone/factory/repos/execution_store.py (1)

60-62: Initialize StoreDocument with potential missing fields

When raw_store_doc is None and self.has_store(store_name) returns True, the newly created StoreDocument may be missing expected fields like value. Ensure that all necessary fields are properly initialized to avoid potential attribute errors downstream.

Check whether StoreDocument initialization covers all required fields:

Comment on lines 60 to 67
repo_key = (
f"{MongoExecutionStoreRepo.COLLECTION_NAME}_{dataset_id}"
if dataset_id
else MongoExecutionStoreRepo.COLLECTION_NAME
)

💡 Codebase verification

Dataset ID Formatting Inconsistency Detected

The tests convert dataset_id to string, whereas the repository expects an ObjectId. This inconsistency may lead to issues with key generation and data retrieval.
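To illustrate why the mismatch matters: a query built with `str(dataset_id)` will never match documents stored with the raw ObjectId, because the two types do not compare equal. A self-contained sketch using a stand-in class (the real code presumably uses `bson.ObjectId`; this stand-in only mimics its equality behavior):

```python
class FakeObjectId:
    """Stand-in for bson.ObjectId, mimicking only its equality behavior."""

    def __init__(self, hex_str):
        self.hex = hex_str

    def __eq__(self, other):
        # Equal only to another ObjectId with the same bytes, never to a str
        return isinstance(other, FakeObjectId) and self.hex == other.hex

    def __hash__(self):
        return hash(self.hex)

    def __str__(self):
        return self.hex


oid = FakeObjectId("507f1f77bcf86cd799439011")
stored = {"dataset_id": oid}

# Querying with the string form does not match the stored ObjectId
assert stored["dataset_id"] != str(oid)
# Querying with the ObjectId form does
assert stored["dataset_id"] == FakeObjectId("507f1f77bcf86cd799439011")
```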

🔗 Analysis chain

Verify dataset_id format for key generation

The dynamic key generation looks good, but we should ensure that the dataset_id string representation doesn't contain characters that could create invalid or conflicting keys.

Let's check the dataset_id format in the codebase:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check dataset_id format usage across the codebase
# Expected: dataset_id should be ObjectId which ensures safe string representation

# Check for dataset_id type hints and usage
ast-grep --pattern 'dataset_id: ObjectId'

# Check for dataset_id string operations
rg -A 2 'dataset_id.*str\(' 

Length of output: 5901


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
tests/unittests/execution_store_service_tests.py (1)

247-247: Remove unused variable KEY_A

The variable KEY_A is declared but never used in the test_cleanup function.

-    KEY_A = "key_a"
🧰 Tools
🪛 Ruff

247-247: Local variable KEY_A is assigned to but never used

Remove assignment to unused variable KEY_A

(F841)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 99ab227 and 128e1dc.

📒 Files selected for processing (1)
  • tests/unittests/execution_store_service_tests.py (1 hunks)
🧰 Additional context used
🪛 Ruff
tests/unittests/execution_store_service_tests.py

247-247: Local variable KEY_A is assigned to but never used

Remove assignment to unused variable KEY_A

(F841)

🔇 Additional comments (3)
tests/unittests/execution_store_service_tests.py (3)

1-16: LGTM! Well-structured imports and header

The file header is properly documented with copyright notice, and all necessary imports are included.


17-27: LGTM! Well-designed test fixtures

The fixtures provide good test isolation with both dataset-less and dataset-bound service instances. The cleanup is handled by the @drop_datasets decorator.


29-240: LGTM! Comprehensive test coverage with clear assertions

The test suite provides excellent coverage of ExecutionStoreService functionality including:

  • Store lifecycle management
  • Key operations
  • Scoping behavior
  • TTL handling
  • Edge cases

The tests follow good practices with clear arrange-act-assert patterns and descriptive assertions.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 128e1dc and cfef2bb.

📒 Files selected for processing (3)
  • fiftyone/factory/repos/execution_store.py (5 hunks)
  • fiftyone/operators/store/models.py (2 hunks)
  • tests/unittests/execution_store_service_tests.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • fiftyone/operators/store/models.py
  • tests/unittests/execution_store_service_tests.py
🔇 Additional comments (6)
fiftyone/factory/repos/execution_store.py (6)

89-107: LGTM! Efficient implementation using aggregation pipeline

The change from a simple count to an aggregation pipeline is a good optimization, especially for large datasets. The implementation correctly handles empty results and properly groups by both store name and dataset ID.
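What the group-then-count aggregation computes can be expressed as a pure-Python equivalent over key documents; the document shapes below are illustrative:

```python
def count_stores(docs):
    # Equivalent of a $group by (store_name, dataset_id) followed by $count:
    # each distinct (store_name, dataset_id) pair counts as one store
    return len({(d["store_name"], d.get("dataset_id")) for d in docs})


docs = [
    {"store_name": "s1", "dataset_id": None, "key": "k1"},
    {"store_name": "s1", "dataset_id": None, "key": "k2"},   # same store
    {"store_name": "s1", "dataset_id": "ds1", "key": "k1"},  # dataset-scoped copy
    {"store_name": "s2", "dataset_id": None, "key": "k1"},
]
assert count_stores(docs) == 3
```

A naive document count would return 4 here, which is why grouping by both fields matters.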


170-175: LGTM! Clean and efficient implementation

The method is well-implemented with proper boolean conversion and efficient database query.


241-262: LGTM! Well-structured aggregation pipeline

The change to return StoreDocument instances instead of strings provides more functionality while maintaining efficiency through the aggregation pipeline.


265-279: LGTM! Consistent and efficient implementation

The implementation maintains consistency with count_stores while properly handling global scope and empty results.


281-286: LGTM! Clean implementation of global store deletion

The method correctly implements global store deletion without dataset filtering and provides meaningful feedback through the deletion count.


50-65: 🛠️ Refactor suggestion

Add Optional type hint for nullable return value

The method can return None, but the return type annotation doesn't reflect this. Add Optional type hint for better type safety.

Apply this diff:

+from typing import Any, Dict, Optional

-    def get_store(self, store_name) -> StoreDocument:
+    def get_store(self, store_name) -> Optional[StoreDocument]:

Likely invalid or redundant comment.

@ritch ritch changed the title ExecutionStore + MongoDB integration tests ExecutionStore + MongoDB integration tests and fixes Nov 12, 2024
@ritch ritch requested review from brimoor and Br2850 November 12, 2024 21:32
@@ -34,23 +35,42 @@ def __init__(self, collection: Collection, dataset_id: ObjectId = None):
         self._collection = collection
         self._dataset_id = dataset_id

-    def create_store(self, store_name) -> StoreDocument:
+    def create_store(

I saw a PR where @imanjra plans to change this to def store() so just a heads up in case a conflict arises


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (7)
tests/unittests/decorators.py (2)

48-49: Enhance docstring with usage examples

The docstring would be more helpful with examples showing pattern usage:

-    """Decorator that drops all stores from the database before running a test."""
+    """Decorator that drops all stores from the database before running a test.
+    
+    Args:
+        func: The test function to wrap
+        pattern (str, optional): Unix shell-style wildcard pattern for matching store names.
+            Defaults to "*" which matches all stores.
+    
+    Example:
+        @drop_stores
+        def test_basic():
+            # All stores are dropped before this test
+            pass
+            
+        @drop_stores(pattern="test_*")
+        def test_specific():
+            # Only stores starting with "test_" are dropped
+            pass
+    """

51-64: Add debug logging for better test observability

Consider adding debug logging to help diagnose test cleanup issues:

     @wraps(func)
     def wrapper(*args, **kwargs):
         svc = foos.ExecutionStoreService()
+        logger = fo.get_logger()
+        logger.debug("Dropping stores matching pattern: %s", pattern)
         stores = svc.list_stores_global()
         for store in stores:
             store_name = store.store_name
             if fnmatch.fnmatch(store_name, pattern):
+                logger.debug("Dropping store: %s", store_name)
                 try:
                     svc.delete_store_global(store_name)
                 except Exception as e:
+                    logger.error("Failed to drop store: %s", store_name, exc_info=True)
                     raise RuntimeError(
                         f"Failed to delete store '{store_name}'"
                     ) from e
fiftyone/factory/repos/execution_store.py (5)

38-45: Add docstring for the metadata parameter

The metadata parameter should be documented to explain its purpose and expected format.

 def create_store(
     self, store_name, metadata: Dict[str, Any] = None
 ) -> StoreDocument:
-    """Creates a store associated with the current context."""
+    """Creates a store associated with the current context.
+    
+    Args:
+        store_name: Name of the store to create
+        metadata: Optional metadata to associate with the store
+    
+    Returns:
+        StoreDocument: The created store document
+    """
🧰 Tools
🪛 Ruff

39-39: Undefined name Dict

(F821)


50-65: Consider simplifying store retrieval logic

Based on the previous comment, the key="__store__" check in the query might be unnecessary since a store can exist with any key. Consider simplifying the retrieval logic.

 def get_store(self, store_name) -> StoreDocument:
     """Gets a store associated with the current context."""
     raw_store_doc = self._collection.find_one(
         dict(
             store_name=store_name,
-            key="__store__",
             dataset_id=self._dataset_id,
         )
     )

89-107: Add error handling for aggregation pipeline

The aggregation pipeline could fail due to various MongoDB errors. Consider adding error handling.

-    result = list(self._collection.aggregate(pipeline))
-    return result[0]["total_stores"] if result else 0
+    try:
+        result = list(self._collection.aggregate(pipeline))
+        return result[0]["total_stores"] if result else 0
+    except Exception as e:
+        logger.error(f"Failed to count stores: {e}")
+        return 0

170-175: Optimize key existence check

The current implementation retrieves the entire document. Consider using projection to improve performance.

 def has_key(self, store_name, key) -> bool:
     """Determines whether a key exists in the specified store."""
     result = self._collection.find_one(
-        dict(store_name=store_name, key=key, dataset_id=self._dataset_id)
+        dict(store_name=store_name, key=key, dataset_id=self._dataset_id),
+        projection={"_id": 1}
     )
     return bool(result)

241-262: Consider adding pagination to global store listing

For large datasets, retrieving all stores at once could be memory-intensive. Consider adding pagination support.

Example implementation:

def list_stores_global(self, skip: int = 0, limit: int = 100) -> list[StoreDocument]:
    """Lists stores across all datasets and the global context.
    
    Args:
        skip: Number of documents to skip
        limit: Maximum number of documents to return
    """
    pipeline = [
        # ... existing pipeline stages ...
        {"$skip": skip},
        {"$limit": limit}
    ]
    result = self._collection.aggregate(pipeline)
    return [StoreDocument(**d) for d in result]
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between cfef2bb and 0ddaf1b.

📒 Files selected for processing (2)
  • fiftyone/factory/repos/execution_store.py (5 hunks)
  • tests/unittests/decorators.py (2 hunks)
🧰 Additional context used
🪛 Ruff
fiftyone/factory/repos/execution_store.py

39-39: Undefined name Dict

(F821)

🔇 Additional comments (2)
tests/unittests/decorators.py (1)

11-11: LGTM: Import statements are well-organized

The new imports are properly grouped (standard library followed by local imports) and are necessary for the new functionality.

Also applies to: 14-14

fiftyone/factory/repos/execution_store.py (1)

13-13: ⚠️ Potential issue

Add missing Dict import from typing module

The type annotation Dict is used but not imported. Add it to the import statement.

-from typing import Any
+from typing import Any, Dict

Likely invalid or redundant comment.

     self._repo: ExecutionStoreRepo = repo

-    def create_store(self, store_name: str) -> StoreDocument:
+    def create_store(
@Br2850 Br2850 (Contributor) commented Nov 12, 2024

Same create_store comment as before, I don't care which method name is used - I added a try/except for both just in case in DQ panel

Br2850 previously approved these changes Nov 12, 2024
@Br2850 Br2850 (Contributor) left a comment

LGTM 👍🏽 only noted the naming of create_store

@brimoor brimoor (Contributor) left a comment

Thanks for taking this on 💪

I think there's an important change needed to @drop_stores before this can be merged!

tests/unittests/execution_store_service_tests.py (outdated review thread, resolved)
tests/unittests/decorators.py (outdated review thread, resolved)
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (6)
tests/unittests/decorators.py (1)

51-66: Add error handling and logging to the collection drop operations.

The decorator correctly uses a finally block to ensure cleanup, but it should:

  1. Handle errors during collection drops
  2. Log collection operations for debugging

Consider this enhanced implementation:

 def drop_collection(collection_name):
     """Decorator that drops a collection from the database before and after running a test."""
+    import logging
+    logger = logging.getLogger(__name__)
 
     def decorator(func):
         @wraps(func)
         def wrapper(*args, **kwargs):
             db = foo.get_db_conn()
-            db.drop_collection(collection_name)
+            try:
+                db.drop_collection(collection_name)
+                logger.debug(f"Dropped collection '{collection_name}' before test")
+            except Exception as e:
+                logger.warning(f"Failed to drop collection '{collection_name}': {e}")
             try:
                 return func(*args, **kwargs)
             finally:
-                db.drop_collection(collection_name)
+                try:
+                    db.drop_collection(collection_name)
+                    logger.debug(f"Dropped collection '{collection_name}' after test")
+                except Exception as e:
+                    logger.warning(f"Failed to drop collection '{collection_name}': {e}")
 
         return wrapper
 
     return decorator
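A self-contained sketch of the decorator pattern suggested above, exercised against a stub database. The `get_db` parameter is a hypothetical stand-in for `foo.get_db_conn` so the example runs without FiftyOne; the real decorator calls the connection helper directly:

```python
import logging
from functools import wraps

logger = logging.getLogger(__name__)

def drop_collection(collection_name, get_db):
    """Drop ``collection_name`` before and after the wrapped test,
    swallowing and logging drop failures so cleanup never masks
    the test's own result."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            db = get_db()
            try:
                db.drop_collection(collection_name)
            except Exception as e:
                logger.warning("pre-test drop of %r failed: %s", collection_name, e)
            try:
                return func(*args, **kwargs)
            finally:
                try:
                    db.drop_collection(collection_name)
                except Exception as e:
                    logger.warning("post-test drop of %r failed: %s", collection_name, e)
        return wrapper
    return decorator

# Stub database that records every drop call
class FakeDB:
    def __init__(self):
        self.drops = []
    def drop_collection(self, name):
        self.drops.append(name)

db = FakeDB()

@drop_collection("test_collection", lambda: db)
def run_test():
    return "ok"

result = run_test()
```

Note that `@wraps(func)` preserves the test's `__name__`, which matters for test-runner discovery and reporting.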
fiftyone/operators/store/service.py (2)

Line range hint 55-67: Document the metadata parameter in the docstring

The docstring is missing documentation for the new metadata parameter.

Add the metadata parameter to the docstring:

 def create_store(
     self, store_name: str, metadata: Optional[dict[str, Any]] = None
 ) -> StoreDocument:
     """Creates a new store with the specified name.

     Args:
         store_name: the name of the store
+        metadata: optional metadata to associate with the store

134-141: Complete the docstring for has_key method

The docstring is missing the Returns section and return type documentation.

Update the docstring:

 def has_key(self, store_name: str, key: str) -> bool:
     """Determines whether the specified key exists in the specified store.

     Args:
         store_name: the name of the store
         key: the key to check
+
+    Returns:
+        True if the key exists, False otherwise
     """
fiftyone/factory/repos/execution_store.py (1)

241-262: Consider adding error handling for aggregation failures

The implementation looks good, but consider adding error handling for potential aggregation failures, especially since this is a global operation that could process large amounts of data.

     def list_stores_global(self) -> list[StoreDocument]:
         """Lists stores across all datasets and the global context."""
-        pipeline = [
-            {
-                "$group": {
-                    "_id": {
-                        "store_name": "$store_name",
-                        "dataset_id": "$dataset_id",
-                    }
-                }
-            },
-            {
-                "$project": {
-                    "_id": 0,
-                    "store_name": "$_id.store_name",
-                    "dataset_id": "$_id.dataset_id",
-                }
-            },
-        ]
-
-        result = self._collection.aggregate(pipeline)
-        return [StoreDocument(**d) for d in result]
+        try:
+            pipeline = [
+                {
+                    "$group": {
+                        "_id": {
+                            "store_name": "$store_name",
+                            "dataset_id": "$dataset_id",
+                        }
+                    }
+                },
+                {
+                    "$project": {
+                        "_id": 0,
+                        "store_name": "$_id.store_name",
+                        "dataset_id": "$_id.dataset_id",
+                    }
+                },
+            ]
+
+            result = self._collection.aggregate(pipeline)
+            return [StoreDocument(**d) for d in result]
+        except Exception as e:
+            logger.error(f"Failed to list global stores: {e}")
+            return []
fiftyone/operators/types.py (2)

1348-1348: Clarify the operator parameter description

In the Button class, the description for the operator parameter mentions "the uri of the operator to execute when the button is clicked." If the operator parameter accepts both URIs and callables, it's helpful to clarify this in the description to avoid confusion.

Consider updating the parameter description as follows:

-    operator (None): the uri of the operator to execute when the button is clicked
+    operator (None): the operator to execute when the button is clicked, specified by its URI or as a callable

1373-1414: Consider inheriting from Button to promote code reuse

The OperatorExecutionButtonView class shares similar properties and functionality with the Button class. To enhance maintainability and reduce code duplication, consider having OperatorExecutionButtonView inherit from Button instead of directly from View.

Modify the class definition as follows:

-class OperatorExecutionButtonView(View):
+class OperatorExecutionButtonView(Button):

This change allows OperatorExecutionButtonView to inherit existing attributes like operator, params, and prompt, and focus on extending or customizing the behavior specific to this class.

🧰 Tools
🪛 Ruff

1401-1401: Use kwargs.get("operator") instead of kwargs.get("operator", None)

Replace kwargs.get("operator", None) with kwargs.get("operator")

(SIM910)


1403-1403: Use kwargs.get("params") instead of kwargs.get("params", None)

Replace kwargs.get("params", None) with kwargs.get("params")

(SIM910)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 0ddaf1b and ca7c135.

📒 Files selected for processing (6)
  • fiftyone/factory/repo_factory.py (1 hunks)
  • fiftyone/factory/repos/execution_store.py (5 hunks)
  • fiftyone/operators/store/service.py (5 hunks)
  • fiftyone/operators/types.py (2 hunks)
  • tests/unittests/decorators.py (2 hunks)
  • tests/unittests/execution_store_service_tests.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unittests/execution_store_service_tests.py
🧰 Additional context used
🪛 Ruff
fiftyone/operators/types.py

1401-1401: Use kwargs.get("operator") instead of kwargs.get("operator", None)

Replace kwargs.get("operator", None) with kwargs.get("operator")

(SIM910)


1403-1403: Use kwargs.get("params") instead of kwargs.get("params", None)

Replace kwargs.get("params", None) with kwargs.get("params")

(SIM910)

tests/unittests/decorators.py

11-11: fnmatch imported but unused

Remove unused import: fnmatch

(F401)


48-48: Redefinition of unused wraps from line 8

Remove definition: wraps

(F811)

🔇 Additional comments (10)
fiftyone/factory/repo_factory.py (2)

59-63: LGTM: Dynamic collection name handling looks good

The addition of the optional collection_name parameter and its handling provides good flexibility while maintaining backward compatibility through the default fallback to MongoExecutionStoreRepo.COLLECTION_NAME.


64-67: Removal of repo caching fixes the global state issue

The removal of repo caching and direct instantiation of MongoExecutionStoreRepo is a good change as it fixes the bug where repos were incorrectly considered global in the cache, leading to incorrect dataset_id references.

Let's verify there are no remaining global state issues:

✅ Verification successful

Let me gather more context about the repo factory implementation to ensure there are no remaining global state issues.


Global state is properly handled after the changes

The removal of repo caching for execution_store_repo is indeed the correct fix. The code shows that while there is still a global cache (repos) being used for delegated_operation_repo, this is safe because that repo doesn't depend on dataset-specific state like dataset_id. The execution_store_repo now correctly creates a new instance for each call, avoiding any global state issues with dataset references.

  • delegated_operation_repo: Uses caching but has no dataset-specific state
  • execution_store_repo: No longer cached, creates fresh instances with correct dataset_id
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for potential global state issues with repos
# Expected: No other global caching of repos with dataset_id

# Check for other repo caches that might have similar issues
rg -A 3 'repos\s*=\s*\{' 

# Check for static repo storage
ast-grep --pattern 'class $_ {
  $$$
  repos = {}
  $$$
}'

Length of output: 315


Script:

#!/bin/bash
# Check the full implementation of repo_factory.py to understand the context
cat fiftyone/factory/repo_factory.py

# Look for any other potential static/class variables that could cause global state
ast-grep --pattern 'class $_ {
  $$$
  $_: Dict = {}
  $$$
}'

# Search for any singleton patterns or global instances
rg -l "singleton|instance\s*=\s*None" 

Length of output: 2473

tests/unittests/decorators.py (1)

51-66: Verify test isolation strategy.

Based on the previous review discussion, this decorator implements the agreed-upon approach of using a temporary collection for test isolation. This is a good solution that prevents tests from affecting persistent user data.

Let's verify that this decorator is being used correctly in the test files:

✅ Verification successful

Let me check the value of TEST_COLLECTION_NAME to ensure it follows test isolation best practices.


Test isolation strategy is properly implemented

The decorator is consistently used with a test-specific collection named "execution_store_test_collection" across all test methods in execution_store_service_tests.py. The collection name clearly indicates its test purpose, and the decorator ensures proper cleanup before and after each test by dropping the collection, maintaining test isolation.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the decorator is used with test-specific collection names
# Expected: Collection names should contain 'test' or be temporary

# Search for usage of the decorator
rg -A 2 '@drop_collection' 

Length of output: 4249


Script:

#!/bin/bash
# Find the definition of TEST_COLLECTION_NAME
rg "TEST_COLLECTION_NAME =" tests/

Length of output: 142

fiftyone/operators/store/service.py (3)

39-39: LGTM: Constructor changes maintain backward compatibility

The addition of the optional collection_name parameter enhances flexibility while maintaining backward compatibility.

Also applies to: 49-50


68-77: LGTM: Well-documented get_store method

The new method provides a clean interface for retrieving store documents with proper documentation and type hints.


236-246: LGTM: Comprehensive global store deletion

The new method properly implements global store deletion with complete documentation and follows the established pattern of global operations.

fiftyone/factory/repos/execution_store.py (4)

13-13: LGTM: Type annotations and metadata handling improvements

The changes properly implement type hints using Dict from typing and correctly handle the optional metadata parameter in create_store.

Also applies to: 38-45


170-175: LGTM: Clean implementation of has_key method

The method efficiently checks for key existence using find_one and properly returns a boolean result.


50-65: 🛠️ Refactor suggestion

Add Optional type annotation for get_store method

The method can return None, but the return type annotation doesn't reflect this. Consider updating the return type to Optional[StoreDocument].

-    def get_store(self, store_name) -> StoreDocument:
+    def get_store(self, store_name) -> Optional[StoreDocument]:

Likely invalid or redundant comment.


89-107: Verify performance impact of aggregation pipeline

The new implementation uses a more complex aggregation pipeline instead of a simple count. While this provides more accurate results by properly grouping stores, it might impact performance for large datasets.

✅ Verification successful

Based on the analysis of the codebase, I can now provide a definitive response about the performance impact of the aggregation pipeline:

Aggregation pipeline change is justified and should not impact performance

The new implementation using aggregation pipeline with $group is actually more accurate as it properly handles duplicate store names within the same dataset. The performance impact should be negligible because:

  1. The collection already has an index on dataset_id as shown by the index creation code in fiftyone/factory/repos/execution_store.py
  2. The aggregation pipeline is simple and only involves a match on an indexed field followed by a group operation
  3. The result set is minimal - just returning a count
  4. The execution store is not expected to have a massive number of records per dataset
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Compare execution time of old vs new implementation
# Note: This is a simplified comparison that doesn't account for all factors

echo "Checking collection size..."
mongo --eval 'db.execution_store.count()'

echo "Checking index usage for the aggregation pipeline..."
mongo --eval 'db.execution_store.aggregate([
  {$match: {dataset_id: null}},
  {$group: {_id: {store_name: "$store_name", dataset_id: "$dataset_id"}}},
  {$count: "total_stores"}
], {explain: true})'

Length of output: 571


Script:

#!/bin/bash
# Let's analyze the codebase to understand the context better
# First, check if there are any existing indexes defined
rg -A 5 "create_index|createIndex|ensure_index|ensureIndex" 

# Look for any performance-related tests or benchmarks
fd -e py -e yaml test
rg -l "performance|benchmark" 

# Check if there are any comments or documentation about performance considerations
rg -i "performance|optimization" 

# Look for similar aggregation patterns in the codebase
ast-grep --pattern 'aggregate($$$)'

Length of output: 109379

tests/unittests/decorators.py (outdated review thread, resolved)
fiftyone/factory/repos/execution_store.py (review thread, resolved)
fiftyone/operators/types.py (two outdated review threads, resolved)
@ritch ritch force-pushed the estore-int-tests branch 3 times, most recently from 4cf20d3 to 1206538 on November 13, 2024 at 17:23
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (2)
tests/unittests/decorators.py (1)

13-13: Consider using a more descriptive alias for the ODM import

The alias 'foo' is not descriptive of the module's purpose. Consider using a more meaningful alias like 'odm' or 'db_odm' to improve code readability.

-import fiftyone.core.odm as foo
+import fiftyone.core.odm as odm
tests/unittests/execution_store_unit_tests.py (1)

Line range hint 347-365: Critical: Fix incorrect test method nesting

The test_create_store_with_dataset_id method is incorrectly nested inside test_delete_key_with_dataset_id. This will prevent the test runner from discovering and executing this test.

Fix the indentation to align with other test methods:

-        def test_create_store_with_dataset_id(self):
-            self.store_service.create_store("widgets")
-            self.mock_collection.insert_one.assert_called_once()
-            self.mock_collection.insert_one.assert_called_with(
-                {
-                    "store_name": "widgets",
-                    "key": "__store__",  # Include this in your expected call
-                    "value": None,
-                    "dataset_id": self.dataset_id,
-                    "created_at": IsDateTime(),
-                    "updated_at": None,
-                    "expires_at": None,
-                }
-            )
+    def test_create_store_with_dataset_id(self):
+        self.store_service.create_store("widgets")
+        self.mock_collection.insert_one.assert_called_once()
+        self.mock_collection.insert_one.assert_called_with(
+            {
+                "store_name": "widgets",
+                "key": "__store__",  # Include this in your expected call
+                "value": None,
+                "dataset_id": self.dataset_id,
+                "created_at": IsDateTime(),
+                "updated_at": None,
+                "expires_at": None,
+            }
+        )
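The practical consequence of the mis-indentation can be demonstrated with `unittest`'s loader: a `def` nested inside another test method is never collected, so the test silently never runs. A minimal sketch (class and method names are illustrative):

```python
import unittest

class ExampleTests(unittest.TestCase):
    def test_outer(self):
        # Nested function: NOT discovered by the loader, mirroring the
        # mis-indented test_create_store_with_dataset_id in the diff above.
        def test_inner_never_discovered(self):
            raise AssertionError("never runs")

loader = unittest.TestLoader()
names = loader.getTestCaseNames(ExampleTests)
```

Only the properly indented method appears in `names`; the nested one vanishes without any warning from the test runner.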
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ca7c135 and 1206538.

📒 Files selected for processing (6)
  • fiftyone/factory/repo_factory.py (1 hunks)
  • fiftyone/factory/repos/execution_store.py (5 hunks)
  • fiftyone/operators/store/service.py (5 hunks)
  • tests/unittests/decorators.py (2 hunks)
  • tests/unittests/execution_store_service_tests.py (1 hunks)
  • tests/unittests/execution_store_unit_tests.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • fiftyone/factory/repo_factory.py
  • fiftyone/operators/store/service.py
  • tests/unittests/execution_store_service_tests.py
🔇 Additional comments (10)
tests/unittests/decorators.py (1)

47-62: Verify decorator behavior with test collections

Let's ensure the decorator works correctly with test collections and properly cleans up.

✅ Verification successful

Decorator implementation is working as expected

The @drop_collection decorator is actively used in execution_store_service_tests.py with proper test collection cleanup. The implementation is verified through:

  • Consistent usage across 17 test methods with TEST_COLLECTION_NAME
  • Proper combination with other decorators (@drop_datasets)
  • Collection cleanup both before and after test execution
  • Clear test collection naming convention (execution_store_test_collection)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for existing test implementations using the decorator

# Look for usage patterns of the decorator
rg -l "drop_collection.*__test__" 

# Check for any remaining test collections after test runs
ast-grep --pattern 'drop_collection($collection)' 

Length of output: 87


Script:

#!/bin/bash
# Search for any usage of drop_collection decorator in test files
rg "@drop_collection" -A 3 tests/

# Search for imports of drop_collection
rg "from.*decorators.*drop_collection" tests/

# Look for any test collection patterns in the codebase
rg "test_collection|collection_name.*test" tests/

# Find all test files that might use database collections
fd "test.*\.py$" tests/ --exec grep -l "collection"

Length of output: 6339

fiftyone/factory/repos/execution_store.py (7)

38-46: LGTM! The metadata parameter is properly integrated

The addition of the optional metadata parameter enhances the store creation functionality while maintaining backward compatibility.


50-66: Well-structured implementation of store retrieval logic

The implementation correctly handles both cases where a store exists with or without the __store__ key, aligning with the updated data model requirements.


89-107: Improved counting logic using aggregation pipeline

The change to use an aggregation pipeline provides a more efficient way to count unique stores while respecting the dataset context.


170-175: LGTM! Clean implementation of key existence check

The method provides a straightforward way to check for key existence while properly handling the dataset context.


241-262: Enhanced global store listing with proper document construction

The change to return StoreDocument instances provides a more structured and type-safe approach while maintaining efficiency through aggregation.


265-279: LGTM! Consistent implementation of global store counting

The method follows the same efficient pattern as other counting methods while properly handling edge cases.


281-286: Consider adding validation before global deletion

Referring to the previous review comment about adding safety checks and audit logging for global store deletion.

tests/unittests/execution_store_unit_tests.py (2)

79-79: LGTM: Consistent dataset_id handling

The addition of dataset_id: None in the test assertion is correct and maintains consistency with the service layer changes.


116-116: LGTM: Proper dataset_id assertion

The addition of dataset_id: None in the test assertion is correct and consistent with the service layer changes.

tests/unittests/decorators.py (review thread, resolved)
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (4)
fiftyone/factory/repos/execution_store.py (2)

89-107: Add error handling for aggregation pipeline

While the aggregation pipeline correctly counts unique stores, consider adding error handling for potential MongoDB errors.

     def count_stores(self) -> int:
         pipeline = [
             {
                 "$match": {
                     "dataset_id": self._dataset_id,
                 }
             },
             {
                 "$group": {
                     "_id": {
                         "store_name": "$store_name",
                         "dataset_id": "$dataset_id",
                     }
                 }
             },
             {"$count": "total_stores"},
         ]
-        result = list(self._collection.aggregate(pipeline))
-        return result[0]["total_stores"] if result else 0
+        try:
+            result = list(self._collection.aggregate(pipeline))
+            return result[0]["total_stores"] if result else 0
+        except Exception as e:
+            logger.error(f"Error counting stores: {e}")
+            return 0
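To clarify what the `$match` → `$group` → `$count` pipeline computes, here is a pure-Python analog over plain dicts (the sample documents are illustrative, not taken from the PR): a "store" is one distinct `(store_name, dataset_id)` pair among the documents that match the dataset filter.

```python
def count_stores(docs, dataset_id):
    """Pure-Python analog of the aggregation pipeline in count_stores()."""
    matched = [d for d in docs if d["dataset_id"] == dataset_id]    # $match
    groups = {(d["store_name"], d["dataset_id"]) for d in matched}  # $group
    return len(groups)                                              # $count

docs = [
    {"store_name": "widgets", "dataset_id": "ds1", "key": "__store__"},
    {"store_name": "widgets", "dataset_id": "ds1", "key": "k1"},  # same store
    {"store_name": "gadgets", "dataset_id": "ds1", "key": "k1"},
    {"store_name": "widgets", "dataset_id": "ds2", "key": "k1"},  # other dataset
]

n = count_stores(docs, "ds1")
```

This is why the pipeline is more accurate than a simple document count: multiple keys in the same store collapse into a single group.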

265-279: Add consistent error handling

For consistency with the suggested changes to count_stores, consider adding similar error handling here.

     def count_stores_global(self) -> int:
         pipeline = [
             {
                 "$group": {
                     "_id": {
                         "store_name": "$store_name",
                         "dataset_id": "$dataset_id",
                     }
                 }
             },
             {"$count": "total_stores"},
         ]
-        result = list(self._collection.aggregate(pipeline))
-        return result[0]["total_stores"] if result else 0
+        try:
+            result = list(self._collection.aggregate(pipeline))
+            return result[0]["total_stores"] if result else 0
+        except Exception as e:
+            logger.error(f"Error counting global stores: {e}")
+            return 0
tests/unittests/execution_store_unit_tests.py (2)

Line range hint 341-357: Fix indentation of test_create_store_with_dataset_id method

The method is incorrectly indented, making it a nested function within test_delete_store_with_dataset_id. This prevents the test from being discovered and executed by the test runner.

Fix the indentation:

-        def test_create_store_with_dataset_id(self):
-            self.store_service.create_store("widgets")
-            self.mock_collection.insert_one.assert_called_once()
-            self.mock_collection.insert_one.assert_called_with(
-                {
-                    "store_name": "widgets",
-                    "key": "__store__",  # Include this in your expected call
-                    "value": None,
-                    "dataset_id": self.dataset_id,
-                    "created_at": IsDateTime(),
-                    "updated_at": None,
-                    "expires_at": None,
-                }
-            )
+    def test_create_store_with_dataset_id(self):
+        self.store_service.create_store("widgets")
+        self.mock_collection.insert_one.assert_called_once()
+        self.mock_collection.insert_one.assert_called_with(
+            {
+                "store_name": "widgets",
+                "key": "__store__",
+                "value": None,
+                "dataset_id": self.dataset_id,
+                "created_at": IsDateTime(),
+                "updated_at": None,
+                "expires_at": None,
+            }
+        )

Line range hint 256-357: Consider adding negative test cases for dataset_id operations

The test suite thoroughly covers the happy path for dataset_id integration, but could benefit from additional test cases:

  1. Attempting to access a store with incorrect dataset_id
  2. Attempting to modify a store with mismatched dataset_id
  3. Edge cases for dataset_id validation

Would you like me to provide example test cases for these scenarios?

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ca7c135 and 1206538.

📒 Files selected for processing (6)
  • fiftyone/factory/repo_factory.py (1 hunks)
  • fiftyone/factory/repos/execution_store.py (5 hunks)
  • fiftyone/operators/store/service.py (5 hunks)
  • tests/unittests/decorators.py (2 hunks)
  • tests/unittests/execution_store_service_tests.py (1 hunks)
  • tests/unittests/execution_store_unit_tests.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • fiftyone/factory/repo_factory.py
  • fiftyone/operators/store/service.py
  • tests/unittests/decorators.py
  • tests/unittests/execution_store_service_tests.py
🔇 Additional comments (6)
fiftyone/factory/repos/execution_store.py (5)

13-13: LGTM: Metadata support added to store creation

The addition of metadata support to create_store is well-implemented and properly typed.

Also applies to: 38-46


50-66: LGTM: Robust store retrieval implementation

The implementation correctly handles the case where a store exists without a __store__ key by creating a new StoreDocument. This aligns with the data model where a store can exist through any key with matching store_name.


170-176: LGTM: Clean key existence check implementation

The has_key implementation is straightforward and consistent with other existence checks in the codebase.


241-262: LGTM: Enhanced store listing with full document return

Good improvement to return complete StoreDocument instances instead of just store names, providing more information to callers.


281-286: LGTM with existing safety recommendations

Implementation looks correct. Note that there's an existing review comment suggesting safety checks for this method.

tests/unittests/execution_store_unit_tests.py (1)

79-79: LGTM: Proper dataset_id integration in test assertions

The addition of dataset_id: None in the $setOnInsert block correctly validates the immutable nature of the dataset_id field.
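The immutability guarantee comes from MongoDB's `$setOnInsert` semantics: those fields are written only when an upsert actually inserts, never on a later update. A minimal in-memory model of that behavior (the `upsert` helper and field names are illustrative):

```python
def upsert(store, key, set_fields, set_on_insert):
    """Model of update_one(..., {"$set": ..., "$setOnInsert": ...}, upsert=True)."""
    doc = store.get(key)
    if doc is None:
        # Insert path: both $set and $setOnInsert fields are applied
        store[key] = {**set_on_insert, **set_fields}
    else:
        # Update path: $setOnInsert fields are ignored
        doc.update(set_fields)
    return store[key]

store = {}
upsert(store, "widgets/k1", {"value": 1}, {"dataset_id": "ds1"})
upsert(store, "widgets/k1", {"value": 2}, {"dataset_id": "OTHER"})
doc = store["widgets/k1"]
```

The second call changes `value` but cannot touch `dataset_id`, which is exactly what the test assertion validates.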

@ritch ritch requested a review from brimoor November 13, 2024 22:04
ritch and others added 8 commits November 13, 2024 15:07
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
brimoor previously approved these changes Nov 14, 2024

@brimoor brimoor (Contributor) left a comment

LGTM

tests/unittests/execution_store_service_tests.py (outdated review thread, resolved)
@ritch ritch merged commit a07b12c into release/v1.1.0 Nov 14, 2024
9 of 11 checks passed
@ritch ritch deleted the estore-int-tests branch November 14, 2024 15:38
3 participants