Use resource dict for internal communication #455

jan-janssen · 2024-10-28T06:44:01Z

Summary by CodeRabbit

New Features
- Streamlined resource management by consolidating multiple parameters into a single resource_dict for executor creation.
- Enhanced flexibility in resource allocation through updated function signatures and improved examples in documentation.
Bug Fixes
- Maintained existing functionality and error handling in the executor's dependency management.
Tests
- Updated test cases to align with the new resource allocation method while ensuring the integrity of the tests remains intact.
Documentation
- Revised README and notebook examples to reflect the new resource management approach and improve user guidance.

for more information, see https://pre-commit.ci

coderabbitai · 2024-10-28T06:44:08Z

Walkthrough

The pull request introduces significant modifications to the Executor class and related functions within the executorlib module. The __new__ method of the Executor class now utilizes a consolidated resource_dict to manage resource-related parameters, replacing multiple individual arguments. The create_executor function similarly adopts this approach, simplifying its signature and enhancing resource management. Corresponding updates are made to the ExecutorWithDependencies class and related test cases to reflect these changes while maintaining existing functionality.

Changes

File	Change Summary
executorlib/init.py	Updated `__new__` method in `Executor` class to use `resource_dict` for resource parameters instead of individual arguments, streamlining method signature.
executorlib/interactive/executor.py	Modified `create_executor` function to accept a single `resource_dict` parameter, replacing multiple parameters. Introduced `default_resource_dict` for default values and updated validation logic.
tests/test_dependencies_executor.py	Updated `test_dependency_steps` method to use `resource_dict` for resource allocation in tests, ensuring existing functionality is validated with the new structure.
tests/test_executor_backend_flux.py	Changed resource specifications in tests to utilize `resource_dict` instead of individual parameters for `Executor` initialization.
tests/test_executor_backend_mpi.py	Updated test cases to replace `cores_per_worker` with `resource_dict`, consolidating resource parameters for better organization.
tests/test_executor_backend_mpi_noblock.py	Modified tests to use `resource_dict` for resource specifications, maintaining error handling logic while streamlining parameter management.
README.md	Updated example code to reflect the new `resource_dict` parameter in the `Executor` instantiation, enhancing clarity in usage documentation.
notebooks/examples.ipynb	Enhanced examples to illustrate the use of `resource_dict` in `Executor`, providing clearer context and explanations for users.

Possibly related PRs

Remove conda support in executorlib #418: The main PR modifies the Executor class to consolidate resource parameters into a resource_dict, which aligns with the changes in this PR that removes conda support, indicating a shift in how resources are managed.
Automatically set hostname_localhost #433: This PR also modifies the Executor class constructor to change the handling of the hostname_localhost parameter, which is relevant as the main PR updates the constructor to include a resource_dict.
Cache: Use explicit arguments for serialize_funct_h5() #448: The changes in this PR involve adding a resource_dict parameter to the serialize_funct_h5 function, which is related to the overall theme of enhancing resource management in the executor framework.
Cache: Implement resource dict for submit() #451: This PR implements a resource_dict for the submit() method in the FileExecutor class, directly connecting to the main PR's focus on consolidating resource parameters into a single dictionary.

🐇 In the land of code, changes do bloom,
A resource_dict now clears the room.
With parameters neat, and logic so bright,
Our executors dance in the soft, moonlight.
Simplified paths, like a rabbit's swift hop,
In the garden of code, we’ll never stop! 🌼

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (2)

tests/test_dependencies_executor.py (1)
102-107: Consider adding test cases for resource_dict edge cases.

The test could be enhanced by adding cases that verify:

Empty resource_dict

Partial resource specifications

Invalid resource values

Example test case:
def test_resource_dict_validation(self):
    with self.assertRaises(ValueError):
        create_executor(
            max_workers=1,
            max_cores=2,
            resource_dict={
                "cores_per_worker": 0,  # Invalid value
                "threads_per_core": -1,  # Invalid value
            },
            backend="local"
        )
executorlib/__init__.py (1)

Line range hint 20-214: Document API changes and provide migration guide.

The consolidation of resource parameters into a dictionary represents a significant change to the public API. While this improves internal communication, it may affect existing code that uses the Executor class.

Consider the following recommendations:

Update the class docstring to reflect the new resource_dict parameter structure

Add a migration guide in the documentation to help users transition from individual parameters to the new dictionary-based approach

Consider adding a deprecation warning when individual parameters are used directly

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 09c15f1 and 10655ef.

📒 Files selected for processing (3)

executorlib/init.py (2 hunks)
executorlib/interactive/executor.py (3 hunks)
tests/test_dependencies_executor.py (1 hunks)

🔇 Additional comments (2)

executorlib/__init__.py (2)
Line range hint 200-214: Verify create_executor implementation.

The resource_dict parameter is now passed to create_executor. Please ensure that the create_executor function has been properly updated to handle this new parameter structure.
#!/bin/bash
# Description: Verify create_executor implementation
# Test: Check if create_executor function has been updated to handle resource_dict

# Search for create_executor function definition and its usage of resource_dict
ast-grep --pattern 'def create_executor($$$) {
  $$$
}'

# Search for any references to the old individual resource parameters
rg 'def create_executor' -A 10 --type python
Line range hint 183-199: Verify ExecutorWithDependencies implementation.

The resource_dict parameter is now passed to ExecutorWithDependencies. Please ensure that the ExecutorWithDependencies class has been properly updated to handle this new parameter structure.

✅ Verification successful

Based on the search results, I can now generate the final response:

ExecutorWithDependencies implementation has been properly updated to handle resource_dict.

The search results show that the ExecutorWithDependencies class inherits from ExecutorBase and uses the resource_dict parameter correctly. The resource dictionary is passed through the inheritance chain and properly handled in the executor implementation. The code shows consistent usage of resource_dict throughout the codebase, including:

Proper initialization in the base executor classes

Consistent parameter passing in the dependency executor

Resource dictionary validation and processing

Correct mapping of old resource parameters to the new resource_dict structure
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify ExecutorWithDependencies implementation
# Test: Check if ExecutorWithDependencies class has been updated to handle resource_dict

# Search for ExecutorWithDependencies class definition and its usage of resource_dict
ast-grep --pattern 'class ExecutorWithDependencies {
  $$$
  def __init__($$$) {
    $$$
  }
  $$$
}'

# Search for any references to the old individual resource parameters
rg -A 5 'cores_per_worker|threads_per_core|gpus_per_worker|cwd|openmpi_oversubscribe|slurm_cmd_args' --type python
Length of output: 377

Script:
#!/bin/bash
# Let's try a different approach to find the ExecutorWithDependencies class and its usage of resource_dict

# Search for ExecutorWithDependencies class definition
rg "class ExecutorWithDependencies" -A 20

# Search for resource_dict in the codebase
rg "resource_dict" -A 5

# Search for the old resource parameters without file type restriction
rg -A 5 'cores_per_worker|threads_per_core|gpus_per_worker|cwd|openmpi_oversubscribe|slurm_cmd_args'
Length of output: 107298

coderabbitai · 2024-10-28T06:47:31Z

tests/test_dependencies_executor.py

+            resource_dict={
+                "cores_per_worker": 1,
+                "threads_per_core": 1,
+                "gpus_per_worker": 0,
+                "openmpi_oversubscribe": False,
+            },


💡 Codebase verification

Migration to resource_dict is incomplete across test files

The codebase shows inconsistent usage of resource parameters. While the reviewed test file correctly uses the consolidated resource_dict, many other test files are still using individual parameters (cores_per_worker, threads_per_core, etc.) directly. This indicates that the migration to resource_dict is not complete.

Key files needing attention:

tests/test_flux_executor.py: Uses direct parameter passing

tests/test_local_executor.py: Mixed usage of old and new patterns

tests/test_shell_executor.py: Still using individual parameters

Multiple test files using execute_parallel_tasks with individual parameters

🔗 Analysis chain

LGTM! Consider enhancing resource allocation verification.

The resource_dict implementation correctly consolidates the resource parameters. The test maintains the existing functionality while adapting to the new API.

Let's verify if other tests in the codebase are consistently using the new resource_dict pattern:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash # Search for any remaining direct usage of the old parameter names rg -l "cores_per_worker|threads_per_core|gpus_per_worker|openmpi_oversubscribe" "tests/"

Length of output: 562

Script:

#!/bin/bash # Let's check the actual usage patterns in these files to determine if they're using resource_dict rg -A 3 -B 3 "cores_per_worker|threads_per_core|gpus_per_worker|openmpi_oversubscribe" "tests/" --no-filename

Length of output: 9537

coderabbitai · 2024-10-28T06:47:31Z

executorlib/__init__.py

+        resource_dict = {
+            "cores": cores_per_worker,
+            "threads_per_core": threads_per_core,
+            "gpus_per_core": int(gpus_per_worker / cores_per_worker),
+            "cwd": cwd,
+            "openmpi_oversubscribe": openmpi_oversubscribe,
+            "slurm_cmd_args": slurm_cmd_args,
+        }


⚠️ Potential issue

Validate GPU allocation and improve naming consistency.

The resource dictionary creation has potential issues:

The GPU division could lead to unexpected results if gpus_per_worker is not evenly divisible by cores_per_worker

Missing validation for cores_per_worker being non-zero

Inconsistent parameter naming between input and dictionary ("cores" vs "cores_per_worker")

Consider applying these improvements:

resource_dict = { - "cores": cores_per_worker, + "cores_per_worker": cores_per_worker, "threads_per_core": threads_per_core, - "gpus_per_core": int(gpus_per_worker / cores_per_worker), + "gpus_per_core": (gpus_per_worker // cores_per_worker) if cores_per_worker > 0 else 0, "cwd": cwd, "openmpi_oversubscribe": openmpi_oversubscribe, "slurm_cmd_args": slurm_cmd_args, } +if cores_per_worker > 0 and gpus_per_worker % cores_per_worker != 0: + raise ValueError(f"gpus_per_worker ({gpus_per_worker}) must be evenly divisible by cores_per_worker ({cores_per_worker})")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

resource_dict = {

"cores": cores_per_worker,

"threads_per_core": threads_per_core,

"gpus_per_core": int(gpus_per_worker / cores_per_worker),

"cwd": cwd,

"openmpi_oversubscribe": openmpi_oversubscribe,

"slurm_cmd_args": slurm_cmd_args,

}

resource_dict = {

"cores_per_worker": cores_per_worker,

"threads_per_core": threads_per_core,

"gpus_per_core": (gpus_per_worker // cores_per_worker) if cores_per_worker > 0 else 0,

"cwd": cwd,

"openmpi_oversubscribe": openmpi_oversubscribe,

"slurm_cmd_args": slurm_cmd_args,

}

if cores_per_worker > 0 and gpus_per_worker % cores_per_worker != 0:

raise ValueError(f"gpus_per_worker ({gpus_per_worker}) must be evenly divisible by cores_per_worker ({cores_per_worker})")

coderabbitai · 2024-10-28T06:47:31Z

executorlib/interactive/executor.py

+        resource_dict (dict): A dictionary of resources required by the task. With the following keys:
+                              - cores_per_worker (int): number of MPI cores to be used for each function call
+                              - threads_per_core (int): number of OpenMP threads to be used for each function call
+                              - gpus_per_worker (int): number of GPUs per worker - defaults to 0
+                              - cwd (str/None): current working directory where the parallel python task is executed
+                              - openmpi_oversubscribe (bool): adds the `--oversubscribe` command line flag (OpenMPI and
+                                                              SLURM only) - default False
+                              - slurm_cmd_args (list): Additional command line arguments for the srun call (SLURM only)


⚠️ Potential issue

Inconsistent key names in resource_dict

The keys used in resource_dict are inconsistent between the documentation, default values, and usage in the code:

In the docstring (lines 177-184), keys like cores_per_worker and gpus_per_worker are documented.

In default_resource_dict (lines 205-211), the keys are cores and gpus_per_core.

In usage (lines 218-219, 263-264), keys like cores_per_worker and gpus_per_worker are expected.

This inconsistency can lead to confusion and potential bugs since the keys may not match as intended.

To resolve this inconsistency, align the key names across the docstring, default values, and code usage. Update default_resource_dict and related code to use consistent key names:

default_resource_dict = { - "cores": 1, + "cores_per_worker": 1, "threads_per_core": 1, - "gpus_per_core": 0, + "gpus_per_worker": 0, "cwd": None, "openmpi_oversubscribe": False, "slurm_cmd_args": [], }

Update the usage in the code accordingly:

-cores_per_worker = resource_dict["cores"] +cores_per_worker = resource_dict["cores_per_worker"] ... -check_gpus_per_worker(gpus_per_worker=resource_dict["gpus_per_worker"]) +check_gpus_per_worker(gpus_per_worker=resource_dict["gpus_per_worker"])

Ensure that all references to these keys are updated for consistency.

Also applies to: 205-211, 218-219, 263-264

…ernal_resource_dict

review-notebook-app · 2024-10-28T07:44:44Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

for more information, see https://pre-commit.ci

coderabbitai

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (10)

tests/test_executor_backend_mpi_noblock.py (2)
Line range hint 8-10: Rename the helper function to avoid parameter shadowing.

The helper function resource_dict has the same name as its parameter, which could lead to confusion and potential issues. Consider renaming it to be more descriptive of its purpose.
-def resource_dict(resource_dict):
+def validate_resource_dict(resource_dict):
     return resource_dict
Line range hint 42-77: Enhance test coverage for resource dictionary validation.

The current tests only cover basic error cases. Consider adding tests for:

Invalid resource values (negative or zero)

Missing required keys in the resource_dict

Conflicting resource specifications

Edge cases with maximum allowed values

Would you like me to help generate additional test cases to improve coverage?
tests/test_executor_backend_flux.py (1)

Line range hint 63-106: Consider adding documentation for resource_dict structure

While the migration to resource_dict is implemented consistently across all test cases, it would be beneficial to add documentation that describes:

The available keys in the resource_dict

The expected value types and ranges

The relationship between different resource specifications (if any)

This would help users understand the new API better and ensure correct usage.

tests/test_executor_backend_mpi.py (1)

74-82: Consider adding more comprehensive resource dictionary test cases.

The test suite would benefit from additional test cases covering:

Empty resource dictionary

Resource dictionary with unknown keys

Resource dictionary with invalid value types

Resource dictionary with negative or zero values

Complex combinations of resource parameters

This would help ensure robust validation of the new resource management system.

Would you like me to help generate these additional test cases?
README.md (2)
46-46: Add migration guidance for the new resource_dict parameter.

The change from individual parameters to a consolidated resource_dict represents a significant API change. Consider adding a section that:

Explains the motivation behind using resource_dict

Lists all supported resource types

Provides migration examples from old to new format

Example addition:
### Resource Dictionary
The `resource_dict` parameter provides a flexible way to specify various computing resources:
- `cores`: Number of CPU cores per worker
- `threads`: Number of threads per core
- `gpus`: Number of GPUs per worker

Previous API:
```python
with Executor(cores_per_worker=2, threads_per_core=4) as exe:
    ...
New API:
with Executor(resource_dict={"cores": 2, "threads": 4}) as exe:
    ...
---

`46-46`: **Consider expanding the example to showcase resource_dict capabilities.**

The current example only demonstrates setting cores. Consider expanding it to show:
1. Multiple resource types (cores, threads, GPUs)
2. Resource combinations
3. Common use cases


Example addition:
```python
# Example with multiple resource types
resource_dict = {
    "cores": 2,
    "threads": 4,
    "gpus": 1
}
with Executor(max_cores=2, executor=flux_exe, resource_dict=resource_dict) as exe:
    fs = exe.submit(calc, 3)
    print(fs.result())
executorlib/__init__.py (1)
39-46: Enhance resource_dict documentation with defaults and update example.

While the documentation clearly lists all available keys, it would be helpful to:

Document the default values for each parameter

Update the example in the class docstring to demonstrate the new resource_dict usage pattern

Consider updating the example to:
>>> with Executor(resource_dict={
>>>     "cores_per_worker": 2,
>>>     "threads_per_core": 1
>>> }, init_function=init_k) as p:
>>>     fs = p.submit(calc, 2, j=4)
>>>     print(fs.result())
notebooks/examples.ipynb (3)
216-223: Consider enhancing resource dictionary documentation with type hints.

The resource dictionary implementation effectively consolidates resource parameters. Consider adding type hints in the documentation for each parameter to make it clearer what values are expected.

Example documentation format:
resource_dict = {
    "cores": int,              # Number of CPU cores
    "threads_per_core": int,   # Number of threads per CPU core
    "gpus_per_core": int,     # Number of GPUs per core
    "cwd": str,               # Working directory path
    "openmpi_oversubscribe": bool,  # Enable/disable OpenMPI oversubscription
    "slurm_cmd_args": list,   # Additional SLURM command arguments
}
476-482: Consider adding more comprehensive resource configuration examples.

While the current example effectively demonstrates basic resource configuration, consider adding examples that show:

Combined CPU and GPU configurations

Different thread-per-core scenarios

Resource oversubscription cases

Example additions:
# CPU + GPU configuration
resource_dict={
    "cores": 2,
    "gpus_per_core": 1,
    "threads_per_core": 2
}

# Oversubscription example
resource_dict={
    "cores": 4,
    "threads_per_core": 2,
    "openmpi_oversubscribe": True
}
Line range hint 1-763: Consider adding a troubleshooting section to the documentation.

The documentation is comprehensive and well-structured. Consider adding a troubleshooting section that covers:

Common error scenarios and their solutions

Resource configuration pitfalls

Platform-specific issues (Linux, MacOS, Windows)

Performance optimization tips

This would help users quickly resolve common issues and optimize their resource usage.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 9988992 and 5b23f7e.

📒 Files selected for processing (8)

README.md (1 hunks)
executorlib/init.py (6 hunks)
executorlib/interactive/executor.py (3 hunks)
notebooks/examples.ipynb (26 hunks)
tests/test_dependencies_executor.py (1 hunks)
tests/test_executor_backend_flux.py (4 hunks)
tests/test_executor_backend_mpi.py (3 hunks)
tests/test_executor_backend_mpi_noblock.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

tests/test_dependencies_executor.py

🔇 Additional comments (9)

tests/test_executor_backend_mpi_noblock.py (1)
48-51: LGTM! The resource dictionary implementation maintains error handling.

The test cases correctly verify TypeError exceptions for invalid resource configurations using the new resource_dict parameter structure.

Let's verify the error handling implementation in the Executor class:

Also applies to: 57-60

✅ Verification successful

These scripts will help us:

Find the Executor class implementation

Understand how resource_dict is used across the codebase

Locate all test files related to Executor to verify test coverage

Based on the search results, I can now provide a final response. The codebase shows comprehensive implementation and testing of the resource_dict parameter:

The resource dictionary implementation is correctly tested and implemented

The test cases in test_executor_backend_mpi_noblock.py are part of a well-structured test suite that verifies:

Resource dictionary validation in executorlib/standalone/inputcheck.py

Default values and parameter merging in executorlib/__init__.py

Error handling across different executor backends (MPI, Flux, SLURM)

Consistent resource specification format across the codebase

The test coverage is adequate with multiple test files verifying different aspects of resource dictionary handling.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Search for the resource_dict validation in Executor implementation
ast-grep --pattern 'class Executor {
  $$$
  def __new__($$$) {
    $$$
    resource_dict
    $$$
  }
}'
Length of output: 106

Script:
#!/bin/bash
# Let's try a broader search for Executor class and resource_dict usage
rg -A 5 "class Executor" 
Length of output: 1898

Script:
#!/bin/bash
# Search for resource_dict usage in Python files
rg "resource_dict" --type py -A 3
Length of output: 30868

Script:
#!/bin/bash
# Search for test files related to Executor
fd "test.*executor.*\.py$"
Length of output: 443
tests/test_executor_backend_flux.py (4)
63-63: LGTM: Correct migration to resource_dict format

The threads_per_core parameter has been properly migrated to the new resource_dict format while maintaining the same functionality.

91-91: Same resource_dict pattern as previous test

106-106: Same resource_dict pattern as previous tests

78-78: Verify the resource key naming convention

The parameter has changed from cores_per_worker to cores in the resource_dict. Please verify if this naming change is intentional and consistent with the new API design.

✅ Verification successful

API change from cores_per_worker to cores is intentional and consistent

The codebase shows a clear and consistent transition from cores_per_worker to cores in the resource dictionary. This is evidenced by:

The interactive executor explicitly maps cores_per_worker to cores in executorlib/interactive/executor.py: cores_per_worker = resource_dict["cores"]

All recent test files use resource_dict={"cores": N} format

The cache executor provides backward compatibility by mapping cores_per_worker to cores in executorlib/cache/shared.py

Documentation and examples consistently use the new cores key in resource dictionaries
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Search for other occurrences of cores_per_worker and cores in resource specifications
echo "Searching for cores_per_worker usage:"
rg "cores_per_worker" -A 2 -B 2

echo -e "\nSearching for resource_dict with cores key:"
rg 'resource_dict.*cores' -A 2 -B 2
Length of output: 20809
tests/test_executor_backend_mpi.py (2)
95-95: LGTM! Resource configuration and cache testing look good.

The migration to resource_dict maintains the same test coverage while properly validating both the MPI functionality and caching behavior.

74-82: Enhance error case testing for resource validation.

While the test verifies that TypeError is raised, it could be more comprehensive:

The test doesn't verify the error messages

It's not clear why these resource configurations are invalid

The validation rules for the resource dictionary aren't documented

Let's check if there's documentation or validation logic for the resource dictionary:
#!/bin/bash
# Search for resource dictionary validation logic
rg -A 5 "resource_dict.*validate|validate.*resource_dict"

# Search for resource dictionary documentation
fd "README|CONTRIBUTING" | xargs rg -l "resource_dict"
executorlib/__init__.py (1)

189-189: LGTM: Consistent resource_dict usage.

The resource_dict is correctly passed to both executor creation paths.

Also applies to: 207-207

executorlib/interactive/executor.py (1)

154-154: Introduction of resource_dict enhances resource management

The addition of the resource_dict parameter consolidates resource-related arguments, simplifying the function signature and improving flexibility.

coderabbitai · 2024-10-28T07:53:22Z

tests/test_executor_backend_mpi.py

@@ -58,7 +58,7 @@ def test_meta_executor_single(self):
    def test_meta_executor_parallel(self):
        with Executor(
            max_workers=2,
-            cores_per_worker=2,
+            resource_dict={"cores": 2},


💡 Codebase verification

Migration to resource_dict is incomplete - several files still use cores_per_worker

The search results reveal that several test files and implementation files still use the old cores_per_worker parameter instead of the new resource_dict approach:

tests/test_cache_executor_serial.py

tests/test_cache_executor_pysqa_flux.py

tests/test_cache_executor_mpi.py

The migration needs to be completed consistently across all test files to match the new resource management approach.

🔗 Analysis chain

LGTM! Resource dictionary implementation looks correct.

The migration from cores_per_worker=2 to resource_dict={"cores": 2} maintains the same test coverage while aligning with the new resource management approach.

Let's verify that all test files have been updated to use the new resource_dict:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash # Search for any remaining usage of cores_per_worker to ensure complete migration rg "cores_per_worker" "tests/"

Length of output: 124

Script:

#!/bin/bash # Let's search for both old and new patterns to understand the migration status echo "=== Searching for cores_per_worker usage ===" rg "cores_per_worker" -A 2 echo -e "\n=== Searching for resource_dict usage in tests ===" rg "resource_dict" "tests/" -A 2

Length of output: 13827

coderabbitai · 2024-10-28T07:53:22Z

README.md

@@ -43,7 +43,7 @@ def calc(i):
    return i, size, rank

 with flux.job.FluxExecutor() as flux_exe:
-    with Executor(max_cores=2, cores_per_worker=2, executor=flux_exe) as exe:
+    with Executor(max_cores=2, executor=flux_exe, resource_dict={"cores": 2}) as exe:


⚠️ Potential issue

Update the explanatory text to match the new API.

The example code has been updated to use resource_dict={"cores": 2}, but the explanatory text below still references the old cores_per_worker=2 parameter. This inconsistency could confuse users.

Consider updating the paragraph that starts with "The interface of the standard..." to reflect the new API:

-The interface of the standard [concurrent.futures.Executor](https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures) -is extended by adding the option `cores_per_worker=2` to assign multiple MPI ranks to each function call. +The interface of the standard [concurrent.futures.Executor](https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures) +is extended by adding the option `resource_dict={"cores": 2}` to assign multiple MPI ranks to each function call.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

with Executor(max_cores=2, executor=flux_exe, resource_dict={"cores": 2}) as exe:

The interface of the standard [concurrent.futures.Executor](https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures)

is extended by adding the option `resource_dict={"cores": 2}` to assign multiple MPI ranks to each function call.

coderabbitai · 2024-10-28T07:53:22Z

executorlib/__init__.py

+        if resource_dict is None:
+            resource_dict = {}
+        resource_dict.update(
+            {k: v for k, v in default_resource_dict.items() if k not in resource_dict}
+        )


⚠️ Potential issue

Add validation for resource_dict values.

The code accepts arbitrary values without validation, which could lead to resource exhaustion or other security issues. Consider adding validation for:

Negative values for cores, threads, and GPUs

Unreasonably large values that could impact system stability

Type checking for numeric parameters

Example implementation:

def validate_resource_dict(resource_dict): for key in ['cores_per_worker', 'threads_per_core', 'gpus_per_worker']: if key in resource_dict: value = resource_dict[key] if not isinstance(value, int): raise TypeError(f"{key} must be an integer") if value < 0: raise ValueError(f"{key} cannot be negative") if value > MAX_ALLOWED_VALUE: # Define appropriate limits raise ValueError(f"{key} exceeds maximum allowed value")

Would you like me to provide a complete implementation with appropriate resource limits?

coderabbitai · 2024-10-28T07:53:22Z

executorlib/__init__.py

+        default_resource_dict = {
+            "cores": 1,
+            "threads_per_core": 1,
+            "gpus_per_core": 0,
+            "cwd": None,
+            "openmpi_oversubscribe": False,
+            "slurm_cmd_args": [],
+        }
+        if resource_dict is None:
+            resource_dict = {}
+        resource_dict.update(
+            {k: v for k, v in default_resource_dict.items() if k not in resource_dict}
+        )


⚠️ Potential issue

Fix inconsistent parameter naming in default_resource_dict.

The keys in default_resource_dict don't match the documented parameter names:

"cores" vs documented "cores_per_worker"

"gpus_per_core" vs documented "gpus_per_worker"

Apply this fix:

default_resource_dict = { - "cores": 1, - "gpus_per_core": 0, + "cores_per_worker": 1, + "gpus_per_worker": 0, "threads_per_core": 1, "cwd": None, "openmpi_oversubscribe": False, "slurm_cmd_args": [], }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

default_resource_dict = {

"cores": 1,

"threads_per_core": 1,

"gpus_per_core": 0,

"cwd": None,

"openmpi_oversubscribe": False,

"slurm_cmd_args": [],

}

if resource_dict is None:

resource_dict = {}

resource_dict.update(

{k: v for k, v in default_resource_dict.items() if k not in resource_dict}

)

default_resource_dict = {

"cores_per_worker": 1,

"threads_per_core": 1,

"gpus_per_worker": 0,

"cwd": None,

"openmpi_oversubscribe": False,

"slurm_cmd_args": [],

}

if resource_dict is None:

resource_dict = {}

resource_dict.update(

{k: v for k, v in default_resource_dict.items() if k not in resource_dict}

)

coderabbitai · 2024-10-28T07:53:23Z

executorlib/interactive/executor.py

        check_executor(executor=flux_executor)
        check_nested_flux_executor(nested_flux_executor=flux_executor_nesting)
-        executor_kwargs["openmpi_oversubscribe"] = openmpi_oversubscribe
+        check_threads_per_core(threads_per_core=resource_dict["threads_per_core"])
+        check_gpus_per_worker(gpus_per_worker=resource_dict["gpus_per_core"])


⚠️ Potential issue

Incorrect key used for gpus_per_worker in local backend

In the call to check_gpus_per_worker, gpus_per_worker=resource_dict["gpus_per_core"] is passed. However, the key should be gpus_per_worker to align with the expected parameter.

Apply this diff to correct the key:

-check_gpus_per_worker(gpus_per_worker=resource_dict["gpus_per_core"]) +check_gpus_per_worker(gpus_per_worker=resource_dict["gpus_per_worker"])

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

check_gpus_per_worker(gpus_per_worker=resource_dict["gpus_per_core"])

check_gpus_per_worker(gpus_per_worker=resource_dict["gpus_per_worker"])

jan-janssen and others added 2 commits October 28, 2024 07:43

Use resource dict for internal communication

ee8a025

[pre-commit.ci] auto fixes from pre-commit.com hooks

10655ef

for more information, see https://pre-commit.ci

coderabbitai bot reviewed Oct 28, 2024

View reviewed changes

jan-janssen added 4 commits October 28, 2024 08:03

fix bug

d08ebf0

Merge remote-tracking branch 'origin/internal_resource_dict' into int…

9988992

…ernal_resource_dict

more fixes

17a5e70

Update user interface to use resource_dict

5f48ec7

[pre-commit.ci] auto fixes from pre-commit.com hooks

5b23f7e

for more information, see https://pre-commit.ci

coderabbitai bot reviewed Oct 28, 2024

View reviewed changes

jan-janssen merged commit ceda4c0 into main Oct 28, 2024
24 checks passed

jan-janssen deleted the internal_resource_dict branch October 28, 2024 08:00

This was referenced Oct 28, 2024

Implement resource_dict for file executor #456

Merged

Executor user interface #458

Closed

Cache: create method #459

Merged

jan-janssen mentioned this pull request Oct 28, 2024

[documentation] Unify interface - provide resource_dict for Executor initialisation #453

Closed

This was referenced Nov 6, 2024

Set max_cores and max_workers to None #478

Merged

Update validate_number_of_cores() #488

Merged

Raise ValueError when requesting more cores than available #489

Merged

This was referenced Dec 20, 2024

Validate maximum number of workers for flux #527

Merged

Move SLURM to separate module #528

Merged

Update Documentation #531

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use resource dict for internal communication #455

Use resource dict for internal communication #455

jan-janssen commented Oct 28, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 28, 2024 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coderabbitai bot Oct 28, 2024

coderabbitai bot Oct 28, 2024

coderabbitai bot Oct 28, 2024

review-notebook-app bot commented Oct 28, 2024

coderabbitai bot left a comment

coderabbitai bot Oct 28, 2024

coderabbitai bot Oct 28, 2024

coderabbitai bot Oct 28, 2024

coderabbitai bot Oct 28, 2024

coderabbitai bot Oct 28, 2024

	with Executor(max_cores=2, executor=flux_exe, resource_dict={"cores": 2}) as exe:
	The interface of the standard [concurrent.futures.Executor](https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures)
	is extended by adding the option `resource_dict={"cores": 2}` to assign multiple MPI ranks to each function call.

	check_gpus_per_worker(gpus_per_worker=resource_dict["gpus_per_core"])
	check_gpus_per_worker(gpus_per_worker=resource_dict["gpus_per_worker"])

Use resource dict for internal communication #455

Use resource dict for internal communication #455

Conversation

jan-janssen commented Oct 28, 2024 • edited by coderabbitai bot Loading

Summary by CodeRabbit

coderabbitai bot commented Oct 28, 2024 • edited Loading

Walkthrough

Changes

Possibly related PRs

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Oct 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Oct 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Oct 28, 2024

Choose a reason for hiding this comment

review-notebook-app bot commented Oct 28, 2024

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Oct 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Oct 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Oct 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Oct 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Oct 28, 2024

Choose a reason for hiding this comment

jan-janssen commented Oct 28, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 28, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)