Raise ValueError when requesting more cores than available #489
Conversation
Walkthrough

The pull request introduces modifications to several classes within the executor library, primarily focusing on the addition of a `max_cores` parameter.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (11)
tests/test_executor_backend_mpi.py (2)
`55-60`: LGTM! Test effectively validates core allocation limits.

The test correctly verifies that the executor raises a ValueError when requesting more cores than available. This aligns well with the PR objective.

Consider these minor improvements:

```diff
-    def test_oversubscribe(self):
-        with self.assertRaises(ValueError):
-            with Executor(max_cores=1, backend="local", block_allocation=True) as exe:
-                cloudpickle_register(ind=1)
-                fs_1 = exe.submit(calc, 1, resource_dict={"cores": 2})
+    def test_oversubscribe(self):
+        with self.assertRaises(ValueError), \
+             Executor(max_cores=1, backend="local", block_allocation=True) as exe:
+            cloudpickle_register(ind=1)
+            exe.submit(calc, 1, resource_dict={"cores": 2})
```

Changes:

- Combined nested `with` statements for better readability
- Removed unused `fs_1` variable since we're only testing for the exception
`55-60`: Consider adding more test cases for comprehensive coverage.

While the current test covers the basic oversubscription case, consider adding tests for:

- Requesting exactly max_cores (should succeed)
- Requesting 0 or negative cores (should raise ValueError)
- Edge cases with None or non-integer core values

Here's a suggested implementation:
```python
def test_core_allocation_validation(self):
    # Test exact allocation (should succeed)
    with Executor(max_cores=2, backend="local", block_allocation=True) as exe:
        future = exe.submit(calc, 1, resource_dict={"cores": 2})
        self.assertEqual(future.result(), 1)

    # Test invalid core requests
    invalid_cores = [0, -1, None, "2"]
    for cores in invalid_cores:
        with self.assertRaises(ValueError):
            with Executor(max_cores=2, backend="local", block_allocation=True) as exe:
                exe.submit(calc, 1, resource_dict={"cores": cores})
```
executorlib/cache/executor.py (2)
Line range hint `52-60`: Add validation for cores in resource_dict

The resource dictionary accepts cores without validation. This could lead to issues with negative, zero, or non-integer values.

Consider adding validation:

```diff
 if resource_dict is None:
     resource_dict = {}
+if "cores" in resource_dict:
+    cores = resource_dict["cores"]
+    if not isinstance(cores, int) or cores <= 0:
+        raise ValueError("cores must be a positive integer")
 resource_dict.update(
     {k: v for k, v in default_resource_dict.items() if k not in resource_dict}
 )
```
Line range hint `89-126`: Critical: max_cores parameter is not propagated to FileExecutor

The `create_file_executor` function accepts and validates `max_cores`, but this value isn't passed to the `FileExecutor` instance. This breaks the core validation chain.

Update the FileExecutor instantiation to include max_cores:

```diff
 return FileExecutor(
     cache_directory=cache_directory,
-    resource_dict=resource_dict,
+    resource_dict={**(resource_dict or {}), "cores": max_cores},
     pysqa_config_directory=pysqa_config_directory,
     backend=backend.split("pysqa_")[-1],
     disable_dependencies=disable_dependencies,
 )
```

executorlib/base/executor.py (3)
`21-21`: Enhance the max_cores parameter documentation.

The current documentation is too brief. Consider expanding it to include:

- Parameter type (Optional[int])
- Default value (None)
- Validation behavior
- Error conditions

```diff
-    max_cores (int): defines the number cores which can be used in parallel
+    max_cores (Optional[int]): Maximum number of cores that can be used in parallel.
+        If set, raises ValueError when a task requests more cores than this limit.
+        If None (default), no limit is enforced.
```
`24-24`: Add validation for negative max_cores values.

While the initialization looks good, consider adding validation to prevent negative values for max_cores.

```diff
 def __init__(self, max_cores: Optional[int] = None):
     """
     Initialize the ExecutorBase class.
     """
     cloudpickle_register(ind=3)
+    if max_cores is not None and max_cores <= 0:
+        raise ValueError("max_cores must be a positive integer")
     self._max_cores = max_cores
```

Also applies to: 29-29
`90-98`: Enhance cores validation and error handling.

While the validation logic is correct, consider these improvements:

- Add type checking for cores value
- Provide more informative error message
- Consider moving validation to check_resource_dict for consistency

```diff
-        cores = resource_dict.get("cores", None)
-        if (
-            cores is not None
-            and self._max_cores is not None
-            and cores > self._max_cores
-        ):
-            raise ValueError(
-                "The specified number of cores is larger than the available number of cores."
-            )
+        cores = resource_dict.get("cores")
+        if cores is not None:
+            if not isinstance(cores, int):
+                raise TypeError("cores must be an integer")
+            if self._max_cores is not None and cores > self._max_cores:
+                raise ValueError(
+                    f"Requested {cores} cores exceeds maximum allowed cores ({self._max_cores})"
+                )
```
executorlib/interactive/executor.py (2)
Line range hint `29-57`: Update docstring to document max_cores parameter.

The class docstring should be updated to include the new `max_cores` parameter in the Args section.

Add this to the Args section:

```diff
     refresh_rate (float, optional): The refresh rate for updating the executor queue. Defaults to 0.01.
     plot_dependency_graph (bool, optional): Whether to generate and plot the dependency graph. Defaults to False.
+    max_cores (Optional[int], optional): Maximum number of cores that can be used. Defaults to None.
     *args: Variable length argument list.
```
`63-63`: Optimize kwargs.get() call.

The explicit None default is redundant as it's the default return value for dict.get().

```diff
-        super().__init__(max_cores=kwargs.get("max_cores", None))
+        super().__init__(max_cores=kwargs.get("max_cores"))
```
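As a quick standalone illustration of why the explicit default is redundant (plain Python, independent of executorlib):

```python
# dict.get returns None by default when the key is missing,
# so passing None explicitly changes nothing.
kwargs = {"backend": "local"}

assert kwargs.get("max_cores") is None
assert kwargs.get("max_cores", None) is None
assert kwargs.get("max_cores") == kwargs.get("max_cores", None)
```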
🪛 Ruff
63-63: Use
kwargs.get("max_cores")
instead ofkwargs.get("max_cores", None)
Replace
kwargs.get("max_cores", None)
withkwargs.get("max_cores")
(SIM910)
executorlib/interactive/shared.py (2)
`134-134`: Simplify dictionary access and ensure consistent max_cores handling.

The `.get()` call can be simplified, and for consistency with `InteractiveStepExecutor`, consider adding `max_cores` to `executor_kwargs`.

```diff
-        super().__init__(max_cores=executor_kwargs.get("max_cores", None))
+        super().__init__(max_cores=executor_kwargs.get("max_cores"))
+        executor_kwargs["max_cores"] = executor_kwargs.get("max_cores")
         executor_kwargs["future_queue"] = self._future_queue
```
🪛 Ruff
134-134: Use
executor_kwargs.get("max_cores")
instead ofexecutor_kwargs.get("max_cores", None)
Replace
executor_kwargs.get("max_cores", None)
withexecutor_kwargs.get("max_cores")
(SIM910)
`186-186`: Simplify dictionary access.

The `.get()` call includes a redundant default value.

```diff
-        super().__init__(max_cores=executor_kwargs.get("max_cores", None))
+        super().__init__(max_cores=executor_kwargs.get("max_cores"))
```
🪛 Ruff
186-186: Use
executor_kwargs.get("max_cores")
instead ofexecutor_kwargs.get("max_cores", None)
Replace
executor_kwargs.get("max_cores", None)
withexecutor_kwargs.get("max_cores")
(SIM910)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (5)
- executorlib/base/executor.py (2 hunks)
- executorlib/cache/executor.py (1 hunks)
- executorlib/interactive/executor.py (1 hunks)
- executorlib/interactive/shared.py (2 hunks)
- tests/test_executor_backend_mpi.py (1 hunks)
🧰 Additional context used
🪛 Ruff
executorlib/base/executor.py
90-90: Use resource_dict.get("cores")
instead of resource_dict.get("cores", None)
Replace resource_dict.get("cores", None)
with resource_dict.get("cores")
(SIM910)
executorlib/interactive/executor.py
63-63: Use kwargs.get("max_cores")
instead of kwargs.get("max_cores", None)
Replace kwargs.get("max_cores", None)
with kwargs.get("max_cores")
(SIM910)
executorlib/interactive/shared.py
134-134: Use executor_kwargs.get("max_cores")
instead of executor_kwargs.get("max_cores", None)
Replace executor_kwargs.get("max_cores", None)
with executor_kwargs.get("max_cores")
(SIM910)
186-186: Use executor_kwargs.get("max_cores")
instead of executor_kwargs.get("max_cores", None)
Replace executor_kwargs.get("max_cores", None)
with executor_kwargs.get("max_cores")
(SIM910)
tests/test_executor_backend_mpi.py
56-57: Use a single with
statement with multiple contexts instead of nested with
statements
(SIM117)
59-59: Local variable fs_1
is assigned to but never used
Remove assignment to unused variable fs_1
(F841)
🔇 Additional comments (6)
executorlib/cache/executor.py (1)
Line range hint 1-126
: Verify FileExecutor usage across the codebase
Let's verify how FileExecutor is used across the codebase to ensure these changes won't cause unexpected issues.
✅ Verification successful
No issues found with FileExecutor changes
The verification shows that the changes to FileExecutor are safe because:
- All direct FileExecutor instantiations in tests properly handle the `resource_dict` with cores parameter
- All indirect usages via `create_file_executor` pass the `resource_dict` correctly
- The default behavior (cores=1) is preserved in the FileExecutor implementation
- Resource dictionary usage across the codebase consistently follows the same pattern of specifying cores
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```shell
#!/bin/bash
# Search for FileExecutor instantiations and their resource_dict usage
echo "Direct FileExecutor instantiations:"
rg -A 5 "FileExecutor\(" --type python
echo -e "\nIndirect instantiations via create_file_executor:"
rg -A 5 "create_file_executor\(" --type python
echo -e "\nResource dictionary definitions:"
rg "resource_dict.*=.*{.*cores.*:.*" --type python
```
Length of output: 614
Script:
```shell
#!/bin/bash
# Search for FileExecutor instantiations and their resource_dict usage with fixed regex
echo "Direct FileExecutor instantiations:"
rg "FileExecutor\(" -A 5
echo -e "\nIndirect instantiations via create_file_executor:"
rg "create_file_executor\(" -A 5
echo -e "\nResource dictionary definitions:"
rg "resource_dict\s*=\s*\{[^}]*cores"
```
Length of output: 6280
executorlib/base/executor.py (1)
`99-99`: Verify cores validation in check_resource_dict.

The cores validation happens before `check_resource_dict`. We should verify if `check_resource_dict` also performs cores validation to avoid redundancy or inconsistency.
✅ Verification successful
No redundant cores validation in check_resource_dict

Based on the search results, `check_resource_dict` does not perform any cores validation. The cores-related validations are handled separately in the `check_max_workers_and_cores` and `validate_number_of_cores` functions. Therefore, there is no redundancy or inconsistency in the validation flow.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```shell
#!/bin/bash
# Description: Check if check_resource_dict performs any cores validation
# Search for cores validation in check_resource_dict implementation
rg -A 5 "def check_resource_dict"
rg "cores" --context 3 "inputcheck.py"
```
Length of output: 1049
Script:
```shell
#!/bin/bash
# Search for cores validation in check_resource_dict with full path
rg "cores" --context 3 "executorlib/standalone/inputcheck.py"
# Get the full implementation of check_resource_dict
ast-grep --pattern 'def check_resource_dict($$$) {
$$$
}'
```
Length of output: 967
executorlib/interactive/executor.py (2)
`63-63`: LGTM! Core validation properly propagated to base class.

The change correctly forwards the max_cores parameter to ExecutorBase for core validation.
`63-63`: Verify core validation integration.

Let's verify that the core validation is properly integrated through the inheritance chain.
✅ Verification successful
Based on the shell script results, I can generate the final response:
Core validation integration is properly implemented
The inheritance chain and core validation are correctly integrated:

- Base class `ExecutorBase` properly defines and initializes the `max_cores` parameter
- All executor implementations consistently pass `max_cores` through `super().__init__()`
- Core validation is centralized in shared utilities and properly used across implementations
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```shell
#!/bin/bash
# Description: Verify core validation implementation in base class and its usage
# Test 1: Check base class implementation
echo "Checking ExecutorBase implementation..."
rg -A 5 "class ExecutorBase" executorlib/base/executor.py
# Test 2: Check for other executors that might need similar changes
echo "Checking other executor implementations..."
rg -l "class.*Executor.*\("
# Test 3: Check for max_cores parameter usage
echo "Checking max_cores parameter usage..."
rg "max_cores.*=.*"
```
Length of output: 6788
executorlib/interactive/shared.py (2)
Line range hint `379-389`: LGTM! Robust resource management implementation.

The `_wait_for_free_slots` function correctly implements the core requirement validation, ensuring tasks don't exceed the maximum available cores.
🧰 Tools
🪛 Ruff

131-131: Do not use mutable data structures for argument defaults. Replace with `None`; initialize within function (B006)

134-134: Use `executor_kwargs.get("max_cores")` instead of `executor_kwargs.get("max_cores", None)` (SIM910)
`134-134`: Verify error handling for core oversubscription.

Let's verify that the parent class properly raises ValueError when core limits are exceeded.

Also applies to: 186-186
✅ Verification successful
Core oversubscription error handling is properly implemented

The parent class `ExecutorBase` correctly implements the core validation by raising a `ValueError` when the requested cores exceed `max_cores`. The error message "The specified number of cores is larger than the available number of cores." is clear and descriptive.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```shell
#!/bin/bash
# Description: Verify the implementation of ValueError in parent class
# Expected: Find the error handling implementation in ExecutorBase
# Search for the parent class implementation
echo "Searching for ExecutorBase implementation..."
rg -A 10 "class ExecutorBase"
# Search for core validation logic
echo "Searching for core validation logic..."
rg -A 5 "raise ValueError.*cores"
```
Length of output: 861
Script:
```shell
#!/bin/bash
# Let's check the complete implementation of ExecutorBase.__init__
ast-grep --pattern 'def __init__(self, max_cores: Optional[int] = None) {
$$$
}'
# Also search for any validation logic in the class
rg -B 2 -A 5 "max_cores.*=" executorlib/base/executor.py
```
Length of output: 511
Script:
```shell
#!/bin/bash
# Let's check where max_cores is used in the codebase
rg -B 2 -A 5 "self._max_cores" executorlib/
# Also check for any property or method that might use max_cores for validation
ast-grep --pattern 'def $_(self) {
if self._max_cores $$$:
$$$
}'
```
Length of output: 1177
```diff
@@ -48,7 +48,7 @@ def __init__(
             backend (str, optional): name of the backend used to spawn tasks.
             disable_dependencies (boolean): Disable resolving future objects during the submission.
         """
-        super().__init__()
+        super().__init__(max_cores=None)
```
Critical: Passing None for max_cores bypasses core validation
Setting `max_cores=None` in the superclass constructor will disable the core validation mechanism, contradicting the PR's objective of raising ValueError when requesting more cores than available.

Consider this fix:

```diff
-        super().__init__(max_cores=None)
+        super().__init__(max_cores=resource_dict.get("cores"))
```
Committable suggestion skipped: line range outside the PR's diff.
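The bypass this comment warns about can be reproduced with a minimal stand-in for the base class (hypothetical names, not executorlib's code):

```python
from typing import Optional


class MiniExecutor:
    """Stand-in mirroring the max_cores guard in an executor base class."""

    def __init__(self, max_cores: Optional[int] = None):
        self._max_cores = max_cores

    def submit(self, cores: int) -> str:
        if self._max_cores is not None and cores > self._max_cores:
            raise ValueError(
                "The specified number of cores is larger than the available number of cores."
            )
        return "accepted"


# With max_cores=None the guard is inert: any request is accepted.
assert MiniExecutor(max_cores=None).submit(cores=999) == "accepted"

# With a concrete limit, oversubscription is rejected.
try:
    MiniExecutor(max_cores=1).submit(cores=2)
    raised = False
except ValueError:
    raised = True
assert raised
```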
Summary by CodeRabbit

Release Notes

New Features

- Introduced a `max_cores` parameter for enhanced resource management across various executor classes.

Bug Fixes

- Raise a `ValueError` when task submissions exceed the specified core limits.

Tests