-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add type checking with mypy #535
Conversation
for more information, see https://pre-commit.ci
WalkthroughThis pull request introduces a new GitHub Actions workflow for MyPy type checking and implements extensive type annotation improvements across the Changes
Poem
Possibly related PRs
Tip CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Nitpick comments (24)
executorlib/standalone/thread.py (1)
24-26
: Consider adding type hints toget_kwargs
.While the new accessor method is helpful for encapsulation, adding a return type annotation (e.g.,
-> dict
or-> Dict[str, Any]
) would improve clarity for both readers and static type checkers.executorlib/backend/cache_parallel.py (2)
4-4
: Remove unused import.The import
Any
is not used in this file. Removing it will prevent linting or static analysis warnings such asF401
.-import pickle -import sys -import time -from typing import Any +import pickle +import sys +import time🧰 Tools
🪛 Ruff (0.8.2)
4-4:
typing.Any
imported but unusedRemove unused import:
typing.Any
(F401)
38-41
: Optional: inline the if-else statement.You could simplify this block by replacing the if-else statement with a ternary expression:
apply_dict = backend_load_file(file_name=file_name) if mpi_rank_zero else {}However, this is optional. If clarity is more important than brevity, you can keep the current structure.
🧰 Tools
🪛 Ruff (0.8.2)
38-41: Use ternary operator
apply_dict = backend_load_file(file_name=file_name) if mpi_rank_zero else {}
instead ofif
-else
-blockReplace
if
-else
-block withapply_dict = backend_load_file(file_name=file_name) if mpi_rank_zero else {}
(SIM108)
executorlib/standalone/interactive/backend.py (2)
2-2
: Use imports selectively.Ensure every imported name is used. If there are other members from
typing
that remain unused, consider removing them to reduce clutter.
17-17
: Return type clarity.Returning
Any
is appropriate if the function’s output can vary widely. If the return type can be narrowed, consider specifying a more specific type in the future.executorlib/standalone/serialize.py (1)
4-4
: Confirm necessity ofTuple
.Review whether you actually use
Tuple
fromtyping
. If not, removing it will help maintain a clean import list.executorlib/standalone/hdf.py (1)
27-33
: Suggestion to simplify membership check.
You can streamline the condition by removing.keys()
:- if data_key in group_dict.keys(): + if data_key in group_dict:🧰 Tools
🪛 Ruff (0.8.2)
29-29: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
executorlib/standalone/plot.py (1)
21-23
: Use more specialized type hints for clarity.Declaring the exact list and dict types (e.g.,
list[dict]
ordict[str, Any]
) can improve clarity and help catch errors during static analysis.executorlib/standalone/interactive/spawner.py (3)
9-14
: Improve constructor docstring to reflect Optional cwd.The docstring says “cwd (str): …” but the code now specifies
Optional[str]
. Recommend updating the docstring to clarify thatNone
is the default and permissible value.
80-80
: Explicitly type subprocess.Popen as needed.Declaring
Optional[subprocess.Popen[bytes]]
orOptional[subprocess.Popen[str]]
can further improve clarity on which stream encoding is used (if known).
118-122
: Order of calls can be optimized for graceful shutdown.Calling
terminate()
beforecommunicate()
is usually more common for ensuring the process is signaled before collecting its output. Consider reversing these calls if the process is still expected to be running.executorlib/cache/executor.py (2)
24-24
: Document the rationale for type ignore.Including a brief comment explaining why type checking should be ignored here helps maintainers understand the trade-offs if
pysqa
is unavailable.
32-33
: Leverage type aliases for commonly used callables.If the same callable signature is used across multiple classes, consider defining a type alias (e.g.,
TaskFunction = Callable[..., Any]
) to keep code more organized and consistent.executorlib/base/executor.py (1)
42-48
: Simplify membership checks.
According to the static analysis hint (SIM118), consider removing the.keys()
call for membership checks:- if "future_queue" in meta_data_dict.keys(): + if "future_queue" in meta_data_dict:🧰 Tools
🪛 Ruff (0.8.2)
43-43: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
executorlib/standalone/inputcheck.py (1)
121-123
: Empty error message.
Providing an explicit message when raisingValueError
can improve debugging. Consider adding a short, descriptive message:- raise ValueError("") + raise ValueError("Allocation is blocked, but `init_function` is set. This combination is not supported.")executorlib/cache/shared.py (2)
24-24
:Any
return type might be too broad.
Consider narrowing the return type if feasible to better reflect the expected data.
77-79
: Consider using more specific type hints.
Defining dictionaries asdict
is somewhat vague. For example:memory_dict: Dict[str, Future] = {} process_dict: Dict[str, Any] = {} file_name_dict: Dict[str, str] = {}This aids in maintainability and readability.
executorlib/interactive/executor.py (3)
84-85
: Consider more specific type hints instead ofdict
.
If the keys/values are known, declare them asDict[K, V]
for clarity.
231-234
: Use membership tests without.keys()
.- if "openmpi_oversubscribe" in resource_dict.keys(): + if "openmpi_oversubscribe" in resource_dict: del resource_dict["openmpi_oversubscribe"]Similarly for
slurm_cmd_args
.🧰 Tools
🪛 Ruff (0.8.2)
231-231: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
233-233: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
301-306
: Use membership tests without.keys()
.
Apply this pattern elsewhere for consistency.🧰 Tools
🪛 Ruff (0.8.2)
301-301: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
303-303: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
305-305: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
executorlib/__init__.py (1)
180-180
: Usingdict
type is fine locally, but consider more specific hints.
If known, specify key-value types for clarity, e.g.,Dict[str, Union[int, bool]]
.executorlib/interactive/shared.py (1)
289-291
: Consider using more specific container types.
For instance:active_task_dict: Dict[Future, int] = {} process_lst: List[RaisingThread] = [] qtask_lst: List[Queue] = [].github/workflows/mypy.yml (2)
22-22
: Enhance MyPy configuration and fix missing newlineConsider the following improvements:
- Add a
mypy.ini
configuration file to customize type checking settings- Add error codes to the output for better debugging
- Add missing newline at EOF
- run: mypy --ignore-missing-imports ${{ github.event.repository.name }} + run: mypy --ignore-missing-imports --show-error-codes ${{ github.event.repository.name }} +Would you like me to help create a
mypy.ini
configuration file with recommended settings for your project?🧰 Tools
🪛 yamllint (1.35.1)
[error] 22-22: no new line character at the end of file
(new-line-at-end-of-file)
1-22
: Consider adding dependency cachingTo improve workflow execution time, consider caching pip dependencies between runs.
Add this step after Python setup:
- name: Cache pip dependencies uses: actions/cache@v3 with: path: ~/.cache/pip key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements*.txt') }} restore-keys: | ${{ runner.os }}-pip-🧰 Tools
🪛 yamllint (1.35.1)
[error] 22-22: no new line character at the end of file
(new-line-at-end-of-file)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (22)
.github/workflows/mypy.yml
(1 hunks)executorlib/__init__.py
(5 hunks)executorlib/backend/cache_parallel.py
(3 hunks)executorlib/backend/interactive_parallel.py
(4 hunks)executorlib/base/executor.py
(7 hunks)executorlib/cache/executor.py
(5 hunks)executorlib/cache/queue_spawner.py
(1 hunks)executorlib/cache/shared.py
(4 hunks)executorlib/interactive/executor.py
(8 hunks)executorlib/interactive/flux.py
(1 hunks)executorlib/interactive/shared.py
(18 hunks)executorlib/interactive/slurm.py
(1 hunks)executorlib/standalone/__init__.py
(1 hunks)executorlib/standalone/hdf.py
(4 hunks)executorlib/standalone/inputcheck.py
(3 hunks)executorlib/standalone/interactive/backend.py
(1 hunks)executorlib/standalone/interactive/communication.py
(1 hunks)executorlib/standalone/interactive/spawner.py
(3 hunks)executorlib/standalone/plot.py
(1 hunks)executorlib/standalone/serialize.py
(2 hunks)executorlib/standalone/thread.py
(1 hunks)tests/test_integration_pyiron_workflow.py
(3 hunks)
✅ Files skipped from review due to trivial changes (1)
- executorlib/standalone/init.py
🧰 Additional context used
🪛 Ruff (0.8.2)
executorlib/backend/cache_parallel.py
4-4: typing.Any
imported but unused
Remove unused import: typing.Any
(F401)
38-41: Use ternary operator apply_dict = backend_load_file(file_name=file_name) if mpi_rank_zero else {}
instead of if
-else
-block
Replace if
-else
-block with apply_dict = backend_load_file(file_name=file_name) if mpi_rank_zero else {}
(SIM108)
executorlib/standalone/serialize.py
32-32: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
32-32: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
32-32: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
executorlib/interactive/executor.py
169-169: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
231-231: Use key in dict
instead of key in dict.keys()
Remove .keys()
(SIM118)
233-233: Use key in dict
instead of key in dict.keys()
Remove .keys()
(SIM118)
301-301: Use key in dict
instead of key in dict.keys()
Remove .keys()
(SIM118)
303-303: Use key in dict
instead of key in dict.keys()
Remove .keys()
(SIM118)
305-305: Use key in dict
instead of key in dict.keys()
Remove .keys()
(SIM118)
executorlib/standalone/hdf.py
29-29: Use key in dict
instead of key in dict.keys()
Remove .keys()
(SIM118)
executorlib/base/executor.py
43-43: Use key in dict
instead of key in dict.keys()
Remove .keys()
(SIM118)
49-49: Use key in dict
instead of key in dict.keys()
Remove .keys()
(SIM118)
65-65: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
executorlib/interactive/shared.py
26-26: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
🪛 yamllint (1.35.1)
.github/workflows/mypy.yml
[error] 22-22: no new line character at the end of file
(new-line-at-end-of-file)
🔇 Additional comments (73)
executorlib/standalone/interactive/backend.py (2)
6-7
: Signature looks good with explicit Callable and Any.
Switching from callable
to Callable
improves type clarity. Returning Any
is acceptable if the function truly might return any type.
13-13
: Docstring alignment with new type hints.
The docstring accurately reflects the updated parameter type. This consistency is good practice.
executorlib/standalone/serialize.py (1)
38-38
: Docstring consistency.
The docstring now reflects that fn
is a Callable
, which matches the updated signature.
executorlib/backend/interactive_parallel.py (5)
4-4
: Good addition of type hints.
Importing Optional
here clearly communicates that certain values may be None
.
7-7
: Dependency import is appropriate.
Introducing zmq
aligns with usage of zmq.Context
and zmq.Socket
types.
29-29
: No issues with custom MPI pickle initialization.
The # type: ignore
comment is acceptable to override MyPy errors here. Ensure that the relevant code is tested for MPI rank-based operations.
38-39
: Proper addition of optional context and socket.
Declaring context: Optional[zmq.Context] = None
and socket: Optional[zmq.Socket] = None
is a clear improvement in type safety.
54-54
: Initialization of the input_dict
.
Using input_dict: dict = {}
ensures a valid dictionary is always broadcasted.
executorlib/standalone/hdf.py (5)
2-2
: Expanded usage of type hints.
Importing Any
, Optional
, and Tuple
aligns well with the updated function signatures below.
18-18
: Allowing file_name
to be optional.
Changing file_name
to Optional[str]
handles cases where the file might not be specified.
26-26
: Defensive check for None value.
Validating that file_name
is not None
before file operations helps avoid runtime errors.
63-63
: Flexible return type for get_output
.
Returning Any
for the second tuple element is appropriate if the output can be any Python object.
97-105
: Optional file name logic is correct.
Falling back to None
when the file name is not provided is a valid design choice.
executorlib/interactive/slurm.py (1)
78-78
: Improved flexibility for cwd
.
Accepting None
for cwd
expands usability in contexts where a working directory isn't strictly required.
executorlib/cache/queue_spawner.py (1)
19-19
: Optional queue ID return type.
Changing the return type to Optional[int]
accommodates scenarios where a job submission may not occur or fails. Confirm that callers handle a None
return.
executorlib/cache/executor.py (4)
2-2
: Good practice importing Callable and Optional explicitly.
This fosters clarity and consistency across the codebase.
46-47
: Update docstring to match new default.
Docstring references execute_in_subprocess
as the default. Ensure it aligns with the actual default in the code (execute_with_pysqa
) unless it is correctly changed later.
84-86
: Allowing None for max_workers and max_cores broadens flexibility.
Make sure the rest of the code handles None
as intended when scheduling tasks.
96-96
: Confirm the constraints on init_function.
The function currently raises an error if init_function
is specified. Ensure that’s the intended design or consider implementing the init functionality instead of raising an error.
executorlib/interactive/flux.py (1)
120-126
: Prevent attribute errors by checking self._future.
This guard is a welcome improvement, ensuring methods aren’t called on None
.
executorlib/base/executor.py (6)
8-8
: Imported types look good.
Type imports from typing
for Callable
, List
, Optional
, and Union
increase clarity and make the code easier to maintain.
30-31
: Flexible typing for _future_queue
and _process
.
Marking these fields as Optional[...]
is a good improvement, allowing for clearer checks before usage.
56-56
: Consistent return type for future_queue
.
Returning Optional[queue.Queue]
matches how _future_queue
is managed and prevents runtime errors.
100-110
: Queue usage verified.
Verifying _future_queue
is not None
before put
is solid and prevents errors.
128-132
: Graceful shutdown.
The checks for None
before canceling or shutting down maintain robust error handling.
155-158
: Returning a qsize
or zero.
Returning 0
if _future_queue
is None
avoids potential AttributeError
and is consistent with the type signature.
executorlib/standalone/inputcheck.py (3)
103-118
: Optional pmi
handling is correct.
The updated checks ensure that invalid or missing pmi
values raise the appropriate errors.
Line range hint 181-200
: Core validation logic.
Using cores_per_worker=1
by default is sensible. The branching structure for max_cores
and max_workers
looks robust.
203-203
: File existence checks are thorough.
Raising ValueError
if file_name
is None
or missing on disk is a clear and concise approach.
executorlib/standalone/interactive/communication.py (3)
178-187
: Allowing None
sockets for interface_send
.
Ensures safe handling if the socket is missing. This prevents potential AttributeError
calls on socket
.
190-200
: Safe fallback for interface_receive
.
Returning an empty dict when socket
is None
avoids runtime errors and is gracefully handled.
203-213
: Robust interface_shutdown
.
The checks prevent calls on None
objects. This gracefully accommodates partial or no socket/context usage.
tests/test_integration_pyiron_workflow.py (3)
12-12
: Switch to Callable
import is consistent with type-hinting conventions.
25-25
: Constructor type annotation.
Improving from callable
to Callable
is a good practice for explicit method signatures.
48-48
: Decorator signature refined.
The updated as_dynamic_foo(fnc: Callable)
approach clarifies usage and aligns with the rest of the codebase.
executorlib/cache/shared.py (3)
6-6
: Good use of explicit imports for better type clarity.
No issues found.
52-54
: Switch from callable
to Callable
is a best practice.
This enhances static type checking and clarity.
68-69
: Docstring updates reflect the new type annotations accurately.
Everything looks consistent.
executorlib/interactive/executor.py (11)
92-92
: # type: ignore
may mask useful warnings.
Use it sparingly and justify its presence.
103-103
: Well-documented function parameter.
No issues found.
149-149
: # type: ignore
on exit call.
Please verify if ignoring type checks here is truly needed.
176-176
: Switch from Optional[callable]
to Optional[Callable]
is correct.
No issues found.
221-221
: Clarity enhancement: cores_per_worker
assignment.
No further issues.
225-227
: Good checks for oversubscribe flags.
Logic is correct.
229-229
: Retrieving SLURM command args with a default.
No issues found.
250-250
: Ensuring threads_per_core
has default.
Looks good.
279-279
: Similar logic for SLURM.
Matches existing pattern.
297-297
: Retrieving GPU data with default zero.
Implementation is correct.
299-299
: Consistent approach for slurm_cmd_args
.
No issues here.
executorlib/__init__.py (4)
1-1
: Imports for Callable
and Optional
.
This is consistent with updated type usage.
19-19
: Explicitly defining __all__
type.
Improves clarity for exported symbols.
103-103
: Parameter type updated to Optional[Callable]
.
Enhances type correctness.
126-126
: Same improvement in __new__
.
Stays consistent.
executorlib/interactive/shared.py (20)
8-8
: Comprehensive type imports.
No issues found.
34-34
: Accurate docstring for fn
parameter.
Matches the function signature.
53-57
: Creating and passing a new Future
.
Implementation looks correct.
74-83
: Enhanced shutdown logic with queue checks.
Ensures _future_queue
is valid. Properly handles wait conditions.
87-87
: Type for _set_process
method parameter.
Clear usage of List[RaisingThread]
.
95-96
: Starting all processes in a loop.
Good approach. No issues.
137-137
: spawner: type[BaseSpawner] = MpiExecSpawner
Type annotation helps clarity.
190-190
: Same spawner annotation in InteractiveStepExecutor
.
Consistency maintained.
208-208
: Matches usage in execute_parallel_tasks
.
No issues found.
228-228
: Slight docstring refinement describing init_function
.
Looks consistent.
265-265
: spawner
annotation repeated for separate tasks.
Remains consistent.
303-303
: Local queue creation for tasks.
No issues found.
461-461
: Helper function _update_futures_in_input
improved.
Signature is well-defined.
473-473
: get_result
handles both single Future and list of Futures.
No issues found.
481-481
: Inline comprehension is concise and clear.
Implementation looks good.
492-492
: Docstring clarifies the expected task dictionary shape.
Enhances maintainability.
518-518
: Same approach with typed spawner.
Ensures consistent usage.
528-528
: Docstring describing dictionary structure is helpful.
No issues found.
590-590
: Docstring details dictionary shape for _execute_task
.
Keeps usage clear.
618-618
: Docstring clarifies caching approach.
Implementation and usage are correct.
@@ -29,13 +29,13 @@ def cloudpickle_register(ind: int = 2): | |||
|
|||
|
|||
def serialize_funct_h5( | |||
fn: callable, fn_args: list = [], fn_kwargs: dict = {}, resource_dict: dict = {} | |||
fn: Callable, fn_args: list = [], fn_kwargs: dict = {}, resource_dict: dict = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid mutable default arguments.
Using mutable types (list
or dict
) in function defaults can lead to subtle bugs due to shared references. Consider initializing them inside the function:
-def serialize_funct_h5(
- fn: Callable, fn_args: list = [], fn_kwargs: dict = {}, resource_dict: dict = {}
-) -> Tuple[str, dict]:
+from typing import Callable, Tuple, Optional, Dict, List
+def serialize_funct_h5(
+ fn: Callable,
+ fn_args: Optional[List] = None,
+ fn_kwargs: Optional[Dict] = None,
+ resource_dict: Optional[Dict] = None,
+) -> Tuple[str, dict]:
+ if fn_args is None:
+ fn_args = []
+ if fn_kwargs is None:
+ fn_kwargs = {}
+ if resource_dict is None:
+ resource_dict = {}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
fn: Callable, fn_args: list = [], fn_kwargs: dict = {}, resource_dict: dict = {} | |
from typing import Callable, Tuple, Optional, Dict, List | |
def serialize_funct_h5( | |
fn: Callable, | |
fn_args: Optional[List] = None, | |
fn_kwargs: Optional[Dict] = None, | |
resource_dict: Optional[Dict] = None, | |
) -> Tuple[str, dict]: | |
if fn_args is None: | |
fn_args = [] | |
if fn_kwargs is None: | |
fn_kwargs = {} | |
if resource_dict is None: | |
resource_dict = {} |
🧰 Tools
🪛 Ruff (0.8.2)
32-32: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
32-32: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
32-32: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
@@ -62,7 +62,7 @@ def future_queue(self) -> queue.Queue: | |||
""" | |||
return self._future_queue | |||
|
|||
def submit(self, fn: callable, *args, resource_dict: dict = {}, **kwargs) -> Future: | |||
def submit(self, fn: Callable, *args, resource_dict: dict = {}, **kwargs) -> Future: # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid mutable defaults for resource_dict
.
It’s a best practice to avoid using mutable objects (dict
, list
) as default parameters.
-def submit(self, fn: Callable, *args, resource_dict: dict = {}, **kwargs) -> Future:
+def submit(self, fn: Callable, *args, resource_dict: Optional[dict] = None, **kwargs) -> Future:
...
+ if resource_dict is None:
+ resource_dict = {}
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.8.2)
65-65: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
@@ -166,14 +166,14 @@ def create_executor( | |||
backend: str = "local", | |||
max_cores: Optional[int] = None, | |||
cache_directory: Optional[str] = None, | |||
resource_dict: Optional[dict] = None, | |||
resource_dict: dict = {}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid using mutable default argument for resource_dict
.
This can lead to subtle bugs. Consider:
- def create_executor(..., resource_dict: dict = {}, ...):
+ def create_executor(..., resource_dict: Optional[dict] = None, ...):
if resource_dict is None:
resource_dict = {}
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.8.2)
169-169: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
@@ -23,15 +23,15 @@ | |||
|
|||
|
|||
class ExecutorBroker(ExecutorBase): | |||
def submit(self, fn: callable, *args, resource_dict: dict = {}, **kwargs) -> Future: | |||
def submit(self, fn: Callable, *args, resource_dict: dict = {}, **kwargs) -> Future: # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid mutable argument default (resource_dict: dict = {}
).
This can cause unexpected behavior when reused. Consider:
- def submit(self, fn: Callable, *args, resource_dict: dict = {}, **kwargs) -> Future:
+ def submit(self, fn: Callable, *args, resource_dict: Optional[dict] = None, **kwargs) -> Future:
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.8.2)
26-26: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
python-version: "3.13" | ||
architecture: x64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using a stable Python version
Python 3.13 is currently in development/alpha and not recommended for CI environments. Consider using the latest stable version (3.12) instead.
- python-version: "3.13"
+ python-version: "3.12"
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
python-version: "3.13" | |
architecture: x64 | |
python-version: "3.12" | |
architecture: x64 |
- name: Checkout | ||
uses: actions/checkout@v4 | ||
- name: Install mypy | ||
run: pip install mypy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Pin MyPy version for consistent CI results
Installing MyPy without version pinning could lead to inconsistent results as new versions are released.
- run: pip install mypy
+ run: pip install 'mypy==1.8.0'
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
run: pip install mypy | |
run: pip install 'mypy==1.8.0' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
executorlib/base/executor.py (1)
65-65
:⚠️ Potential issueFix mutable default argument in submit method
While the type annotations and null checks are good improvements, the mutable default argument issue remains.
-def submit(self, fn: Callable, *args, resource_dict: dict = {}, **kwargs) -> Future: +def submit(self, fn: Callable, *args, resource_dict: Optional[dict] = None, **kwargs) -> Future: ... + if resource_dict is None: + resource_dict = {}Also applies to: 100-110
🧰 Tools
🪛 Ruff (0.8.2)
65-65: Do not use mutable data structures for argument defaults
Replace with
None
; initialize within function(B006)
🧹 Nitpick comments (2)
executorlib/backend/cache_parallel.py (1)
Line range hint
28-32
: Document reason for type ignoreThe type ignore comment is valid as MPI libraries often lack proper type hints. However, it would be helpful to document why this type ignore is necessary.
Consider adding a brief comment explaining the type ignore:
- MPI.pickle.__init__( # type: ignore + # type: ignore - MPI.pickle lacks proper type hints but this usage is safe + MPI.pickle.__init__(executorlib/base/executor.py (1)
42-43
: Optimize dictionary key checksWhile the change to
get_kwargs()
improves encapsulation, the dictionary key checks can be simplified.-if "future_queue" in meta_data_dict.keys(): +if "future_queue" in meta_data_dict:Also applies to: 48-49
🧰 Tools
🪛 Ruff (0.8.2)
43-43: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
executorlib/backend/cache_parallel.py
(3 hunks)executorlib/base/executor.py
(7 hunks)executorlib/standalone/hdf.py
(4 hunks)executorlib/standalone/inputcheck.py
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- executorlib/standalone/inputcheck.py
🧰 Additional context used
🪛 Ruff (0.8.2)
executorlib/backend/cache_parallel.py
4-4: typing.Any
imported but unused
Remove unused import: typing.Any
(F401)
executorlib/base/executor.py
43-43: Use key in dict
instead of key in dict.keys()
Remove .keys()
(SIM118)
49-49: Use key in dict
instead of key in dict.keys()
Remove .keys()
(SIM118)
65-65: Do not use mutable data structures for argument defaults
Replace with None
; initialize within function
(B006)
executorlib/standalone/hdf.py
29-29: Use key in dict
instead of key in dict.keys()
Remove .keys()
(SIM118)
🔇 Additional comments (9)
executorlib/backend/cache_parallel.py (1)
38-42
: LGTM! Type-safe initialization of apply_dict
The initialization of apply_dict
as an empty dictionary before the conditional improves type safety by ensuring consistent types across all code paths. This is particularly important for the MPI broadcast operation.
Let's verify that this change doesn't affect MPI behavior:
✅ Verification successful
Initialization of apply_dict as empty dictionary is safe for MPI broadcast
The codebase search reveals consistent patterns of MPI broadcast operations with dictionaries:
- In
interactive_parallel.py
, a similar pattern exists where a dictionary is broadcast and then accessed - Both cases show the broadcast dictionary being immediately used to access keys/values
- The empty dictionary initialization won't affect the broadcast operation since:
- Only rank 0 populates the actual data
- Other ranks receive the populated dictionary through broadcast
- The empty dict is overwritten by the broadcast operation
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for similar MPI broadcast patterns in the codebase to confirm this is a safe change
rg -A 5 "MPI\.COMM_WORLD\.bcast.*dict.*" --type py
Length of output: 1068
executorlib/standalone/hdf.py (3)
2-2
: LGTM: Type imports are complete and well-organized
The added type imports properly support the type annotations introduced in this PR.
63-63
: LGTM: Improved type precision
The return type change from object
to Any
better expresses the function's capability to return any type as the second tuple element.
97-102
: LGTM: Robust null handling implementation
The changes properly handle the optional input parameter and maintain type safety throughout the function.
Let's verify the callers can handle None return values:
✅ Verification successful
Let me check the handling of the return value in the queue_spawner.py file.
Null handling is properly implemented across the codebase
The verification shows that the callers handle the None return value correctly:
- In
queue_spawner.py
, the code explicitly checksif queue_id is None
and handles this case appropriately - The test cases in
test_cache_hdf.py
verify the positive case (returning 123)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for get_queue_id calls to ensure proper null handling
rg "get_queue_id\(" -A 2
Length of output: 990
Script:
#!/bin/bash
# Get more context around the queue_spawner usage
rg "queue_id = get_queue_id" -B 3 -A 10 executorlib/cache/queue_spawner.py
Length of output: 602
executorlib/base/executor.py (5)
8-8
: LGTM: Appropriate typing imports added
The new imports are correctly specified and necessary for the type hints being added throughout the file.
56-56
: LGTM: Return type accurately reflects nullable state
The Optional type annotation correctly indicates that the queue might be None.
128-132
: LGTM: Improved null safety and type checking
The added null checks for _future_queue
and type check for _process
improve the robustness of the shutdown process.
155-158
: LGTM: Proper null handling in length calculation
The method now safely handles the case where _future_queue
is None by returning 0.
30-31
: LGTM: Type annotations accurately reflect nullable states and usage patterns
The type annotations have been improved to:
- Handle nullable states with Optional
- Support both single and multiple RaisingThread instances
Let's verify the usage patterns of _process
:
✅ Verification successful
Let me gather more information to verify the usage patterns of _process
and its type consistency.
Type annotation for _process
correctly reflects its usage patterns
The verification confirms that the type annotation Optional[Union[RaisingThread, List[RaisingThread]]]
accurately matches the implementation:
- In
executorlib/base/executor.py
,_process
is used both as a singleRaisingThread
and as a list ofRaisingThread
instances:- Single thread usage:
self._process.start()
- List usage:
isinstance(self._process, list)
with iteration overself._process
- Single thread usage:
- Null state is handled: Multiple
self._process = None
assignments - Type-specific operations are properly guarded:
- List operations are protected by
isinstance(self._process, list)
- Single thread operations are protected by
isinstance(self._process, RaisingThread)
- List operations are protected by
Note: The type annotations in other files (like spawner.py
and communication.py
) are different as they belong to different classes with different purposes.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Verify that _process is used consistently with its type annotation
ast-grep --pattern 'self._process = $_'
Length of output: 978
Script:
#!/bin/bash
# Check the class definition and inheritance for RaisingThread usage
ast-grep --pattern 'class $_ {
$$$
_process: $_
$$$
}'
# Check assignments of list type to _process
ast-grep --pattern 'self._process = [$$$]'
# Get more context around the process assignments
rg "self._process" -B 2 -A 2
Length of output: 6771
@@ -1,6 +1,7 @@ | |||
import pickle | |||
import sys | |||
import time | |||
from typing import Any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unused import
The Any
type from typing
module is imported but never used in the code.
Apply this diff to remove the unused import:
-from typing import Any
🧰 Tools
🪛 Ruff (0.8.2)
4-4: typing.Any
imported but unused
Remove unused import: typing.Any
(F401)
def dump(file_name: Optional[str], data_dict: dict) -> None: | ||
""" | ||
Dump data dictionary into HDF5 file | ||
|
||
Args: | ||
file_name (str): file name of the HDF5 file as absolute path | ||
data_dict (dict): dictionary containing the python function to be executed {"fn": ..., "args": (), "kwargs": {}} | ||
""" | ||
with h5py.File(file_name, "a") as fname: | ||
for data_key, data_value in data_dict.items(): | ||
if data_key in group_dict.keys(): | ||
fname.create_dataset( | ||
name="/" + group_dict[data_key], | ||
data=np.void(cloudpickle.dumps(data_value)), | ||
) | ||
if file_name is not None: | ||
with h5py.File(file_name, "a") as fname: | ||
for data_key, data_value in data_dict.items(): | ||
if data_key in group_dict.keys(): | ||
fname.create_dataset( | ||
name="/" + group_dict[data_key], | ||
data=np.void(cloudpickle.dumps(data_value)), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Making file_name
optional in dump()
needs coordinated changes
The verification reveals an inconsistency in the codebase:
- All current callers of
dump()
provide afile_name
explicitly using keyword arguments - The paired
load()
function still expectsfile_name: str
(non-optional) - Making
dump()
acceptNone
without similar changes toload()
creates an asymmetric API
This change could lead to confusion where:
- Data could be silently not dumped when
None
is passed - The corresponding
load()
call would fail as it doesn't handleNone
Recommendations:
- Either keep
file_name
as requiredstr
in both functions - Or make both
dump()
andload()
handleOptional[str]
consistently - Document the behavior when
None
is passed to help users understand the implications
🔗 Analysis chain
Type safety improvement and minor optimization needed
The type annotation change to Optional[str]
is well-implemented with proper null checking. However, there's a minor optimization opportunity in the dictionary key check.
Apply this optimization:
- if data_key in group_dict.keys():
+ if data_key in group_dict:
Let's verify the impact of making file_name
optional:
Let me verify the usage of the load
function since it might be affected by the optional file_name
parameter change.
Let me try a different approach to check the load function implementation.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for direct calls to dump() to ensure callers can handle None
rg -l "dump\(" | xargs rg "dump\([^,]+" -A 1
Length of output: 1367
Script:
#!/bin/bash
# Check the load function implementation and its usage
ast-grep --pattern 'def load($$$)'
Length of output: 36
Script:
#!/bin/bash
# Search for load function implementation and its usage
rg "def load" -A 5
Length of output: 362
🧰 Tools
🪛 Ruff (0.8.2)
29-29: Use key in dict
instead of key in dict.keys()
Remove .keys()
(SIM118)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/test_shared_communication.py (1)
89-90
: Use a more explicit assertion for clarity.Instead of checking only the length, you could verify that the returned object is indeed an empty dictionary. This approach prevents ambiguity if the function's return type changes in the future.
Consider applying this change:
- self.assertEqual(len(interface_receive(socket=None)), 0) + self.assertDictEqual(interface_receive(socket=None), {})
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
executorlib/standalone/inputcheck.py
(2 hunks)tests/test_shared_communication.py
(1 hunks)tests/test_shared_input_check.py
(0 hunks)
💤 Files with no reviewable changes (1)
- tests/test_shared_input_check.py
🚧 Files skipped from review as they are similar to previous changes (1)
- executorlib/standalone/inputcheck.py
Summary by CodeRabbit
New Features
RaisingThread
class to access keyword arguments.None
socket.Bug Fixes
None
values.Documentation
Refactor
Optional
andCallable
for enhanced type safety.