Delete SubprocessExecutor and ShellExecutor #440

jan-janssen · 2024-10-27T10:53:57Z

The aim is to unify the interface. The unit tests were updated to demonstrate how both removed executors can be realized with the existing general Executor class.

Summary by CodeRabbit

Release Notes

New Features
- Introduced a more flexible Executor class for task execution.
- Added helper functions for command submission and subprocess interaction.
Bug Fixes
- Improved error handling in task execution and parameter validation.
Documentation
- Updated test cases to reflect changes in executor usage and interaction methods.
Chores
- Removed deprecated executors and associated files to streamline the public API.


coderabbitai · 2024-10-27T10:54:04Z

Walkthrough

The changes in this pull request involve the removal of the SubprocessExecutor and ShellExecutor from the executorlib module, leading to significant modifications in the public API. The __all__ variable in executorlib/__init__.py is now empty, indicating that these executors are no longer exported. The code also introduces a new method for creating executors with enhanced error handling based on dependency management. Additionally, corresponding test files have been updated to reflect these changes, replacing references to the removed executors with a more generic Executor class.

Changes

File	Change Summary
executorlib/__init__.py	- Removed `SubprocessExecutor` and `ShellExecutor` from `__all__`. - Added `create_executor(...)` method.
executorlib/shell/__init__.py	- Deleted file containing imports and `__all__` declaration for `SubprocessExecutor` and `ShellExecutor`.
executorlib/shell/executor.py	- Removed `SubprocessExecutor` class and its methods. - Deleted `execute_single_task` function.
executorlib/shell/interactive.py	- Removed `ShellExecutor` class and its methods.
tests/test_shell_executor.py	- Replaced `SubprocessExecutor` with `Executor`. - Added `submit_shell_command` function.
tests/test_shell_interactive.py	- Replaced `ShellExecutor` with `Executor`. - Added `init_process`, `interact`, and `shutdown` functions.

Possibly related issues

[documentation] Remove ShellExecutor and SubprocessExecutor from documentation #441: This PR addresses the removal of ShellExecutor and SubprocessExecutor from the codebase, aligning with the issue's request to update documentation accordingly.

🐰 In the meadow, we hop and play,
The old executors have gone away.
With new paths to explore, we cheer,
A fresh start brings us near!
Let’s create and execute with glee,
In our code, we’re wild and free! 🌼


coderabbitai

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (3)

executorlib/__init__.py (2)
Line range hint 24-24: Add return type hints to class methods.

The __new__ method should specify its return type for better type safety and IDE support.
    def __new__(
        cls,
        max_workers: int = 1,
-       backend: str = "local",
+       backend: str = "local",
    ) -> "ExecutorWithDependencies | Any":
Also applies to: 142-142

Line range hint 142-207: Consider optimizing parameter validation.

The __new__ method performs parameter validation only when disable_dependencies=True. Consider moving common parameter validations (like max_workers, cores_per_worker, etc.) before the conditional branch to ensure consistent validation regardless of the dependency setting.

Example refactor:
    def __new__(...):
+       # Common parameter validation
+       _check_max_workers(max_workers)
+       _check_cores(max_cores, cores_per_worker)
+       
        if not disable_dependencies:
            return ExecutorWithDependencies(...)
        else:
            _check_plot_dependency_graph(plot_dependency_graph=plot_dependency_graph)
            _check_refresh_rate(refresh_rate=refresh_rate)
            return create_executor(...)
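
A minimal sketch of the refactor suggested above, assuming hypothetical stand-in classes (`_PlainExecutor`, `_ExecutorWithDependencies`) and a hypothetical `_check_max_workers` validator. None of these are taken from executorlib; they only illustrate running shared validation before the branch in `__new__`:

```python
class _PlainExecutor:
    """Hypothetical stand-in for the executor returned by create_executor(...)."""

    def __init__(self, max_workers: int):
        self.max_workers = max_workers


class _ExecutorWithDependencies(_PlainExecutor):
    """Hypothetical stand-in for the dependency-aware executor."""


def _check_max_workers(max_workers: int) -> None:
    # Hypothetical validator; the real checks in executorlib may differ.
    if not isinstance(max_workers, int) or max_workers < 1:
        raise ValueError(f"max_workers must be a positive integer, got {max_workers!r}")


class Executor:
    def __new__(cls, max_workers: int = 1, disable_dependencies: bool = False):
        # Common validation runs first, so both branches reject bad input alike.
        _check_max_workers(max_workers)
        if not disable_dependencies:
            return _ExecutorWithDependencies(max_workers)
        return _PlainExecutor(max_workers)
```

With this layout, `Executor(max_workers=0)` raises the same `ValueError` whether or not dependencies are disabled.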
tests/test_shell_interactive.py (1)
21-21: Consider using text=True instead of universal_newlines=True in subprocess.Popen

In Python 3.7 and above, text=True is preferred over universal_newlines=True for improved clarity.

Apply this diff:
             stdout=subprocess.PIPE,
-            universal_newlines=True,
+            text=True,
             shell=False,
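
To illustrate the nitpick above: `text=True` and `universal_newlines=True` select the same text mode in the standard library, while leaving both out yields bytes. The sketch below uses `sys.executable` as the command so it does not depend on a particular shell; that choice is an assumption for portability, not something taken from the PR:

```python
import subprocess
import sys

# A small child command that prints one line.
cmd = [sys.executable, "-c", "print('hello')"]

# Preferred spelling since Python 3.7.
as_text = subprocess.check_output(cmd, text=True)

# Older, equivalent spelling that the test currently uses.
as_text_legacy = subprocess.check_output(cmd, universal_newlines=True)

# Without either flag the output is returned as bytes.
as_bytes = subprocess.check_output(cmd)
```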

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 8ce73ec and 4e2e033.

📒 Files selected for processing (6)

executorlib/__init__.py (1 hunks)
executorlib/shell/__init__.py (0 hunks)
executorlib/shell/executor.py (0 hunks)
executorlib/shell/interactive.py (0 hunks)
tests/test_shell_executor.py (2 hunks)
tests/test_shell_interactive.py (1 hunks)

💤 Files with no reviewable changes (3)

executorlib/shell/__init__.py
executorlib/shell/executor.py
executorlib/shell/interactive.py

🧰 Additional context used

🪛 Ruff

tests/test_shell_interactive.py

36-39: Combine if branches using logical or operator

Combine if branches

(SIM114)

🔇 Additional comments (8)

tests/test_shell_executor.py (8)

6-8: Import statements updated correctly

The imports have been updated to reflect the removal of SubprocessExecutor and ShellExecutor. The necessary classes and functions are now imported from executorlib and its submodules.

25-31: Test updated to use the generic Executor

The test_execute_single_task function has been updated to use the general Executor class and the new submit_shell_command function. This aligns with the PR objective to unify executor interfaces.

46-54: Correct exception handling for invalid keyword arguments

In test_wrong_error, the test now expects a TypeError when incorrect keyword arguments are provided. This is appropriate, as passing unexpected keyword arguments should raise a TypeError.

68-71: Proper testing of non-existent executables

The test_broken_executable function correctly tests the scenario where an invalid executable path is provided, and it expects a FileNotFoundError. This ensures that the code handles such errors gracefully.

83-89: Consistent use of Executor with shell command submission

In test_shell_static_executor_args, the test appropriately uses the Executor class with the submit_shell_command function. The parameters passed are consistent with the function definition.

95-101: Verification of binary output when universal_newlines=False

The test_shell_static_executor_binary test checks that when universal_newlines is set to False, the output is in bytes. This correctly verifies the behavior of the submit_shell_command function under different settings.

107-110: Inconsistent type usage with command parameter

In test_shell_static_executor_shell, a string is passed to submit_shell_command with shell=True, but the function expects command to be a list. This inconsistency can be resolved by updating the type annotation as previously suggested.

116-128: Efficient parallel task execution using Executor

The test_shell_executor function demonstrates the execution of multiple shell commands in parallel using the Executor class. The implementation is correct and effectively tests the executor's ability to handle concurrent tasks.

coderabbitai · 2024-10-27T11:03:23Z

executorlib/__init__.py

-    SubprocessExecutor,
-    ShellExecutor,
-]
+__all__ = []


💡 Codebase verification

Based on the search results, I can now generate the final response:

Breaking change in public API requires documentation updates

The empty __all__ list removes SubprocessExecutor and ShellExecutor from the public API. The search results show these classes are extensively documented in docs/development.md with examples and usage patterns, but this documentation is now outdated.

Update docs/development.md to reflect the removal of SubprocessExecutor and ShellExecutor

Add migration guide explaining how to transition from the removed executors to the new unified Executor class

Consider adding a deprecation notice in the documentation
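
As a starting point for such a migration guide, here is a hedged sketch of the pattern the updated tests use: a plain helper function wraps `subprocess.check_output` and is submitted to a general executor. `concurrent.futures.ThreadPoolExecutor` is used here only as a stand-in so the example runs without executorlib installed; the tests in this PR use `executorlib.Executor(max_workers=2)` and call `exe.submit(...)` in the same way. `run_commands` is a hypothetical helper added for illustration:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor  # stand-in for executorlib.Executor


def submit_shell_command(command, universal_newlines=True, shell=False):
    # Same helper as in tests/test_shell_executor.py.
    return subprocess.check_output(
        command, universal_newlines=universal_newlines, shell=shell
    )


def run_commands(commands, max_workers=2):
    # Hypothetical helper: submit every command, then collect results in order.
    with ThreadPoolExecutor(max_workers=max_workers) as exe:
        futures = [
            exe.submit(submit_shell_command, command, universal_newlines=True)
            for command in commands
        ]
        return [future.result() for future in futures]


commands = [[sys.executable, "-c", f"print('test_{i}')"] for i in range(1, 5)]
results = run_commands(commands)
```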

🔗 Analysis chain

Document breaking changes in public API.

The empty __all__ list removes previously exported classes (SubprocessExecutor and ShellExecutor) from the public API. This is a breaking change that could affect existing code.

Let's verify the impact:

Consider:

Adding migration guide in documentation

Providing deprecation warnings in previous version

Updating version number to reflect breaking change (major version bump)

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for usage of removed executors in the codebase
echo "Searching for SubprocessExecutor usage..."
rg "SubprocessExecutor"
echo "Searching for ShellExecutor usage..."
rg "ShellExecutor"

Length of output: 2282

coderabbitai · 2024-10-27T11:03:23Z

tests/test_shell_interactive.py

+        if stop_read_pattern is not None and stop_read_pattern in output_current:
+            break
+        elif lines_to_read is not None and lines_to_read == lines_count:
+            break


🛠️ Refactor suggestion

Simplify conditional statements by combining 'if' branches using logical or operator

Combining the if and elif conditions into a single if statement enhances readability and reduces code complexity.

Apply this diff to combine the conditions:

 while True:
     output_current = process.stdout.readline()
     output += output_current
     lines_count += 1
-    if stop_read_pattern is not None and stop_read_pattern in output_current:
-        break
-    elif lines_to_read is not None and lines_to_read == lines_count:
+    if (stop_read_pattern is not None and stop_read_pattern in output_current) or \
+       (lines_to_read is not None and lines_to_read == lines_count):
         break


🧰 Tools

🪛 Ruff

36-39: Combine if branches using logical or operator

Combine if branches

(SIM114)
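
A runnable sketch of the combined condition flagged by SIM114. The function name `read_until` and the use of `io.StringIO` in place of `process.stdout` are assumptions for illustration, and the end-of-stream guard is an addition that is not part of the original loop:

```python
import io


def read_until(stream, lines_to_read=None, stop_read_pattern=None):
    """Read lines until the pattern appears or enough lines have been read."""
    output = ""
    lines_count = 0
    while True:
        output_current = stream.readline()
        if not output_current:
            # End of stream; this guard is not in the original test helper.
            break
        output += output_current
        lines_count += 1
        # Both stop conditions share one branch, as SIM114 suggests.
        if (stop_read_pattern is not None and stop_read_pattern in output_current) or (
            lines_to_read is not None and lines_to_read == lines_count
        ):
            break
    return output
```

Wrapping the condition in parentheses avoids the backslash continuation used in the suggested diff.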

coderabbitai · 2024-10-27T11:03:23Z

tests/test_shell_executor.py

+def submit_shell_command(
+    command: list, universal_newlines: bool = True, shell: bool = False
+):
+    return subprocess.check_output(
+        command, universal_newlines=universal_newlines, shell=shell
+    )


⚠️ Potential issue

Update type annotation for command parameter

The submit_shell_command function currently annotates the command parameter as a list, but in some test cases, a string is passed when shell=True. To accommodate both strings and lists as valid input types for command, consider updating the type annotation.

Apply this diff to modify the type annotation:

+from typing import Union, List
+
-def submit_shell_command(
-    command: list, universal_newlines: bool = True, shell: bool = False
+def submit_shell_command(
+    command: Union[List[str], str], universal_newlines: bool = True, shell: bool = False
 ):
     return subprocess.check_output(
         command, universal_newlines=universal_newlines, shell=shell
     )

This change ensures that the function accepts both list and str types for the command parameter, aligning with how subprocess.check_output handles input when shell=True.
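
A short sketch showing both call styles against the widened annotation. The list form uses `sys.executable` for portability, and the string form assumes a shell that provides `echo`; both commands are illustrative and not taken from the test file:

```python
import subprocess
import sys
from typing import List, Union


def submit_shell_command(
    command: Union[List[str], str],
    universal_newlines: bool = True,
    shell: bool = False,
):
    return subprocess.check_output(
        command, universal_newlines=universal_newlines, shell=shell
    )


# List form: arguments are passed to the program directly, no shell involved.
list_output = submit_shell_command([sys.executable, "-c", "print('test')"])

# String form: the whole command line is handed to the shell.
string_output = submit_shell_command("echo test", shell=True)
```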


coderabbitai

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (2)

tests/test_shell_executor.py (2)
11-16: Consider adding error handling to the helper function.

The submit_shell_command helper function could benefit from basic error handling to provide more meaningful error messages when command execution fails.
 def submit_shell_command(
     command: list, universal_newlines: bool = True, shell: bool = False
 ):
+    try:
         return subprocess.check_output(
             command, universal_newlines=universal_newlines, shell=shell
         )
+    except subprocess.CalledProcessError as e:
+        raise RuntimeError(f"Command failed with exit code {e.returncode}: {e.output}")
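
A runnable version of the suggestion above, with one hedged addition: `from e` keeps the original `CalledProcessError` attached as the cause. Note that a missing executable raises `FileNotFoundError`, which this wrapper deliberately leaves untouched, so a test such as `test_broken_executable` would still see it:

```python
import subprocess
import sys


def submit_shell_command(command, universal_newlines=True, shell=False):
    try:
        return subprocess.check_output(
            command, universal_newlines=universal_newlines, shell=shell
        )
    except subprocess.CalledProcessError as e:
        # Only non-zero exit codes are translated; other errors propagate as-is.
        raise RuntimeError(
            f"Command failed with exit code {e.returncode}: {e.output}"
        ) from e


# Illustrative failing command: a child process that exits with status 3.
failing_command = [sys.executable, "-c", "import sys; sys.exit(3)"]
```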
Line range hint 1-146: Well-structured transition to unified executor interface.

The test suite successfully demonstrates how the functionality previously provided by SubprocessExecutor can be achieved using the generic Executor class. The changes maintain test coverage while simplifying the overall architecture.

A few suggestions to consider:

Add docstrings to document the new execution patterns

Consider creating shared test utilities for common executor setup code

Add test cases for concurrent execution with different worker counts

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 4e2e033 and 6a17373.

📒 Files selected for processing (1)

tests/test_shell_executor.py (2 hunks)

🔇 Additional comments (3)

tests/test_shell_executor.py (3)
6-8: LGTM: Import changes align with PR objectives.

The transition from specific executor imports to the generic Executor and related utilities aligns well with the PR's goal of unifying the executor interface.

55-61: LGTM: Improved error handling.

The change from KeyError to TypeError for invalid kwargs is more appropriate as it better reflects the nature of the error (type mismatch rather than missing key).

34-39: Verify parallel execution configuration.

The transition to execute_parallel_tasks introduces new parameters. Ensure that cores=1 with openmpi_oversubscribe=False provides equivalent behavior to the previous sequential execution.

✅ Verification successful

The parallel execution configuration is correctly implemented.

The code review concern can be resolved. The execute_parallel_tasks function is consistently used across test files with cores=1 and openmpi_oversubscribe=False for sequential execution testing. This configuration ensures:

Single core execution (cores=1) which maintains sequential behavior

No oversubscription, preventing resource conflicts

Consistent usage pattern across all test files (test_shell_executor.py, test_shell_interactive.py, test_local_executor.py)

The implementation in the executor.py shows this is the intended way to run sequential tasks through the parallel execution framework.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Search for other test files using similar configuration to verify consistency
rg -l 'execute_parallel_tasks.*cores.*openmpi_oversubscribe'
Length of output: 62

Script:
#!/bin/bash
# Let's search for all occurrences of execute_parallel_tasks to understand its usage
rg -A 5 "execute_parallel_tasks"

# Also search for the function definition to understand expected parameters
ast-grep --pattern 'def execute_parallel_tasks($$$)'
Length of output: 7108

Script:
#!/bin/bash
# Let's look at the function definition and its parameters in more detail
cat executorlib/shared/executor.py
Length of output: 28333

coderabbitai · 2024-10-27T12:17:30Z

tests/test_shell_executor.py

+        with Executor(max_workers=2) as exe:
+            cloudpickle_register(ind=1)
+            f_1 = exe.submit(
+                submit_shell_command, ["echo", "test_1"], universal_newlines=True
+            )
+            f_2 = exe.submit(
+                submit_shell_command, ["echo", "test_2"], universal_newlines=True
+            )
+            f_3 = exe.submit(
+                submit_shell_command, ["echo", "test_3"], universal_newlines=True
+            )
+            f_4 = exe.submit(
+                submit_shell_command, ["echo", "test_4"], universal_newlines=True
+            )


🛠️ Refactor suggestion

Consider parameterizing the parallel execution test.

The parallel execution test could be made more maintainable using parameterized testing. This would make it easier to add more test cases and reduce code duplication.

@pytest.mark.parametrize("test_input,expected", [
    ("test_1", "test_1\n"),
    ("test_2", "test_2\n"),
    ("test_3", "test_3\n"),
    ("test_4", "test_4\n"),
])
def test_shell_executor_parameterized(test_input, expected):
    with Executor(max_workers=2) as exe:
        cloudpickle_register(ind=1)
        future = exe.submit(
            submit_shell_command, ["echo", test_input], universal_newlines=True
        )
        assert not future.done()
        assert future.result() == expected
        assert future.done()
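
Since the existing tests are written with `unittest`, the same idea can be sketched with `subTest` instead of pytest. This is an illustration under assumptions: `ThreadPoolExecutor` stands in for `executorlib.Executor`, the class name `TestShellCommands` is hypothetical, and the `assert not future.done()` check is left out because a fast command may already have finished by the time it is evaluated:

```python
import subprocess
import sys
import unittest
from concurrent.futures import ThreadPoolExecutor  # stand-in for executorlib.Executor


def submit_shell_command(command, universal_newlines=True, shell=False):
    return subprocess.check_output(
        command, universal_newlines=universal_newlines, shell=shell
    )


class TestShellCommands(unittest.TestCase):
    def test_echo_cases(self):
        cases = [
            ("test_1", "test_1\n"),
            ("test_2", "test_2\n"),
            ("test_3", "test_3\n"),
            ("test_4", "test_4\n"),
        ]
        with ThreadPoolExecutor(max_workers=2) as exe:
            for test_input, expected in cases:
                # Each case is reported separately if it fails.
                with self.subTest(test_input=test_input):
                    future = exe.submit(
                        submit_shell_command,
                        [sys.executable, "-c", f"print('{test_input}')"],
                        universal_newlines=True,
                    )
                    self.assertEqual(future.result(), expected)
                    self.assertTrue(future.done())
```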

coderabbitai

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)

tests/test_shell_interactive.py (1)

48-102: Consider adding error case tests.

The test_execute_single_task method effectively tests the happy path, but consider adding tests for error scenarios such as:

Invalid input

Process termination before shutdown

Pattern not found

Zero or negative line count

Would you like me to help generate additional test cases for these scenarios?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 6a17373 and a3c8eac.

📒 Files selected for processing (1)

tests/test_shell_interactive.py (1 hunks)

🧰 Additional context used

🪛 Ruff

tests/test_shell_interactive.py

36-39: Combine if branches using logical or operator

Combine if branches

(SIM114)

🔇 Additional comments (4)

tests/test_shell_interactive.py (4)

1-13: LGTM! Import changes align with PR objectives.

The imports have been correctly updated to reflect the transition from specialized executors to the generic Executor class.

15-24: LGTM! Well-structured process initialization.

The init_process function correctly configures the subprocess with appropriate parameters and returns the expected dictionary format.

43-46: LGTM! Clean shutdown implementation.

The shutdown function properly handles process termination by writing and flushing the shutdown command.

104-127: LGTM! Well-structured executor context usage.

The test demonstrates proper usage of the Executor context manager and appropriate verification of task completion. The shutdown handling ensures clean test cleanup.
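
The three helpers named in the comments above can be sketched as follows. The dictionary return value and the write-and-flush shutdown follow the review's description; the child program, the `"shutdown"` keyword, and the simplified `interact` that only supports `lines_to_read` are assumptions made to keep the example self-contained:

```python
import subprocess
import sys

# Hypothetical child program: echoes each input line until it reads "shutdown".
CHILD_SOURCE = """
import sys
for line in iter(sys.stdin.readline, ""):
    if line.strip() == "shutdown":
        break
    print("echo " + line.strip(), flush=True)
"""


def init_process():
    """Start the child and return it in a dict that can be unpacked as kwargs."""
    return {
        "process": subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            shell=False,
        )
    }


def interact(shell_input, process, lines_to_read=1):
    """Send one input to the child and read a fixed number of reply lines."""
    process.stdin.write(shell_input)
    process.stdin.flush()
    return "".join(process.stdout.readline() for _ in range(lines_to_read))


def shutdown(process):
    """Ask the child to stop, wait for it to exit, then release the pipes."""
    process.stdin.write("shutdown\n")
    process.stdin.flush()
    process.wait()
    process.stdin.close()
    process.stdout.close()
```

A typical session is `state = init_process()`, followed by `interact("hello\n", **state)` and finally `shutdown(**state)`.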

coderabbitai

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)

tests/test_shell_executor.py (1)

85-92: LGTM! Comprehensive coverage of execution modes.

The tests effectively demonstrate how the general Executor class replaces ShellExecutor functionality across different execution modes (args, binary, shell).

Consider adding docstrings to explain the purpose of each test case and what specific ShellExecutor functionality it replaces.

Also applies to: 98-105, 111-115

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between a3c8eac and 905953c.

📒 Files selected for processing (1)

tests/test_shell_executor.py (2 hunks)

🔇 Additional comments (5)

tests/test_shell_executor.py (5)
6-8: LGTM! Import changes align with PR objective.

The changes correctly replace SubprocessExecutor imports with the general Executor class and related utilities, which aligns with the PR's goal of unifying the executor interface.

77-82: LGTM! Error handling test properly adapted.

The test maintains its purpose of verifying FileNotFoundError while properly adapting to the new execution framework.

121-134: LGTM! Effective demonstration of parallel execution.

The test successfully demonstrates how the general Executor handles parallel task execution, properly replacing the parallel execution capabilities of the removed executors.

56-62: Verify error type change alignment with Executor implementation.

The test now expects TypeError instead of KeyError for invalid arguments. Let's verify this aligns with the new Executor implementation.
#!/bin/bash
# Search for error handling in Executor implementation
rg -A 5 "raise TypeError" 
rg -A 5 "raise KeyError"
34-39: Consider using a simpler execution method for single task test.

The test is using execute_parallel_tasks with MPI configuration for a single task execution, which might be overengineered. Consider if a simpler execution method would be more appropriate for this test case.

jan-janssen and others added 2 commits October 27, 2024 11:53

Delete SubprocessExecutor and ShellExecutor

37651e7

The aim is to unify the interface, the unit tests were updated to demonstrate how both can be realized with the existing general Executor class

[pre-commit.ci] auto fixes from pre-commit.com hooks

8203a3d

for more information, see https://pre-commit.ci

jan-janssen mentioned this pull request Oct 27, 2024

[documentation] Remove ShellExecutor and SubprocessExecutor from documentation #441

Closed

jan-janssen added 2 commits October 27, 2024 11:57

fix executable path

befe209

Merge remote-tracking branch 'origin/remove_shell' into remove_shell

4e2e033

coderabbitai bot reviewed Oct 27, 2024

jan-janssen and others added 2 commits October 27, 2024 12:07

execution is too fast

33cc16c

Update test_shell_executor.py

6a17373

coderabbitai bot reviewed Oct 27, 2024

Update test_shell_interactive.py

a3c8eac

coderabbitai bot reviewed Oct 27, 2024

One more

905953c

coderabbitai bot reviewed Oct 27, 2024

jan-janssen merged commit 7273a9f into main Oct 27, 2024
24 checks passed

jan-janssen deleted the remove_shell branch October 27, 2024 13:13

This was referenced Oct 27, 2024

Cache: Terminate processes when closing executor #447

Merged

Executor user interface #458

Closed

coderabbitai bot mentioned this pull request Nov 6, 2024

Set max_cores and max_workers to None #478

Merged

This was referenced Nov 19, 2024

Support threads on local backend #504

Merged

Update readme #506

Merged

This was referenced Dec 20, 2024

Move SLURM to separate module #528

Merged

Fix block allocation with two or more workers hanging on failed function #532

Merged
