Delete SubprocessExecutor and ShellExecutor #440

Merged
merged 8 commits into from
Oct 27, 2024

Conversation

jan-janssen
Member

@jan-janssen jan-janssen commented Oct 27, 2024

The aim is to unify the interface. The unit tests were updated to demonstrate how both removed executors can be realized with the existing general Executor class.

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a more flexible Executor class for task execution.
    • Added helper functions for command submission and subprocess interaction.
  • Bug Fixes

    • Improved error handling in task execution and parameter validation.
  • Documentation

    • Updated test cases to reflect changes in executor usage and interaction methods.
  • Chores

    • Removed deprecated executors and associated files to streamline the public API.

jan-janssen and others added 2 commits October 27, 2024 11:53
The aim is to unify the interface, the unit tests were updated to demonstrate how both can be realized with the existing general Executor class
Contributor

coderabbitai bot commented Oct 27, 2024

Walkthrough

The changes in this pull request involve the removal of the SubprocessExecutor and ShellExecutor from the executorlib module, leading to significant modifications in the public API. The __all__ variable in executorlib/__init__.py is now empty, indicating that these executors are no longer exported. The code also introduces a new method for creating executors with enhanced error handling based on dependency management. Additionally, corresponding test files have been updated to reflect these changes, replacing references to the removed executors with a more generic Executor class.

Changes

File / Change Summary

  • executorlib/__init__.py: Removed SubprocessExecutor and ShellExecutor from __all__. Added create_executor(...) method.
  • executorlib/shell/__init__.py: Deleted file containing imports and __all__ declaration for SubprocessExecutor and ShellExecutor.
  • executorlib/shell/executor.py: Removed SubprocessExecutor class and its methods. Deleted execute_single_task function.
  • executorlib/shell/interactive.py: Removed ShellExecutor class and its methods.
  • tests/test_shell_executor.py: Replaced SubprocessExecutor with Executor. Added submit_shell_command function.
  • tests/test_shell_interactive.py: Replaced ShellExecutor with Executor. Added init_process, interact, and shutdown functions.

Possibly related issues

🐰 In the meadow, we hop and play,
The old executors have gone away.
With new paths to explore, we cheer,
A fresh start brings us near!
Let’s create and execute with glee,
In our code, we’re wild and free! 🌼


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (3)
executorlib/__init__.py (2)

Line range hint 24-24: Add return type hints to class methods.

The __new__ method should specify its return type for better type safety and IDE support.

    def __new__(
        cls,
        max_workers: int = 1,
-       backend: str = "local",
+       backend: str = "local",
    ) -> "ExecutorWithDependencies | Any":

Also applies to: 142-142


Line range hint 142-207: Consider optimizing parameter validation.

The __new__ method performs parameter validation only when disable_dependencies=True. Consider moving common parameter validations (like max_workers, cores_per_worker, etc.) before the conditional branch to ensure consistent validation regardless of the dependency setting.

Example refactor:

    def __new__(...):
+       # Common parameter validation
+       _check_max_workers(max_workers)
+       _check_cores(max_cores, cores_per_worker)
+       
        if not disable_dependencies:
            return ExecutorWithDependencies(...)
        else:
            _check_plot_dependency_graph(plot_dependency_graph=plot_dependency_graph)
            _check_refresh_rate(refresh_rate=refresh_rate)
            return create_executor(...)
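
The refactor above could look roughly like the following sketch. Every name here (`make_executor`, `_check_max_workers`, `_check_refresh_rate`) is a hypothetical stand-in and the string return values replace real executor objects; it only illustrates running shared checks before the branch.

```python
def _check_max_workers(max_workers: int) -> None:
    """Hypothetical shared validator: reject non-positive worker counts."""
    if not isinstance(max_workers, int) or max_workers < 1:
        raise ValueError(f"max_workers must be a positive integer, got {max_workers!r}")


def _check_refresh_rate(refresh_rate: float) -> None:
    """Hypothetical branch-specific validator for an option only one branch supports."""
    if refresh_rate != 0.01:
        raise ValueError("refresh_rate is only used when dependencies are enabled")


def make_executor(
    max_workers: int = 1,
    disable_dependencies: bool = False,
    refresh_rate: float = 0.01,
) -> str:
    # Shared validation runs first, so both branches reject bad input the same way.
    _check_max_workers(max_workers)
    if not disable_dependencies:
        return f"with-dependencies(max_workers={max_workers})"
    # Validation that only matters for this branch stays inside it.
    _check_refresh_rate(refresh_rate)
    return f"plain(max_workers={max_workers})"


def rejects_zero_workers(disable_dependencies: bool) -> bool:
    """Return True if max_workers=0 is rejected for the given branch."""
    try:
        make_executor(max_workers=0, disable_dependencies=disable_dependencies)
    except ValueError:
        return True
    return False
```

With this layout an invalid `max_workers` fails identically whether or not dependencies are disabled.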
tests/test_shell_interactive.py (1)

21-21: Consider using text=True instead of universal_newlines=True in subprocess.Popen

In Python 3.7 and above, text=True is preferred over universal_newlines=True for improved clarity.

Apply this diff:

             stdout=subprocess.PIPE,
-            universal_newlines=True,
+            text=True,
             shell=False,
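
A small sketch of the equivalence, assuming Python 3.7 or newer: `text=True` and `universal_newlines=True` both make `subprocess.Popen` return `str` instead of `bytes`. The child command (a one-line Python program) and the helper name `read_first_line` are illustrative only and do not come from the PR.

```python
import subprocess
import sys


def read_first_line(text_mode_kwargs: dict) -> str:
    """Start a short-lived child process and return the first line it prints."""
    process = subprocess.Popen(
        [sys.executable, "-c", "print('ready')"],
        stdout=subprocess.PIPE,
        shell=False,
        **text_mode_kwargs,
    )
    # The context manager closes the pipe and waits for the child on exit.
    with process:
        return process.stdout.readline()


legacy = read_first_line({"universal_newlines": True})
modern = read_first_line({"text": True})
```

Both calls decode the output the same way, so the rename is purely a readability change.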
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 8ce73ec and 4e2e033.

📒 Files selected for processing (6)
  • executorlib/__init__.py (1 hunks)
  • executorlib/shell/__init__.py (0 hunks)
  • executorlib/shell/executor.py (0 hunks)
  • executorlib/shell/interactive.py (0 hunks)
  • tests/test_shell_executor.py (2 hunks)
  • tests/test_shell_interactive.py (1 hunks)
💤 Files with no reviewable changes (3)
  • executorlib/shell/__init__.py
  • executorlib/shell/executor.py
  • executorlib/shell/interactive.py
🧰 Additional context used
🪛 Ruff
tests/test_shell_interactive.py

36-39: Combine if branches using logical or operator

Combine if branches

(SIM114)

🔇 Additional comments (8)
tests/test_shell_executor.py (8)

6-8: Import statements updated correctly

The imports have been updated to reflect the removal of SubprocessExecutor and ShellExecutor. The necessary classes and functions are now imported from executorlib and its submodules.


25-31: Test updated to use the generic Executor

The test_execute_single_task function has been updated to use the general Executor class and the new submit_shell_command function. This aligns with the PR objective to unify executor interfaces.


46-54: Correct exception handling for invalid keyword arguments

In test_wrong_error, the test now expects a TypeError when incorrect keyword arguments are provided. This is appropriate, as passing unexpected keyword arguments should raise a TypeError.
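
To illustrate why TypeError is the natural exception here, the sketch below submits a function with an unexpected keyword argument. It uses the standard library's `ThreadPoolExecutor` as a stand-in for executorlib's Executor (an assumption, not the PR's code); the function `add` is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def add(a: int, b: int) -> int:
    return a + b


def submit_with_wrong_kwarg() -> str:
    """Submit a call with an unknown keyword and report the exception type."""
    with ThreadPoolExecutor(max_workers=1) as exe:
        # "c" is not a parameter of add, so the call fails inside the worker.
        future = exe.submit(add, a=1, c=2)
        try:
            future.result()  # re-raises the worker's exception in the caller
        except TypeError as error:
            return type(error).__name__
    return "no error"
```

The error is raised by Python's own call machinery, not by a dictionary lookup, which is why a KeyError would be misleading.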


68-71: Proper testing of non-existent executables

The test_broken_executable function correctly tests the scenario where an invalid executable path is provided, and it expects a FileNotFoundError. This ensures that the code handles such errors gracefully.
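
A minimal sketch of the behaviour this test relies on, using only the standard library. The helper name `run_missing_executable` is hypothetical; the path is chosen only because it is very unlikely to exist.

```python
import subprocess


def run_missing_executable() -> bool:
    """Return True if launching a non-existent program raises FileNotFoundError."""
    try:
        # With shell=False the executable is started directly, so a missing
        # file is reported by the operating system before anything runs.
        subprocess.check_output(["/executable/does/not/exist"], shell=False)
    except FileNotFoundError:
        return True
    return False
```

Note that with `shell=True` the shell itself starts successfully and reports the missing command through its exit code, so `check_output` would raise `subprocess.CalledProcessError` instead.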


83-89: Consistent use of Executor with shell command submission

In test_shell_static_executor_args, the test appropriately uses the Executor class with the submit_shell_command function. The parameters passed are consistent with the function definition.


95-101: Verification of binary output when universal_newlines=False

The test_shell_static_executor_binary test checks that when universal_newlines is set to False, the output is in bytes. This correctly verifies the behavior of the submit_shell_command function under different settings.
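
The difference between the two settings can be sketched with the standard library alone. The helper `run_print` and the child command are illustrative assumptions, not part of the PR.

```python
import subprocess
import sys


def run_print(universal_newlines: bool):
    """Run a child that prints 'test' and return its captured output."""
    return subprocess.check_output(
        [sys.executable, "-c", "print('test')"],
        universal_newlines=universal_newlines,
    )


as_text = run_print(True)    # decoded str with normalized newlines
as_bytes = run_print(False)  # raw bytes exactly as the child wrote them
```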


107-110: Inconsistent type usage with command parameter

In test_shell_static_executor_shell, a string is passed to submit_shell_command with shell=True, but the function expects command to be a list. This inconsistency can be resolved by updating the type annotation as previously suggested.


116-128: Efficient parallel task execution using Executor

The test_shell_executor function demonstrates the execution of multiple shell commands in parallel using the Executor class. The implementation is correct and effectively tests the executor's ability to handle concurrent tasks.
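
The pattern under test can be sketched as follows. This is not executorlib code: the standard library's `ThreadPoolExecutor` stands in for the general Executor, and a Python one-liner replaces `echo` for portability. Only `submit_shell_command` mirrors the helper shown in the PR.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def submit_shell_command(
    command: list, universal_newlines: bool = True, shell: bool = False
):
    return subprocess.check_output(
        command, universal_newlines=universal_newlines, shell=shell
    )


def run_echo_tasks(labels: list) -> list:
    """Submit one shell task per label and collect the outputs in order."""
    with ThreadPoolExecutor(max_workers=2) as exe:
        futures = [
            exe.submit(
                submit_shell_command,
                [sys.executable, "-c", f"print('{label}')"],
                universal_newlines=True,
            )
            for label in labels
        ]
        return [future.result() for future in futures]
```

Because the shell call is just a plain function, any executor that implements `submit` can run it, which is the point of removing the specialized classes.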

SubprocessExecutor,
ShellExecutor,
]
__all__ = []
Contributor

💡 Codebase verification

Breaking change in public API requires documentation updates

The empty __all__ list removes SubprocessExecutor and ShellExecutor from the public API. The search results show these classes are extensively documented in docs/development.md with examples and usage patterns, but this documentation is now outdated.

  • Update docs/development.md to reflect the removal of SubprocessExecutor and ShellExecutor
  • Add migration guide explaining how to transition from the removed executors to the new unified Executor class
  • Consider adding a deprecation notice in the documentation
🔗 Analysis chain

Document breaking changes in public API.

The empty __all__ list removes previously exported classes (SubprocessExecutor and ShellExecutor) from the public API. This is a breaking change that could affect existing code.

Let's verify the impact:

Consider:

  1. Adding migration guide in documentation
  2. Providing deprecation warnings in previous version
  3. Updating version number to reflect breaking change (major version bump)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for usage of removed executors in the codebase
echo "Searching for SubprocessExecutor usage..."
rg "SubprocessExecutor"

echo "Searching for ShellExecutor usage..."
rg "ShellExecutor"

Length of output: 2282

Comment on lines +36 to +39
if stop_read_pattern is not None and stop_read_pattern in output_current:
    break
elif lines_to_read is not None and lines_to_read == lines_count:
    break
Contributor

🛠️ Refactor suggestion

Simplify conditional statements by combining 'if' branches using logical or operator

Combining the if and elif conditions into a single if statement enhances readability and reduces code complexity.

Apply this diff to combine the conditions:

         while True:
             output_current = process.stdout.readline()
             output += output_current
             lines_count += 1
-            if stop_read_pattern is not None and stop_read_pattern in output_current:
-                break
-            elif lines_to_read is not None and lines_to_read == lines_count:
+            if (stop_read_pattern is not None and stop_read_pattern in output_current) or \
+               (lines_to_read is not None and lines_to_read == lines_count):
                 break
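
As a self-contained sketch of the combined condition, the function below reads from any text stream instead of `process.stdout`. The name `read_output` and the end-of-stream guard are additions for illustration; the loop shown in the PR has no such guard.

```python
import io
from typing import Optional, TextIO


def read_output(
    stream: TextIO,
    stop_read_pattern: Optional[str] = None,
    lines_to_read: Optional[int] = None,
) -> str:
    """Read lines until the pattern appears or the line limit is reached."""
    output = ""
    lines_count = 0
    while True:
        output_current = stream.readline()
        if not output_current:
            # End of stream: an extra guard so the sketch cannot loop forever.
            break
        output += output_current
        lines_count += 1
        # Both stop conditions lead to the same action, so they share one branch.
        if (stop_read_pattern is not None and stop_read_pattern in output_current) or (
            lines_to_read is not None and lines_to_read == lines_count
        ):
            break
    return output
```

Wrapping the whole condition in parentheses, as done here, also avoids the backslash continuation used in the suggested diff.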
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if stop_read_pattern is not None and stop_read_pattern in output_current:
    break
elif lines_to_read is not None and lines_to_read == lines_count:
    break

if (stop_read_pattern is not None and stop_read_pattern in output_current) or \
   (lines_to_read is not None and lines_to_read == lines_count):
    break
🧰 Tools
🪛 Ruff

36-39: Combine if branches using logical or operator

Combine if branches

(SIM114)

Comment on lines +11 to +16
def submit_shell_command(
    command: list, universal_newlines: bool = True, shell: bool = False
):
    return subprocess.check_output(
        command, universal_newlines=universal_newlines, shell=shell
    )
Contributor

⚠️ Potential issue

Update type annotation for command parameter

The submit_shell_command function currently annotates the command parameter as a list, but in some test cases, a string is passed when shell=True. To accommodate both strings and lists as valid input types for command, consider updating the type annotation.

Apply this diff to modify the type annotation:

+from typing import Union, List

-def submit_shell_command(
-    command: list, universal_newlines: bool = True, shell: bool = False
+def submit_shell_command(
+    command: Union[List[str], str], universal_newlines: bool = True, shell: bool = False
 ):
     return subprocess.check_output(
         command, universal_newlines=universal_newlines, shell=shell
     )

This change ensures that the function accepts both list and str types for the command parameter, aligning with how subprocess.check_output handles input when shell=True.
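
A runnable sketch of both call styles with the widened annotation. The list form uses a Python one-liner instead of `echo` purely for portability; that substitution is an assumption of this example.

```python
import subprocess
import sys
from typing import List, Union


def submit_shell_command(
    command: Union[List[str], str],
    universal_newlines: bool = True,
    shell: bool = False,
):
    return subprocess.check_output(
        command, universal_newlines=universal_newlines, shell=shell
    )


# List form: each argument is passed to the program as-is, no shell involved.
from_list = submit_shell_command([sys.executable, "-c", "print('test')"])

# String form: the whole command line is handed to the shell for parsing.
from_string = submit_shell_command("echo test", shell=True)
```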

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def submit_shell_command(
    command: list, universal_newlines: bool = True, shell: bool = False
):
    return subprocess.check_output(
        command, universal_newlines=universal_newlines, shell=shell
    )

from typing import Union, List

def submit_shell_command(
    command: Union[List[str], str], universal_newlines: bool = True, shell: bool = False
):
    return subprocess.check_output(
        command, universal_newlines=universal_newlines, shell=shell
    )

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (2)
tests/test_shell_executor.py (2)

11-16: Consider adding error handling to the helper function.

The submit_shell_command helper function could benefit from basic error handling to provide more meaningful error messages when command execution fails.

 def submit_shell_command(
     command: list, universal_newlines: bool = True, shell: bool = False
 ):
+    try:
         return subprocess.check_output(
             command, universal_newlines=universal_newlines, shell=shell
         )
+    except subprocess.CalledProcessError as e:
+        raise RuntimeError(f"Command failed with exit code {e.returncode}: {e.output}")
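
Written out as a complete function, the suggestion might look like the sketch below. Two details are assumptions of this example and not part of the diff: the `from e` clause, which keeps the original traceback attached, and the helper `failing_command_message`, which uses a Python one-liner that exits with status 3.

```python
import subprocess
import sys


def submit_shell_command(
    command: list, universal_newlines: bool = True, shell: bool = False
):
    try:
        return subprocess.check_output(
            command, universal_newlines=universal_newlines, shell=shell
        )
    except subprocess.CalledProcessError as e:
        # Chain the original exception so the underlying failure stays visible.
        raise RuntimeError(
            f"Command failed with exit code {e.returncode}: {e.output}"
        ) from e


def failing_command_message() -> str:
    """Run a command that exits with status 3 and return the error message."""
    try:
        submit_shell_command([sys.executable, "-c", "import sys; sys.exit(3)"])
    except RuntimeError as error:
        return str(error)
    return ""
```

A missing executable still raises FileNotFoundError, because only `CalledProcessError` is wrapped.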

Line range hint 1-146: Well-structured transition to unified executor interface.

The test suite successfully demonstrates how the functionality previously provided by SubprocessExecutor can be achieved using the generic Executor class. The changes maintain test coverage while simplifying the overall architecture.

A few suggestions to consider:

  1. Add docstrings to document the new execution patterns
  2. Consider creating shared test utilities for common executor setup code
  3. Add test cases for concurrent execution with different worker counts
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 4e2e033 and 6a17373.

📒 Files selected for processing (1)
  • tests/test_shell_executor.py (2 hunks)
🔇 Additional comments (3)
tests/test_shell_executor.py (3)

6-8: LGTM: Import changes align with PR objectives.

The transition from specific executor imports to the generic Executor and related utilities aligns well with the PR's goal of unifying the executor interface.


55-61: LGTM: Improved error handling.

The change from KeyError to TypeError for invalid kwargs is more appropriate as it better reflects the nature of the error (type mismatch rather than missing key).


34-39: Verify parallel execution configuration.

The transition to execute_parallel_tasks introduces new parameters. Ensure that cores=1 with openmpi_oversubscribe=False provides equivalent behavior to the previous sequential execution.

✅ Verification successful

The parallel execution configuration is correctly implemented.

The code review concern can be resolved. The execute_parallel_tasks function is consistently used across test files with cores=1 and openmpi_oversubscribe=False for sequential execution testing. This configuration ensures:

  1. Single core execution (cores=1) which maintains sequential behavior
  2. No oversubscription, preventing resource conflicts
  3. Consistent usage pattern across all test files (test_shell_executor.py, test_shell_interactive.py, test_local_executor.py)

The implementation in the executor.py shows this is the intended way to run sequential tasks through the parallel execution framework.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for other test files using similar configuration to verify consistency
rg -l 'execute_parallel_tasks.*cores.*openmpi_oversubscribe'

Length of output: 62


Script:

#!/bin/bash
# Let's search for all occurrences of execute_parallel_tasks to understand its usage
rg -A 5 "execute_parallel_tasks"

# Also search for the function definition to understand expected parameters
ast-grep --pattern 'def execute_parallel_tasks($$$)'

Length of output: 7108


Script:

#!/bin/bash
# Let's look at the function definition and its parameters in more detail
cat executorlib/shared/executor.py

Length of output: 28333

Comment on lines +119 to +132
with Executor(max_workers=2) as exe:
    cloudpickle_register(ind=1)
    f_1 = exe.submit(
        submit_shell_command, ["echo", "test_1"], universal_newlines=True
    )
    f_2 = exe.submit(
        submit_shell_command, ["echo", "test_2"], universal_newlines=True
    )
    f_3 = exe.submit(
        submit_shell_command, ["echo", "test_3"], universal_newlines=True
    )
    f_4 = exe.submit(
        submit_shell_command, ["echo", "test_4"], universal_newlines=True
    )
Contributor

🛠️ Refactor suggestion

Consider parameterizing the parallel execution test.

The parallel execution test could be made more maintainable using parameterized testing. This would make it easier to add more test cases and reduce code duplication.

import pytest


@pytest.mark.parametrize("test_input,expected", [
    ("test_1", "test_1\n"),
    ("test_2", "test_2\n"),
    ("test_3", "test_3\n"),
    ("test_4", "test_4\n"),
])
def test_shell_executor_parameterized(test_input, expected):
    with Executor(max_workers=2) as exe:
        cloudpickle_register(ind=1)
        future = exe.submit(
            submit_shell_command,
            ["echo", test_input],
            universal_newlines=True
        )
        assert not future.done()
        assert future.result() == expected
        assert future.done()

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
tests/test_shell_interactive.py (1)

48-102: Consider adding error case tests.

The test_execute_single_task method effectively tests the happy path, but consider adding tests for error scenarios such as:

  • Invalid input
  • Process termination before shutdown
  • Pattern not found
  • Zero or negative line count

Would you like me to help generate additional test cases for these scenarios?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 6a17373 and a3c8eac.

📒 Files selected for processing (1)
  • tests/test_shell_interactive.py (1 hunks)
🧰 Additional context used
🪛 Ruff
tests/test_shell_interactive.py

36-39: Combine if branches using logical or operator

Combine if branches

(SIM114)

🔇 Additional comments (4)
tests/test_shell_interactive.py (4)

1-13: LGTM! Import changes align with PR objectives.

The imports have been correctly updated to reflect the transition from specialized executors to the generic Executor class.


15-24: LGTM! Well-structured process initialization.

The init_process function correctly configures the subprocess with appropriate parameters and returns the expected dictionary format.


43-46: LGTM! Clean shutdown implementation.

The shutdown function properly handles process termination by writing and flushing the shutdown command.
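
For readers who have not seen the test file, the three helpers can be sketched roughly as follows. This is an assumption-laden illustration, not the PR's code: the child program `CHILD_SOURCE` is hypothetical, the helpers pass the `Popen` object directly instead of the dictionary the test uses, and `interact` supports only a line count, not a stop pattern.

```python
import subprocess
import sys

# Hypothetical child program: echoes each input line until it reads "shutdown".
CHILD_SOURCE = (
    "import sys\n"
    "while True:\n"
    "    line = sys.stdin.readline().strip()\n"
    "    if line in ('shutdown', ''):\n"
    "        break\n"
    "    print('echo: ' + line, flush=True)\n"
)


def init_process() -> subprocess.Popen:
    """Start the long-running child with text-mode pipes on stdin and stdout."""
    return subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        shell=False,
    )


def interact(process: subprocess.Popen, input_line: str, lines_to_read: int = 1) -> str:
    """Send one line to the child and read a fixed number of reply lines."""
    process.stdin.write(input_line + "\n")
    process.stdin.flush()  # without the flush the child may never see the input
    return "".join(process.stdout.readline() for _ in range(lines_to_read))


def shutdown(process: subprocess.Popen, shutdown_command: str = "shutdown") -> int:
    """Ask the child to exit, then wait for it and release the pipes."""
    process.stdin.write(shutdown_command + "\n")
    process.stdin.flush()
    process.stdin.close()
    return_code = process.wait(timeout=30)
    process.stdout.close()
    return return_code


def demo() -> tuple:
    process = init_process()
    reply = interact(process, "hello")
    return reply, shutdown(process)
```

Each helper is an ordinary function, so it can be submitted to a general executor one call at a time.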


104-127: LGTM! Well-structured executor context usage.

The test demonstrates proper usage of the Executor context manager and appropriate verification of task completion. The shutdown handling ensures clean test cleanup.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
tests/test_shell_executor.py (1)

85-92: LGTM! Comprehensive coverage of execution modes.

The tests effectively demonstrate how the general Executor class replaces ShellExecutor functionality across different execution modes (args, binary, shell).

Consider adding docstrings to explain the purpose of each test case and what specific ShellExecutor functionality it replaces.

Also applies to: 98-105, 111-115

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between a3c8eac and 905953c.

📒 Files selected for processing (1)
  • tests/test_shell_executor.py (2 hunks)
🔇 Additional comments (5)
tests/test_shell_executor.py (5)

6-8: LGTM! Import changes align with PR objective.

The changes correctly replace SubprocessExecutor imports with the general Executor class and related utilities, which aligns with the PR's goal of unifying the executor interface.


77-82: LGTM! Error handling test properly adapted.

The test maintains its purpose of verifying FileNotFoundError while properly adapting to the new execution framework.


121-134: LGTM! Effective demonstration of parallel execution.

The test successfully demonstrates how the general Executor handles parallel task execution, properly replacing the parallel execution capabilities of the removed executors.


56-62: Verify error type change alignment with Executor implementation.

The test now expects TypeError instead of KeyError for invalid arguments. Let's verify this aligns with the new Executor implementation.

#!/bin/bash
# Search for error handling in Executor implementation
rg -A 5 "raise TypeError" 
rg -A 5 "raise KeyError"

34-39: Consider using a simpler execution method for single task test.

The test is using execute_parallel_tasks with MPI configuration for a single task execution, which might be overengineered. Consider if a simpler execution method would be more appropriate for this test case.

@jan-janssen jan-janssen merged commit 7273a9f into main Oct 27, 2024
24 checks passed
@jan-janssen jan-janssen deleted the remove_shell branch October 27, 2024 13:13
This was referenced Nov 19, 2024