[Internal][Executor] Refine the LineExecutionProcessPool to align more closely with a standard process pool #2234

Merged: PeiwenGaoMS merged 56 commits into main from devs/peiwen/refactor_line_process_pool on Mar 12, 2024

Conversation

PeiwenGaoMS
Contributor

@PeiwenGaoMS PeiwenGaoMS commented Mar 6, 2024

Description

Background

After the executor and runtime were split into two separate containers, batch runs in the executor server also need a process pool that handles execution requests line by line. The line process pool therefore needs to be able to process individual lines. As a process pool, it should also accept one or multiple tasks at a time, consistent with the Python process pool interface.
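For reference, the "one or multiple tasks at a time" shape of Python's standard pool interface looks roughly like the sketch below. It uses `concurrent.futures.ThreadPoolExecutor` only to keep the example light; `ProcessPoolExecutor` exposes the same `submit` and `map` methods through the shared `Executor` base class. The `execute_line` function and the sample inputs are hypothetical stand-ins, not part of promptflow.

```python
from concurrent.futures import ThreadPoolExecutor


def execute_line(line_number, inputs):
    # Hypothetical stand-in for executing one line of a batch run.
    return {"line_number": line_number, "output": inputs["x"] * 2}


batch_inputs = [{"x": 1}, {"x": 2}, {"x": 3}]

with ThreadPoolExecutor(max_workers=2) as pool:
    # Submit a single task and wait for its result.
    single = pool.submit(execute_line, 0, batch_inputs[0]).result()
    # Submit many tasks at once; results come back in input order.
    many = list(pool.map(execute_line, range(len(batch_inputs)), batch_inputs))
```

The `submit` and `run` methods described in this PR play the same two roles: one line at a time, or a whole batch.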

Usages

  • Use as a context manager (inside an async function, since run is awaited)
with LineExecutionProcessPool(...) as pool:
    line_results = await pool.run(zip(line_number, batch_inputs))
  • Call start and close explicitly
pool = LineExecutionProcessPool(...)
pool.start()
line_results = await pool.run(zip(line_number, batch_inputs))
pool.close()

Public Functions

  • start: Creates the task queue and the input/output queues, then starts the processes and the monitor thread pool.
  • close: Sends a terminate signal to the monitor threads, ends the processes, and closes the thread pool.
  • run: Puts all line inputs on the task queue and returns the list of line results.
  • submit: Puts one line input on the task queue and returns its line result.
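To make the four public functions concrete, here is a minimal toy sketch of a pool with the same surface. It is not the LineExecutionProcessPool implementation: `ToyLinePool` is a hypothetical class, worker threads stand in for the real worker processes and monitor threads, and "executing a line" is just upper-casing a string.

```python
import asyncio
import queue
import threading

_TERMINATE = object()  # sentinel telling a worker thread to exit


class ToyLinePool:
    """Illustrative only: threads stand in for the real worker processes."""

    def __init__(self, worker_count=2):
        self._worker_count = worker_count
        self._task_queue = None
        self._workers = []

    def start(self):
        # Create the task queue and start the workers up front.
        self._task_queue = queue.Queue()
        self._workers = [
            threading.Thread(target=self._work, daemon=True)
            for _ in range(self._worker_count)
        ]
        for worker in self._workers:
            worker.start()

    def close(self):
        # Send one terminate signal per worker, then wait for them to exit.
        for _ in self._workers:
            self._task_queue.put(_TERMINATE)
        for worker in self._workers:
            worker.join()

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()

    def _work(self):
        while True:
            task = self._task_queue.get()
            if task is _TERMINATE:
                return
            line_number, inputs, loop, future = task
            result = {"line_number": line_number, "output": inputs.upper()}
            # Hand the result back to the event loop that awaits it.
            loop.call_soon_threadsafe(future.set_result, result)

    async def submit(self, line_number, inputs):
        # Put one line input on the task queue and await its result.
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self._task_queue.put((line_number, inputs, loop, future))
        return await future

    async def run(self, batch):
        # Put all line inputs on the task queue and gather the results.
        return await asyncio.gather(*(self.submit(n, i) for n, i in batch))


async def main():
    with ToyLinePool() as pool:
        one = await pool.submit(0, "a")
        many = await pool.run(zip([1, 2], ["b", "c"]))
    return one, many


one, many = asyncio.run(main())
```

In this sketch the context manager simply wraps `start` and `close`, which matches the two usage styles listed above.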

Implementation Process

[image: implementation process diagram]

Main Differences from Previous Implementation

  • The monitor threads are started up front, in start, instead of in the run method.
  • A monitor thread no longer exits because the task queue is empty. It exits only when the batch run times out or when it receives a termination signal from the task queue.
  • run is now an async function.
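The exit conditions for a monitor thread can be sketched as the loop below. This is an assumption-laden illustration rather than the code in this PR: `monitor`, `TERMINATE_SIGNAL`, the polling interval, and the doubling "work" are all hypothetical.

```python
import queue
import threading
import time

TERMINATE_SIGNAL = "terminate"  # hypothetical sentinel value


def monitor(task_queue, results, batch_deadline):
    """Handle tasks until a terminate signal arrives or the batch times out."""
    while time.monotonic() < batch_deadline:
        try:
            task = task_queue.get(timeout=0.05)
        except queue.Empty:
            # An empty queue is not a reason to exit: keep waiting.
            continue
        if task == TERMINATE_SIGNAL:
            return "terminated"
        results.append(task * 2)  # stand-in for executing one line
    return "timed out"


task_queue, results, exit_reason = queue.Queue(), [], []
thread = threading.Thread(
    target=lambda: exit_reason.append(
        monitor(task_queue, results, time.monotonic() + 5)
    )
)
thread.start()                    # started before any task exists
time.sleep(0.2)                   # queue stays empty; the thread keeps running
still_alive = thread.is_alive()
task_queue.put(21)
task_queue.put(TERMINATE_SIGNAL)  # explicit signal ends the loop
thread.join()
```

Because the thread survives an empty queue, tasks can be submitted at any time between start and close, which is what lets the pool serve single-line requests.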

Code Changes Summary

This pull request includes changes to various parts of the promptflow system, with the main focus being the addition of new functionalities and the refactoring of existing code for better performance and readability. The most significant changes include the addition of a new function to convert multimedia data to string, the introduction of a method to determine the maximum number of workers that can be created, the conversion of some functions to asynchronous, and the implementation of new exception classes.

New functionalities:

Refactoring:

  • src/promptflow/promptflow/batch/_batch_engine.py, src/promptflow/promptflow/batch/_python_executor_proxy.py: Converted some functions to asynchronous for better performance. [1] [2] [3]
  • src/promptflow/promptflow/executor/_process_manager.py: Refactored the AbstractProcessManager and SpawnProcessManager classes, added new methods for process management, and made changes to improve readability. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]
  • src/promptflow/tests/executor/e2etests/test_batch_timeout.py, src/promptflow/tests/executor/unittests/_utils/test_process_utils.py, src/promptflow/tests/executor/unittests/executor/test_line_execution_process_pool.py: Updated test cases to reflect the changes made in the codebase. [1] [2] [3] [4]

Additions:

All Promptflow Contribution checklist:

  • The pull request does not introduce [breaking changes].
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.
  • Create an issue and link to the pull request to get dedicated review from promptflow team. Learn more: suggested workflow.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

@PeiwenGaoMS PeiwenGaoMS requested review from a team as code owners March 6, 2024 11:04
@github-actions github-actions bot added promptflow executor The changes related to the execution of the flow labels Mar 6, 2024

github-actions bot commented Mar 6, 2024

SDK CLI Global Config Test Result devs/peiwen/refactor_line_process_pool

2 tests   2 ✅  45s ⏱️
1 suites  0 💤
1 files    0 ❌

Results for commit 22dd51e.

♻️ This comment has been updated with latest results.

github-actions bot commented Mar 6, 2024

Executor Unit Test Result devs/peiwen/refactor_line_process_pool

754 tests   754 ✅  3m 56s ⏱️
  1 suites    0 💤
  1 files      0 ❌

Results for commit 22dd51e.

♻️ This comment has been updated with latest results.

github-actions bot commented Mar 6, 2024

promptflow SDK CLI Azure E2E Test Result devs/peiwen/refactor_line_process_pool

  4 files    4 suites   3m 47s ⏱️
168 tests 149 ✅ 19 💤 0 ❌
672 runs  596 ✅ 76 💤 0 ❌

Results for commit 22dd51e.

♻️ This comment has been updated with latest results.

github-actions bot commented Mar 6, 2024

Executor E2E Test Result devs/peiwen/refactor_line_process_pool

210 tests   208 ✅  8m 11s ⏱️
  1 suites    2 💤
  1 files      0 ❌

Results for commit 22dd51e.

♻️ This comment has been updated with latest results.

github-actions bot commented Mar 6, 2024

SDK CLI Test Result devs/peiwen/refactor_line_process_pool

   12 files     12 suites   40m 29s ⏱️
  451 tests   433 ✅ 18 💤 0 ❌
1 804 runs  1 732 ✅ 72 💤 0 ❌

Results for commit 22dd51e.

♻️ This comment has been updated with latest results.

github-actions bot commented Mar 11, 2024

promptflow-tracing e2e test result devs/peiwen/refactor_line_process_pool

 4 files   4 suites   1m 36s ⏱️
 8 tests  8 ✅ 0 💤 0 ❌
32 runs  32 ✅ 0 💤 0 ❌

Results for commit 22dd51e.

♻️ This comment has been updated with latest results.

@PeiwenGaoMS PeiwenGaoMS merged commit bf96c0b into main Mar 12, 2024
57 checks passed
@PeiwenGaoMS PeiwenGaoMS deleted the devs/peiwen/refactor_line_process_pool branch March 12, 2024 05:01