Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update vendored thread_pool implementation #16210

Merged
merged 9 commits into from
Jul 19, 2024

Conversation

wence-
Copy link
Contributor

@wence- wence- commented Jul 8, 2024

Description

Since we introduced the vendored thread_pool in #8752, upstream has introduced some new features, and particularly now uses condition variables/notification to handle when there are no tasks in the queue. This avoids the issue described in #16209 where the thread pool by default artificially introduces a delay of 1000microseconds to all tasks whenever the task queue is emptied.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@wence- wence- requested a review from a team as a code owner July 8, 2024 10:14
@wence- wence- requested review from harrism and vuule July 8, 2024 10:14
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Jul 8, 2024
@wence- wence- added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change libcudf Affects libcudf (C++/CUDA) code. and removed libcudf Affects libcudf (C++/CUDA) code. labels Jul 8, 2024
@wence-
Copy link
Contributor Author

wence- commented Jul 8, 2024

I culled all the compile-time optional features from the vendored implementation, one could reinstate them if wanted.

Marked as non-breaking, even though the interface to the thread pool changed, since it's a detail API.

This change should be benchmarked before merging.

@vuule
Copy link
Contributor

vuule commented Jul 8, 2024

@madsbk does kvikIO need a similar change to its thread pool?

@wence-
Copy link
Contributor Author

wence- commented Jul 9, 2024

@madsbk does kvikIO need a similar change to its thread pool?

Just to note, if one does this, we should consider if we also want to keep the other features (e.g. using a priority queue for the tasks rather than a FIFO queue)

@madsbk
Copy link
Member

madsbk commented Jul 9, 2024

@madsbk does kvikIO need a similar change to its thread pool?

Just to note, if one does this, we should consider if we also want to keep the other features (e.g. using a priority queue for the tasks rather than a FIFO queue)

Agree, let's do both. One PR to update to the new version of the thread pool and another PR to use/benchmark the new features.

@vyasr
Copy link
Contributor

vyasr commented Jul 9, 2024

Should we just fetch this using rapids-cmake instead of vendoring? It seems simpler, and they document how to do it with CPM (which rapids-cmake uses, so we'd do basically the same thing).

@srinivasyadav18
Copy link
Contributor

I would agree with @vyasr on previous comment, as it will be easier to maintain and also fetch future versions easily using git tags instead of manually updating.

@wence- wence- force-pushed the wence/fea/16209 branch from e5e33b6 to 23e8aaa Compare July 10, 2024 09:10
@wence- wence- requested a review from a team as a code owner July 10, 2024 09:10
@github-actions github-actions bot added the CMake CMake build issue label Jul 10, 2024
@wence-
Copy link
Contributor Author

wence- commented Jul 10, 2024

Should we just fetch this using rapids-cmake instead of vendoring? It seems simpler, and they document how to do it with CPM (which rapids-cmake uses, so we'd do basically the same thing).

OK, I did this, probably.

@GregoryKimball
Copy link
Contributor

@wence- would you please pull in the changes from branch-24.08 to add the new benchmark in #16237? We need to check the new runtime ASAP to see if there is a clean throughput signal from pipelining

cpp/CMakeLists.txt Outdated Show resolved Hide resolved
@wence-
Copy link
Contributor Author

wence- commented Jul 10, 2024

@wence- would you please pull in the changes from branch-24.08 to add the new benchmark in #16237? We need to check the new runtime ASAP to see if there is a clean throughput signal from pipelining

Done, I hope.

for (size_t i = 0; i < num_files; ++i) {
threads.submit(read_func, i);
}
threads.pause();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pause and unpause methods are only enabled when this macro BS_THREAD_POOL_ENABLE_PAUSE is explicitly defined. Reference link here .

Copy link
Contributor

@srinivasyadav18 srinivasyadav18 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor changes to enable pause and unpause methods. Otherwise LGTM!

cpp/cmake/thirdparty/get_thread_pool.cmake Outdated Show resolved Hide resolved
cpp/benchmarks/groupby/group_max_multithreaded.cpp Outdated Show resolved Hide resolved
Co-authored-by: Srinivas Yadav <[email protected]>
@@ -798,13 +800,16 @@ target_compile_definitions(cudf PRIVATE "RMM_LOGGING_LEVEL=LIBCUDF_LOGGING_LEVEL
# Define spdlog level
target_compile_definitions(cudf PUBLIC "SPDLOG_ACTIVE_LEVEL=SPDLOG_LEVEL_${LIBCUDF_LOGGING_LEVEL}")

# Enable pause functions in thread_pool implementation
target_compile_definitions(cudf PRIVATE "BS_THREAD_POOL_ENABLE_PAUSE=1")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should move this off cudf and onto the BS_thread_pool target in get_thread_pool.cmake as an INTERFACE compile definition.

target_compile_definitions(BS_thread_pool INTERFACE "BS_THREAD_POOL_ENABLE_PAUSE=1")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done.

Co-authored-by: Robert Maynard <[email protected]>
@wence-
Copy link
Contributor Author

wence- commented Jul 12, 2024

@wence- would you please pull in the changes from branch-24.08 to add the new benchmark in #16237? We need to check the new runtime ASAP to see if there is a clean throughput signal from pipelining

@GregoryKimball before we merge this, can we confirm that benchmarking doesn't present a negative signal?

@wence-
Copy link
Contributor Author

wence- commented Jul 19, 2024

/merge

@rapids-bot rapids-bot bot merged commit debbef0 into rapidsai:branch-24.08 Jul 19, 2024
80 checks passed
@wence- wence- deleted the wence/fea/16209 branch July 19, 2024 14:13
rapids-bot bot pushed a commit to rapidsai/kvikio that referenced this pull request Jul 24, 2024
Instead of hard coding the header of [`BS::thread_pool`](https://github.com/bshoshany/thread-pool?tab=readme-ov-file), we now fetch the implementation. Done for cudf in rapidsai/cudf#16210.

Authors:
  - Mads R. B. Kristensen (https://github.com/madsbk)
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #408
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake CMake build issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Update vendored thread-pool implementation
8 participants