-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
maint: Folly Replacement Plan #1412
Labels
Comments
5 tasks
This was referenced Mar 20, 2024
ianthomas23
added a commit
that referenced
this issue
Mar 21, 2024
#### Reference Issues/PRs Fixes one item of the folly replacement plan, issue #1412. #### What does this implement or fix? This removes use of `folly/system/ThreadName.h`. It was used in two places to obtain a `std::string` name of a thread, and here is replaced with conversion of unique thread ID to a string using `fmt` via ```c++ fmt::format("{}", arcticdb::get_thread_id()) ``` #### Any other comments? This is used in two places, the first in the termination handler in `python_module.cpp` which I have manually tested. The second use is in the [Remotery](https://github.com/Celtoys/Remotery) which, as far as I can tell, is not tested in CI so I am not sure of the status of the Remotery support. Signed-off-by: Ian Thomas <[email protected]>
ianthomas23
added a commit
that referenced
this issue
May 14, 2024
#### Reference Issues/PRs Fixes the 10th item of the folly replacement plan, issue #1412. #### What does this implement or fix? This removes the single use of `folly/ThreadCachedInt`. It is replaced by a partial vendoring of the `folly` code plus use of `boost::thread_specific_ptr`. `ThreadCachedInt` is used to count the number of freed memory blocks. It is (presumably) not just implemented as an atomic integer count as thread locking would be too slow, so instead each thread has its own count and when a single thread's count exceeds some threshold it is added to the overall count. The original `folly` implementation has two ways of reading the count which are slow (fully accurate) and fast (not fully accurate). ArcticDB only uses the fast option, so the implementation is much simpler than `folly`'s, requiring fewer `atomic`s. New class `ThreadCachedInt` in `thread_cached_int.hpp` is derived from https://github.com/facebook/folly/blob/main/folly/ThreadCachedInt.h but containing only the subset of functionality required. Instead of using `folly'`s own `ThreadLocalPtr` this uses `boost::thread_specific_ptr`. The `folly` source code claims that their implementation is 4x faster than the `boost` one: https://github.com/facebook/folly/blob/dbc9e565f54eabb40ad6600656ad9dea919f51c0/folly/ThreadLocal.h#L18-L20 but that claim dates from 12 years ago and this solution is simpler than theirs. This does need to be benchmarked against `master` to confirm that it is not measurably slower. #### Any other comments? The only place this is ultimately used is to control when `malloc_trim` is called here https://github.com/man-group/ArcticDB/blob/23b3b943b7c4a10889a563a063b2a616fe00d9fa/cpp/arcticdb/util/allocator.cpp#L286-L288 to release memory back to the system. This only occurs on Linux. Other OSes could have all of this code removed but this would be a bigger change with many `#ifdef` blocks, etc. --------- Signed-off-by: Ian Thomas <[email protected]>
This was referenced May 17, 2024
This was referenced Aug 27, 2024
grusev
pushed a commit
that referenced
this issue
Nov 25, 2024
#### Reference Issues/PRs Fixes one item of the folly replacement plan, issue #1412. #### What does this implement or fix? This removes use of `folly/system/ThreadName.h`. It was used in two places to obtain a `std::string` name of a thread, and here is replaced with conversion of unique thread ID to a string using `fmt` via ```c++ fmt::format("{}", arcticdb::get_thread_id()) ``` #### Any other comments? This is used in two places, the first in the termination handler in `python_module.cpp` which I have manually tested. The second use is in the [Remotery](https://github.com/Celtoys/Remotery) which, as far as I can tell, is not tested in CI so I am not sure of the status of the Remotery support. Signed-off-by: Ian Thomas <[email protected]>
grusev
pushed a commit
that referenced
this issue
Nov 25, 2024
#### Reference Issues/PRs Fixes the 10th item of the folly replacement plan, issue #1412. #### What does this implement or fix? This removes the single use of `folly/ThreadCachedInt`. It is replaced by a partial vendoring of the `folly` code plus use of `boost::thread_specific_ptr`. `ThreadCachedInt` is used to count the number of freed memory blocks. It is (presumably) not just implemented as an atomic integer count as thread locking would be too slow, so instead each thread has its own count and when a single thread's count exceeds some threshold it is added to the overall count. The original `folly` implementation has two ways of reading the count which are slow (fully accurate) and fast (not fully accurate). ArcticDB only uses the fast option, so the implementation is much simpler than `folly`'s, requiring fewer `atomic`s. New class `ThreadCachedInt` in `thread_cached_int.hpp` is derived from https://github.com/facebook/folly/blob/main/folly/ThreadCachedInt.h but containing only the subset of functionality required. Instead of using `folly'`s own `ThreadLocalPtr` this uses `boost::thread_specific_ptr`. The `folly` source code claims that their implementation is 4x faster than the `boost` one: https://github.com/facebook/folly/blob/dbc9e565f54eabb40ad6600656ad9dea919f51c0/folly/ThreadLocal.h#L18-L20 but that claim dates from 12 years ago and this solution is simpler than theirs. This does need to be benchmarked against `master` to confirm that it is not measurably slower. #### Any other comments? The only place this is ultimately used is to control when `malloc_trim` is called here https://github.com/man-group/ArcticDB/blob/e3fab24b653439f9894495a2657bb2dcfc1fbb42/cpp/arcticdb/util/allocator.cpp#L286-L288 to release memory back to the system. This only occurs on Linux. Other OSes could have all of this code removed but this would be a bigger change with many `#ifdef` blocks, etc. --------- Signed-off-by: Ian Thomas <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Motivation
Folly is admittedly:
This makes ArcticDB hardly portable, packageable on many platforms under a shared library on conda-forge.
Test disablement
Tests are disabled for Windows for elements ArcticDB uses including but not limited to:
F14Set
ConcurrentHashMap
ThreadPoolExecutors
Future
ThreadName
FBString
Hash
ThreadCachedInt
ThreadLocal
folly's dependency graph
WIP Removal plan
Some elements have been removed with:
Based on the current usage on
master
as of 4184a46.Smaller utilities
folly/gen/Base
(12 includes): We also might need to rewrite that IMO.folly/Range
(16 includes): Elements ofstd::ranges
as of C++23? e.g.std::ranges::iota_view
folly/Poly
(6 includes): Will be rewritten (part of the SOW)folly/ClockGettimeWrappers
(1 include): Fixed by maint: Replace usage of folly inclock.hpp
#1448folly/hash/Hash
(1 include): Candidate fix maint: Remove use of folly/hash/Hash.h #1506folly/portability/PThread
(1 include): Fixed by maint: Remove use offolly/portability/PThread.h
#1447folly/portability/Time
(1 include): Fixed by maint: Replace usage of folly inclock.hpp
#1448folly/system/ThreadId
(1 include): Fixed by maint: Replace use of folly getCurrentThreadId with STL #1417folly/system/ThreadName
(2 includes): Fixed by maint: Remove use of folly/system/ThreadName.h #1446folly/ThreadCachedInt
(1 include): Fixed by maint: Remove use of folly/ThreadCachedInt #1486folly/concurrency/ConcurrentHashMap
(4 include): Fixed by maint: Replacefolly::ConcurrentHashMap
withboost::concurrent_flat_map
#1580Task scheduling system
folly/executors/CPUThreadPoolExecutor
(4 includes)folly/executors/FutureExecutor
(4 includes)folly/executors/IOThreadPoolExecutor
(4 includes)folly/Function
(9 includes)folly/futures/Future
(23 includes)folly/futures/FutureSplitter
(2 includes)Potential alternatives:
taskflow
: Async Parallel Task for Modern C++ (mentionned by Thorsten)C++17
libunifex
: Mentioned by Joëlfolly
but different teamstd::execution
which is developped by the same team and is targetted forC++26
C++17
or later, C++ coroutines support if usingC++20
or lateroneTBB
):pthread
)nanothread
:The text was updated successfully, but these errors were encountered: