Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

maint: Remove use of folly/ThreadCachedInt #1486

Merged
merged 2 commits into from
May 14, 2024

Conversation

ianthomas23
Copy link
Collaborator

Reference Issues/PRs

Fixes the 10th item of the folly replacement plan, issue #1412.

What does this implement or fix?

This removes the single use of folly/ThreadCachedInt. It is replaced by a partial vendoring of the folly code plus use of boost::thread_specific_ptr.

ThreadCachedInt is used to count the number of freed memory blocks. It is (presumably) not just implemented as an atomic integer count as thread locking would be too slow, so instead each thread has its own count and when a single thread's count exceeds some threshold it is added to the overall count. The original folly implementation has two ways of reading the count which are slow (fully accurate) and fast (not fully accurate). ArcticDB only uses the fast option, so the implementation is much simpler than folly's, requiring fewer atomics.

New class ThreadCachedInt in thread_cached_int.hpp is derived from https://github.com/facebook/folly/blob/main/folly/ThreadCachedInt.h but containing only the subset of functionality required. Instead of using folly's own ThreadLocalPtr this uses boost::thread_specific_ptr. The folly source code claims that their implementation is 4x faster than the boost one:

https://github.com/facebook/folly/blob/dbc9e565f54eabb40ad6600656ad9dea919f51c0/folly/ThreadLocal.h#L18-L20

but that claim dates from 12 years ago and this solution is simpler than theirs. This does need to be benchmarked against master to confirm that it is not measurably slower.

Any other comments?

The only place this is ultimately used is to control when malloc_trim is called here

#if defined(__linux__) && defined(__GLIBC__)
malloc_trim(0);
#endif

to release memory back to the system. This only occurs on Linux. Other OSes could have all of this code removed but this would be a bigger change with many #ifdef blocks, etc.

@ianthomas23 ianthomas23 mentioned this pull request Apr 10, 2024
17 tasks
@ianthomas23 ianthomas23 force-pushed the ianthomas23/thread_cached_int branch 4 times, most recently from 1cbd0ea to f1c48ea Compare April 17, 2024 11:57
Copy link
Collaborator

@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you, @ianthomas23.

I would refer to @willdealtry's final decision, especially for potential regressions which might not be caught by the ASV benchmark suite.

namespace arcticdb {

template <class IntT>
class ThreadCachedInt {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation LGTM!

I think it might be useful to mention that this implementation is a modification of one of folly, as you mentioned in the description of this PR. There's no problem with it since folly is Apache-2.0 licensed.

What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

@ianthomas23
Copy link
Collaborator Author

I have added extra clarification in the source code that this is derived from folly's implementation.

@ianthomas23 ianthomas23 force-pushed the ianthomas23/thread_cached_int branch from ad1fc80 to 191545b Compare May 6, 2024 17:53
@ianthomas23 ianthomas23 merged commit 9a4fadc into master May 14, 2024
114 checks passed
@ianthomas23 ianthomas23 deleted the ianthomas23/thread_cached_int branch May 14, 2024 07:44
grusev pushed a commit that referenced this pull request Nov 25, 2024
#### Reference Issues/PRs

Fixes the 10th item of the folly replacement plan, issue #1412.

#### What does this implement or fix?

This removes the single use of `folly/ThreadCachedInt`. It is replaced
by a partial vendoring of the `folly` code plus use of
`boost::thread_specific_ptr`.

`ThreadCachedInt` is used to count the number of freed memory blocks. It
is (presumably) not just implemented as an atomic integer count as
thread locking would be too slow, so instead each thread has its own
count and when a single thread's count exceeds some threshold it is
added to the overall count. The original `folly` implementation has two
ways of reading the count which are slow (fully accurate) and fast (not
fully accurate). ArcticDB only uses the fast option, so the
implementation is much simpler than `folly`'s, requiring fewer
`atomic`s.

New class `ThreadCachedInt` in `thread_cached_int.hpp` is derived from
https://github.com/facebook/folly/blob/main/folly/ThreadCachedInt.h but
containing only the subset of functionality required. Instead of using
`folly'`s own `ThreadLocalPtr` this uses `boost::thread_specific_ptr`.
The `folly` source code claims that their implementation is 4x faster
than the `boost` one:


https://github.com/facebook/folly/blob/dbc9e565f54eabb40ad6600656ad9dea919f51c0/folly/ThreadLocal.h#L18-L20

but that claim dates from 12 years ago and this solution is simpler than
theirs. This does need to be benchmarked against `master` to confirm
that it is not measurably slower.

#### Any other comments?

The only place this is ultimately used is to control when `malloc_trim`
is called here


https://github.com/man-group/ArcticDB/blob/e3fab24b653439f9894495a2657bb2dcfc1fbb42/cpp/arcticdb/util/allocator.cpp#L286-L288

to release memory back to the system. This only occurs on Linux. Other
OSes could have all of this code removed but this would be a bigger
change with many `#ifdef` blocks, etc.

---------

Signed-off-by: Ian Thomas <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants