BucketListDB Bucket Apply Optimization #4114
Conversation
Force-pushed 4e16098 to cf799df.
Force-pushed cf799df to d7adda1.
LGTM!
Curious to see catchup application time after this change!
src/catchup/ApplyBucketsWork.h (Outdated)
@@ -49,6 +51,10 @@ class ApplyBucketsWork : public Work

    bool mDelayChecked{false};

    static uint32_t startingLevel(bool offersOnly);
nit on naming: could we rename this parameter to BucketListDBEnabled or something like that? Replaying only offers is basically an implementation detail of BucketListDB that callers don't need to know about.
Also, I don't think you need to pass the offersOnly parameter at all: both ApplyBucketsWork and BucketApplicator have access to the App's config to decide whether the BucketList should be applied in reverse.
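A minimal sketch of that suggestion follows. The class shape, member name, and constructor signature are assumptions for illustration only; the one thing taken from this PR is the getConfig().isUsingBucketListDB() check that appears later in the thread.

#include "main/Application.h"
#include "main/Config.h"

using namespace stellar;

// Sketch only: derive the offers-only mode from the application's config in
// the constructor instead of taking an offersOnly flag from every caller.
// The real BucketApplicator constructor takes more parameters than this.
class BucketApplicatorSketch
{
    // Offers-only replay is an implementation detail of BucketListDB, so it
    // is inferred from config rather than passed in by the call site.
    bool const mNewOffersOnly;

  public:
    explicit BucketApplicatorSketch(Application& app)
        : mNewOffersOnly(app.getConfig().isUsingBucketListDB())
    {
    }

    bool
    newOffersOnly() const
    {
        return mNewOffersOnly;
    }
};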
src/bucket/BucketApplicator.h (Outdated)
@@ -28,6 +28,10 @@ class BucketApplicator
    BucketInputIterator mBucketIter;
    size_t mCount{0};
    std::function<bool(LedgerEntryType)> mEntryTypeFilter;
    bool const mNewOffersOnly;
    UnorderedSet<LedgerKey>& mSeenKeys;
do you need randomized hashing of UnorderedSet?
I don't think so. This is just storing seen offer keys, so I think any kind of attack triggering lots of hash collisions would have to be very sophisticated and pretty worthless, as the result would be a marginally slowed catch up.
so.. how about std::unordered_set then? :)
src/catchup/ApplyBucketsWork.cpp
Outdated
} | ||
else | ||
{ | ||
mFinishedIndexBucketsWork = true; |
No need for mFinishedIndexBucketsWork or mSpawnedIndexBucketsWork: just use the pointer returned by addWork to determine if work started/finished.
(Same applies to mSpawnedAssumeStateWork; would be nice to clean that up too, but no worries if not, since it's outside of the scope of this PR.)
    // The current size of this set is 1.6 million during BucketApply
    // (as of 12/20/23). There's not a great way to estimate this, so
    // reserving with some extra wiggle room
    mSeenKeys.reserve(2'000'000);
thoughts on expected memory consumption here? (keys are pretty small, but would be good to understand why we saw OOM kills in CI with this change)
It's pretty variable and correlated with high DEX activity rather than BucketList size. Currently, it takes about 180 MB. I also tested a period of very high DEX activity in 2018, which consumed 1.15 GB. For replaying all of history, I would estimate the upper bound is around 1.2 GB (I found an expensive period, but I'm not sure if it's the absolute max).
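A rough back-of-envelope from those figures (an estimate, not a measurement from this PR): 180 MB across roughly 1.6 million seen keys works out to about 110 bytes per key, which is plausible for an offer LedgerKey plus hash-set node and bucket overhead; at a similar per-key cost, the 1.15 GB peak would correspond to on the order of 10 million seen offer keys.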
Looks like the peak memory usage of core on any given range is 3.8 GB during catchup (total, not just the offer apply index).
    mFirstBucket.reset();
    mSecondBucket.reset();
    mFirstBucketApplicator.reset();
    mSecondBucketApplicator.reset();
mFinishedIndexBucketsWork and mSpawnedIndexBucketsWork aren't reset here; assuming those will go away, you'd need to reset the Work pointer instead.
src/catchup/ApplyBucketsWork.cpp (Outdated)
    if (!mSpawnedAssumeStateWork)
    // This status string only applies to step 2 when we actually apply the
    // buckets.
    if (mFinishedIndexBucketsWork && !mSpawnedAssumeStateWork)
Should we log something meaningful about indexing?
I don't think we need to, as IndexBucketsWork already prints logs regarding indexing progress.
src/catchup/ApplyBucketsWork.cpp (Outdated)
@@ -174,31 +230,54 @@ ApplyBucketsWork::startLevel()
    auto& level = getBucketLevel(mLevel);
    HistoryStateBucket const& i = mApplyState.currentBuckets.at(mLevel);

    bool applySnap = (i.snap != binToHex(level.getSnap()->getHash()));
    bool applyCurr = (i.curr != binToHex(level.getCurr()->getHash()));
    bool applyFirst = mOffersOnly
All the new ternary operators are getting a bit hard to follow; that aside, this function applying the whole level has always been kind of strange. Is there actually a reason to apply the whole level and keep track of snap & curr? Alternatively, we could just have an iterator of buckets in the correct order of application, and doWork could pick up one bucket at a time to process. That way I think we don't need to track snap and curr at all. This does require quite a bit of refactoring though, so maybe we can at least open an issue to clean this up?
Yeah, I agree that this is worth doing but should happen post-merge. I've opened an issue for it here: #4290.
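For illustration, a rough sketch of that iterator idea (not the #4290 design): flatten the levels into one application-order sequence that doWork can consume one bucket at a time. getCurr/getSnap appear in the diff above; BucketList::getLevel and kNumLevels are assumed accessors here, and where the buckets actually come from during catchup (history state vs. a live BucketList) is glossed over.

#include "bucket/Bucket.h"
#include "bucket/BucketList.h"
#include <memory>
#include <vector>

using namespace stellar;

// Sketch: build one application-order list of buckets so a Work crank can
// pop the next bucket instead of tracking snap/curr per level.
std::vector<std::shared_ptr<Bucket>>
bucketsInApplyOrder(BucketList& bl, bool newestFirst)
{
    std::vector<std::shared_ptr<Bucket>> order;
    if (newestFirst)
    {
        // Newest data first (offers-only apply with a seen-keys set):
        // level 0 curr, level 0 snap, level 1 curr, level 1 snap, ...
        for (uint32_t i = 0; i < BucketList::kNumLevels; ++i)
        {
            order.push_back(bl.getLevel(i).getCurr());
            order.push_back(bl.getLevel(i).getSnap());
        }
    }
    else
    {
        // Oldest data first (the existing reverse-order apply): deepest
        // level's snap, then curr, walking back up to level 0.
        for (uint32_t i = BucketList::kNumLevels; i-- > 0;)
        {
            order.push_back(bl.getLevel(i).getSnap());
            order.push_back(bl.getLevel(i).getCurr());
        }
    }
    return order;
}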
Force-pushed d7adda1 to 574f14f.
Thanks for the updates, LGTM! Just a minor code cleanup request. Please squash commits as well.
    if (!mSpawnedAssumeStateWork)
    // Step 1: index buckets. Step 2: apply buckets. Step 3: assume state
    bool isUsingBucketListDB = mApp.getConfig().isUsingBucketListDB();
    if (isUsingBucketListDB)
nit: I'd prefer if we could write this flow such that it's consistent with Work usage across the codebase:

if (!mIndexBucketsWork)
{
    // Spawn indexing work for the first time
    mIndexBucketsWork = addWork<IndexBucketsWork>(mBucketsToIndex);
    return State::WORK_RUNNING;
}
else if (mIndexBucketsWork->getState() != BasicWork::State::WORK_SUCCESS)
{
    // Exit early if indexing work is still running, or failed
    return mIndexBucketsWork->getState();
}
// Otherwise, continue with next steps
src/catchup/ApplyBucketsWork.cpp (Outdated)
    // If indexing has not finished or finished and failed, return result
    // status. If indexing finished and succeeded, move on and spawn assume
    // state work.
    auto status = checkChildrenStatus();
Please don't use checkChildrenStatus to determine the state of mIndexBucketsWork, as this parent work might have other children, which would cause the function to return unexpected results. To query the state of each individual work, use that work's getState method.
Force-pushed 574f14f to 004a4ba.
Done!
r+ 004a4ba
Description
Resolves #4113
This change improves Bucket apply times by approximately 80% when BucketListDB is enabled. In the current catchup flow, we first apply buckets and then index them. The Bucket apply step scans the entire BucketList even though only offers need to be applied. This PR changes the ordering so that we index buckets first and then apply them, which allows us to use the index to scan only the sections of the BucketList that contain offers.
Bucket apply currently writes every entry it sees to the SQL DB. Because the BucketList is implemented as an LSM tree, many versions of a given entry may exist in the BucketList, with only the most recent version being valid. This means Bucket apply must iterate through the BucketList in reverse order so that the last version of a key written is the most recent one. This causes significant write amplification, as many versions of the same entry are written to disk even though only the most recent version is valid.
When BucketListDB is enabled, this change applies buckets in order (newest first) so that each key needs to be written only once. However, to avoid writing out-of-date entries to the DB, in-order application requires that every key written to the DB be recorded in an in-memory set. This set would be too large if all entry types were being applied, but since BucketListDB requires applying only offer entries, the optimization becomes feasible.
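A minimal sketch of that forward apply loop, under stated assumptions: Bucket, BucketInputIterator, BucketEntry, LedgerKey, and UnorderedSet are the types this PR touches, while keyOf, isLive, liveEntry, and writeOfferToDb are hypothetical helpers standing in for the real applicator logic (stellar-core includes and namespace omitted for brevity).

// Hypothetical helpers, not stellar-core APIs:
LedgerKey keyOf(BucketEntry const& e);            // key of a live or dead entry
bool isLive(BucketEntry const& e);                // true for live entries
LedgerEntry const& liveEntry(BucketEntry const& e);
void writeOfferToDb(Database& db, LedgerEntry const& offer);

// Sketch of newest-first offer apply with a seen-keys set: the first (newest)
// version of each offer key wins, and older versions, or keys already covered
// by a newer tombstone, are skipped.
void
applyOffersNewestFirst(std::vector<std::shared_ptr<Bucket>> const& newestFirst,
                       UnorderedSet<LedgerKey>& seenKeys, Database& db)
{
    for (auto const& bucket : newestFirst)
    {
        for (BucketInputIterator in(bucket); in; ++in)
        {
            BucketEntry const& entry = *in;
            LedgerKey key = keyOf(entry);
            if (key.type() != OFFER)
            {
                // Only offers need to reach SQL when BucketListDB is enabled.
                continue;
            }
            if (!seenKeys.insert(key).second)
            {
                // A newer version (or deletion) of this offer was already
                // handled; writing this one would be out of date.
                continue;
            }
            if (isLive(entry))
            {
                writeOfferToDb(db, liveEntry(entry));
            }
            // Dead (deleted) offers are only recorded in seenKeys so that
            // older live versions of the same offer are never written.
        }
    }
}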
Checklist
clang-format v8.0.0 (via make format or the Visual Studio extension)