-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Confirm badgerMove
entry required before rewrite
#1302
Confirm badgerMove
entry required before rewrite
#1302
Conversation
Co-Authored-By: Connor Hindley <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @ou05020 . I'll get this PR reviewed by @manishrjain
Reviewable status: 0 of 1 files reviewed, 1 unresolved discussion (waiting on @ashish-goswami, @jarifibrahim, @manishrjain, and @ou05020)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be expensive. Instead, we should check why the current mechanisms of movekey removal don't work.
Reviewable status: 0 of 1 files reviewed, 1 unresolved discussion (waiting on @ashish-goswami, @manishrjain, and @ou05020)
@manishrjain However, the discard calculation includes movekey entry size in the total number of bytes (not the discard bytes) in the discard ratio calculation even if that movekey is for a key that has been deleted. If your discard ratio is 50% and half of a file is used by movekey entries for deleted keys, it will never satisfy the discard ratio allowing it to be rewritten and subsequent movekey deletion. There are only two approaches I can think of to solve the problem. Always check if underlying key for each movekey has been deleted to determine if an entry should be discarded. Or, automatically write a delete I added db_test.go If you update |
The Line 685 in 62b7a10
because the current implementation of iterators does not return duplicates (keys with same version).
After rewriting value log file
The badger/table/merge_iterator.go Lines 153 to 159 in 62b7a10
The iterator will see only the vlog;y key because it is at the top.
Even if the iterator were to not skip this key, we will end up losing data if
Compaction would drop all the keys except the one with the delete marker and a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @ashish-goswami and @ou05020)
Thanks for fixing this @ou05020 🎉 |
Value log badgerMove entries for deleted keys are always rewritten. This leads to exponential DB size growth in heavy write/delete use cases with value pointer entries. To prevent this, a check has been added to confirm the value is still needed for badgerMove entries. (cherry picked from commit 3e1cdd9)
) (#1502) * Confirm `badgerMove` entry required before rewrite (#1302) Value log badgerMove entries for deleted keys are always rewritten. This leads to exponential DB size growth in heavy write/delete use cases with value pointer entries. To prevent this, a check has been added to confirm the value is still needed for badgerMove entries. (cherry picked from commit 3e1cdd9) * remove KeepL0InMemory in db_test Co-authored-by: Julian Hegler <[email protected]>
Value log badgerMove entries for deleted keys are always rewritten. This leads to exponential DB size growth in heavy write/delete use cases with value pointer entries. To prevent this, a check has been added to confirm the value is still needed for badgerMove entries.
…raph-io#1302) (dgraph-io#1502) * Confirm `badgerMove` entry required before rewrite (dgraph-io#1302) Value log badgerMove entries for deleted keys are always rewritten. This leads to exponential DB size growth in heavy write/delete use cases with value pointer entries. To prevent this, a check has been added to confirm the value is still needed for badgerMove entries. (cherry picked from commit 3e1cdd9) * remove KeepL0InMemory in db_test Co-authored-by: Julian Hegler <[email protected]>
Value log
badgerMove
entries for deleted keys are always rewritten.This leads to exponential DB size growth in heavy write/delete use cases with value pointer entries.
To prevent this, a check has been added to confirm the value is still needed for
badgerMove
entries.This change is