fix comments #277

Merged: 1 commit merged into facebook:master on Sep 8, 2014
Conversation

@wankai (Contributor) commented on Sep 8, 2014

No description provided.

igorcanadi added a commit that referenced this pull request Sep 8, 2014
igorcanadi merged commit 02d5bff into facebook:master on Sep 8, 2014
BusyJay pushed a commit to BusyJay/rocksdb that referenced this pull request Jul 25, 2022
Ref facebook#277

When the iterator reads keys in reverse order, each Prev() call costs O(log n). So I added a prev pointer to every node in the skiplist to make Prev() cheaper, as sketched below.
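A minimal sketch of the idea, with assumed names (`Node`, `SpliceLevel0`; this is not the fork's actual InlineSkipList code): each node carries a backward link at the bottom level, so Prev() becomes a single pointer chase instead of a fresh O(log n) descent from the head.

```cpp
#include <vector>

struct Node {
  int key;
  Node* prev;               // backward link at level 0
  std::vector<Node*> next;  // forward links, one per level
  Node(int k, int height) : key(k), prev(nullptr), next(height, nullptr) {}
};

// Splice `x` in after `pred` at level 0, maintaining the backward links.
// This is the extra write-side cost the prev pointer introduces.
void SpliceLevel0(Node* pred, Node* x) {
  x->prev = pred;
  x->next[0] = pred->next[0];
  if (x->next[0] != nullptr) x->next[0]->prev = x;
  pred->next[0] = x;
}

struct Iterator {
  Node* node;
  void Next() { node = node->next[0]; }  // O(1), as before
  void Prev() { node = node->prev; }     // now O(1) instead of O(log n)
};
```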

Signed-off-by: Little-Wallace [email protected]

Implemented new virtual functions (signatures sketched below):
- `InsertWithHintConcurrently`
- `FindRandomEntry`
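For orientation, a hedged sketch of how such overrides might be declared. The signatures are assumptions: upstream RocksDB's `MemTableRep` does have an `InsertWithHintConcurrently(KeyHandle, void**)` hook, but this fork's exact interface isn't shown in the thread.

```cpp
// Hypothetical declarations only -- not this fork's actual code.
class PrevLinkedSkipListRep /* : public MemTableRep */ {
 public:
  using KeyHandle = void*;

  // Concurrent insert that may reuse a caller-provided insertion hint
  // (e.g. the splice left behind by a previous insert nearby).
  virtual void InsertWithHintConcurrently(KeyHandle handle, void** hint) = 0;

  // Returns a randomly sampled entry, e.g. to seed a random-seek iterator.
  virtual const char* FindRandomEntry() const = 0;

  virtual ~PrevLinkedSkipListRep() = default;
};
```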

Signed-off-by: tabokie <[email protected]>
rockeet pushed a commit to rockeet/rocksdb that referenced this pull request Oct 5, 2024
…facebook#277)

A new per-file entity called the epoch number was introduced in v7.10; it is used to sort L0 files (previously sorted by largest_seqno). We need to make sure this number matches between leader and follower, which is quite tricky during a leaf deploy.

* The leader is on the new version while the follower is on the old version. This is actually fine: the epoch number is simply skipped on the follower.
* The follower is on the new version while the leader is on the old version. This is the tricky one, and this entire PR deals with it.

The hack we use is to make the follower ignore the epoch number from the leader and instead calculate it on the fly. There are two cases (sketched after the list):

* Flush, which generates L0 files: the epoch number is allocated from each CF's next_epoch_number, with the L0 files ordered by largest seqno.
* Compaction, which merges files from lower levels into higher levels: epoch number = min epoch number of the input files.
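A hedged sketch of the follower's on-the-fly assignment for those two cases (names like `FileMetaData` and `next_epoch_number` mirror the description above; this is not the PR's code):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct FileMetaData {
  uint64_t epoch_number = 0;
  uint64_t largest_seqno = 0;
};

// Flush: new L0 files receive consecutive epochs from the CF counter,
// handed out in largest-seqno order so epoch order matches seqno order.
void AssignEpochsOnFlush(std::vector<FileMetaData*>& new_l0_files,
                         uint64_t& next_epoch_number) {
  std::sort(new_l0_files.begin(), new_l0_files.end(),
            [](const FileMetaData* a, const FileMetaData* b) {
              return a->largest_seqno < b->largest_seqno;
            });
  for (FileMetaData* f : new_l0_files) {
    f->epoch_number = next_epoch_number++;
  }
}

// Compaction: the output inherits the minimum epoch of its input files.
uint64_t EpochForCompactionOutput(const std::vector<FileMetaData*>& inputs) {
  uint64_t min_epoch = UINT64_MAX;
  for (const FileMetaData* f : inputs) {
    min_epoch = std::min(min_epoch, f->epoch_number);
  }
  return min_epoch;
}
```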
This is mostly fine, except when the db is reopened. RocksDB doesn't track next_epoch_number in the manifest file; on reopen it recalculates next_epoch_number from the max epoch number of the existing live files, so next_epoch_number can go backwards across a reopen. Two cases are possible:

* The leader reopens its db, causing next_epoch_number on the leader to go backwards, so the follower needs to rewind its counter.
* The follower reopens its db, causing next_epoch_number on the follower to go backwards, so the follower needs to advance its counter.

A new replication option, AR_RESET_IF_EPOCH_MISMATCH, is added to help with this. Rewind and advance are handled carefully to make sure they don't break the existing file ordering, as in the sketch below.
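A heavily simplified, hypothetical sketch of that invariant (`ReconcileNextEpoch` is an illustrative name, not the PR's code): whichever direction the counter moves, it must never hand out an epoch that an existing live file already reaches.

```cpp
#include <algorithm>
#include <cstdint>

// `target` is where the counter should land after a reopen, and
// `max_live_epoch` is the largest epoch_number among live files.
uint64_t ReconcileNextEpoch(uint64_t target, uint64_t max_live_epoch) {
  // Rewinding below max_live_epoch + 1 would let a future file get an
  // epoch no larger than an existing file's, breaking the L0 ordering.
  // So the counter rewinds only as far as the live files allow, and
  // advances freely.
  return std::max(target, max_live_epoch + 1);
}
```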

The change in this PR is quite hacky, but fortunately we can remove most of it once RocksDB is fully upgraded.
rockeet pushed a commit to rockeet/rocksdb that referenced this pull request Oct 5, 2024
Follow up for: facebook#277

We've been running with epoch number replication for weeks. Time to delete the unnecessary hacky code.