Change the behavior of persisted index for snapshot #410

gengliqi · 2020-12-10T10:00:13Z

Signed-off-by: gengliqi [email protected]

In the previous async ready PR (#403), after getting a snapshot and calling the restore function, persisted is changed to the index of this snapshot. As a result, the persisted index no longer means that the data before this index is really persisted, which is not intuitive.
The reason for this behavior was to maintain the invariant that applied <= min(committed, persisted). However, I find it is not needed if the application calls advance_append or on_persist_ready and then calls advance_apply_to. This order is intuitive: a snapshot being persisted does not mean it has been applied, so it should be persisted first and then applied.
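
To make the invariant and the "persist first, then apply" order concrete, here is a minimal sketch. It is a toy model, not the raft-rs `RaftLog`: the `Indices` struct and the `persist_to` and `apply_to` methods are hypothetical stand-ins for the effect of `advance_append`/`on_persist_ready` and `advance_apply_to`, and the index values are made up.

```rust
/// Toy model of the three indices discussed above; this is not the raft-rs `RaftLog`.
#[derive(Debug, Default)]
pub struct Indices {
    pub committed: u64,
    pub persisted: u64,
    pub applied: u64,
}

impl Indices {
    /// The invariant from the description: applied <= min(committed, persisted).
    pub fn invariant_holds(&self) -> bool {
        self.applied <= self.committed.min(self.persisted)
    }

    /// Hypothetical stand-in for what `advance_append` / `on_persist_ready` report.
    pub fn persist_to(&mut self, index: u64) {
        self.persisted = self.persisted.max(index);
    }

    /// Hypothetical stand-in for `advance_apply_to`: refuses to apply past
    /// min(committed, persisted), so the invariant cannot be broken.
    pub fn apply_to(&mut self, index: u64) -> Result<(), String> {
        let limit = self.committed.min(self.persisted);
        if index > limit {
            return Err(format!("cannot apply {} past {}", index, limit));
        }
        self.applied = self.applied.max(index);
        Ok(())
    }
}

/// A snapshot at index 10 has been received but not yet written to disk.
pub fn snapshot_then_apply() -> (bool, bool) {
    let mut idx = Indices { committed: 10, persisted: 3, applied: 3 };
    // Applying before persisting is rejected.
    let rejected_early = idx.apply_to(10).is_err();
    // Persist first, then apply.
    idx.persist_to(10);
    let applied_late = idx.apply_to(10).is_ok() && idx.invariant_holds();
    (rejected_early, applied_late)
}
```

In this model the invariant is preserved by call order alone, without moving persisted ahead of what is actually on disk.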

Signed-off-by: gengliqi <[email protected]>

gengliqi · 2020-12-10T10:01:03Z

@BusyJay @NingLin-P PTAL, thanks.

BusyJay · 2020-12-10T13:28:19Z

So if persisted is 2048 and a snapshot resets the last index back to 2040, will the following new entries that are committed but not yet saved be fetched into the ready directly?

gengliqi · 2020-12-10T15:45:14Z

> So if persisted is 2048 and a snapshot resets the last index back to 2040, will the following new entries that are committed but not yet saved be fetched into the ready directly?

It won't happen because the persisted should be changed when appending new entries.

It seems hard to make persisted the index that is really persisted. Maybe there is no need to change it?

BusyJay · 2020-12-11T02:03:27Z

> It won't happen because the persisted should be changed when appending new entries.

It's not true. Entries after snapshot won't cause conflict, so persisted won't be changed.

gengliqi · 2020-12-11T02:41:48Z

> > It won't happen because the persisted should be changed when appending new entries.
>
> It's not true. Entries after snapshot won't cause conflict, so persisted won't be changed.

If there is no conflict, persisted is also changed.

raft-rs/src/raft_log.rs

Lines 227 to 232 in fc1ef2f

    
            let start = (conflict_idx - (idx + 1)) as usize;
            self.append(&ents[start..]);
            // persisted should be decreased because entries are changed
            if self.persisted > conflict_idx - 1 {
                self.persisted = conflict_idx - 1;
            }

BusyJay · 2020-12-11T04:53:39Z

If there is no conflict, conflict_idx will be zero, so the else branch will be skipped.

raft-rs/src/raft_log.rs

Lines 217 to 236 in fc1ef2f

    
        let conflict_idx = self.find_conflict(ents);
        if conflict_idx == 0 {
        } else if conflict_idx <= self.committed {
            fatal!(
                self.unstable.logger,
                "entry {} conflict with committed entry {}",
                conflict_idx,
                self.committed
            )
        } else {
            let start = (conflict_idx - (idx + 1)) as usize;
            self.append(&ents[start..]);
            // persisted should be decreased because entries are changed
            if self.persisted > conflict_idx - 1 {
                self.persisted = conflict_idx - 1;
            }
        }
        let last_new_index = idx + ents.len() as u64;
        self.commit_to(cmp::min(committed, last_new_index));
        return Some((conflict_idx, last_new_index));

gengliqi · 2020-12-11T08:46:48Z

> If there is no conflict, conflict_idx will be zero, so the else branch will be skipped.

Hmm, if it's zero, there are no new entries.
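
The point of this exchange can be sketched with a toy log. The sketch assumes, as the reply above implies, that `find_conflict` returns zero only when every given entry is already in the log, and otherwise returns the index of the first entry that is new or has a different term. `ToyLog`, `Entry`, and `term_at` are hypothetical names for illustration; this is not the raft-rs implementation.

```rust
/// A log entry reduced to the two fields that matter here.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Entry {
    pub index: u64,
    pub term: u64,
}

/// Toy log in which `entries[i]` has index `i + 1`. Not the raft-rs `RaftLog`.
pub struct ToyLog {
    pub entries: Vec<Entry>,
    pub committed: u64,
    pub persisted: u64,
}

impl ToyLog {
    fn term_at(&self, index: u64) -> Option<u64> {
        if index == 0 {
            return None;
        }
        self.entries.get((index - 1) as usize).map(|e| e.term)
    }

    /// Returns 0 when every given entry is already in the log; otherwise the
    /// index of the first entry that is new or whose term differs.
    pub fn find_conflict(&self, ents: &[Entry]) -> u64 {
        for e in ents {
            if self.term_at(e.index) != Some(e.term) {
                return e.index;
            }
        }
        0
    }

    /// Appends `ents` (assumed contiguous) and lowers `persisted` if needed.
    pub fn maybe_append(&mut self, ents: &[Entry]) -> u64 {
        let conflict_idx = self.find_conflict(ents);
        if conflict_idx != 0 {
            assert!(
                conflict_idx > self.committed,
                "entry {} conflict with committed entry {}",
                conflict_idx,
                self.committed
            );
            // Drop everything from conflict_idx on, then append the new suffix.
            self.entries.truncate((conflict_idx - 1) as usize);
            let start = ents.iter().position(|e| e.index == conflict_idx).unwrap();
            self.entries.extend_from_slice(&ents[start..]);
            // persisted is lowered because entries from conflict_idx on changed.
            if self.persisted > conflict_idx - 1 {
                self.persisted = conflict_idx - 1;
            }
        }
        conflict_idx
    }
}
```

Under that assumption, a log whose last index is 3 while persisted is still 5 (the shape of the 2048/2040 case) gets persisted lowered to 3 as soon as entry 4 is appended, because entry 4 counts as "new" and `conflict_idx` is 4, not zero.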

gengliqi · 2020-12-11T08:48:52Z

By the way, I think we should set persisted to make the logic more intuitive, but setting it to index - 1 does not seem good enough. I have no better idea.

BusyJay · 2020-12-11T11:01:24Z

How about setting it to zero, which means unknown? Or use Option<NonZeroU64> and make it None.

Signed-off-by: gengliqi <[email protected]>

gengliqi · 2020-12-17T05:00:37Z

> How about setting it to zero, which means unknown? Or use Option<NonZeroU64> and make it None.

It has problems because there is an invariant that applied <= min(committed, persisted).

After thinking for a while, I believe the most suitable approach is `if self.persisted > self.committed { self.persisted = self.committed; }`, so that we can say persisted is the really persisted index at all times. See the comments in the code for details.

Also, I added a test for the case mentioned before:

1. Append some entries (index 20) and persist them.
2. Receive a snapshot (index 10), then some new entries (up to index 13).
3. Get a ready and check it. If there is a snapshot and some new entries can also be applied, the code should panic in RawNode::ready at:

assert!(
    !raft
        .raft_log
        .has_next_entries_since(self.commit_since_index),
    "has snapshot but also has committed entries since {}",
    self.commit_since_index
);
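
The three ways of resetting persisted on restore that come up in this thread can be compared in a small sketch. Everything here is a toy model with made-up index values: `ResetPolicy` and `Indices` are hypothetical names, and the sketch assumes the clamp is evaluated against the committed index as it was before the snapshot is installed, which the thread does not spell out.

```rust
/// How `persisted` is reset when a snapshot is restored (toy model).
#[derive(Clone, Copy, Debug)]
pub enum ResetPolicy {
    /// Behavior described for #403: jump to the snapshot index.
    SnapshotIndex,
    /// Suggested alternative: zero, meaning "unknown".
    Zero,
    /// Proposal in this comment: clamp to `committed`.
    ClampToCommitted,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Indices {
    pub committed: u64,
    pub persisted: u64,
    pub applied: u64,
}

impl Indices {
    /// applied <= min(committed, persisted)
    pub fn invariant_holds(&self) -> bool {
        self.applied <= self.committed.min(self.persisted)
    }

    /// Installs a snapshot at `snap_index` using the given reset policy.
    pub fn restore(&mut self, snap_index: u64, policy: ResetPolicy) {
        match policy {
            ResetPolicy::SnapshotIndex => self.persisted = snap_index,
            ResetPolicy::Zero => self.persisted = 0,
            // Assumption: the clamp uses the pre-snapshot `committed`.
            ResetPolicy::ClampToCommitted => {
                if self.persisted > self.committed {
                    self.persisted = self.committed;
                }
            }
        }
        self.committed = snap_index;
    }
}

/// Entries up to 20 are on disk, 8 are committed and applied, then a
/// snapshot at index 10 arrives. Returns (persisted, invariant_holds).
pub fn after_restore(policy: ResetPolicy) -> (u64, bool) {
    let mut idx = Indices { committed: 8, persisted: 20, applied: 8 };
    idx.restore(10, policy);
    (idx.persisted, idx.invariant_holds())
}
```

In this model, `SnapshotIndex` keeps the invariant but reports index 10 as persisted before the snapshot is on disk, `Zero` breaks the invariant because applied is already 8, and `ClampToCommitted` keeps the invariant while persisted only names data known to be on disk.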

gengliqi · 2020-12-17T05:21:34Z

@BusyJay @NingLin-P PTAL again, thanks.

Signed-off-by: gengliqi <[email protected]>

BusyJay

Rest LGTM

BusyJay · 2020-12-17T09:21:22Z

src/raft.rs

+        // the last index of entries from previous leader when it becomes leader
+        // (see the comments in become_leader), namely, the new persisted entries
+        // must come from this leader. Here checking the term just for robustness.
+        if update && self.state == StateRole::Leader && term == self.term {


How about using a log? Even if the term doesn't match self.term in a future adaptation, for example when paging is introduced, it's still safe to enter the if branch.

BusyJay · 2020-12-17T09:28:11Z

src/raw_node.rs

@@ -565,6 +569,11 @@ impl<T: Storage> RawNode<T> {
            }
            let mut record = self.records.pop_front().unwrap();

+            if let Some((i, t)) = record.snapshot {
+                index = i;


Any case to cover?

Added by changing test_async_ready_follower

NingLin-P

LGTM

Signed-off-by: gengliqi <[email protected]>

This PR fixes a bug introduced by #410. Consider the case below:

1. A receives a snapshot with index 10.
2. A gets a ready and handles it asynchronously.
3. A receives a new snapshot with index 20.
4. A calls on_persist_ready for the ready obtained in step 2.

In step 4, the persisted index cannot be updated to 10 because the first_index has changed to 21, so the term check cannot pass (details in `RaftLog::term`). I add a `maybe_persist_snap` function to fix this problem. The snapshot does not need to check the term because its data must have been committed before and cannot be changed in the future.

Signed-off-by: gengliqi <[email protected]>
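
The failure in step 4 can be replayed on a toy log. This is only a sketch of the idea: `ToyLog`, `persist_with_term_check`, and `persist_snap_unchecked` are hypothetical names, the terms 2 and 3 are made up, and the real `maybe_persist_snap` may carry additional checks that are not shown in this thread.

```rust
/// Toy log that keeps only the latest snapshot and the terms of entries after it.
pub struct ToyLog {
    snap_index: u64,
    snap_term: u64,
    /// `entry_terms[i]` is the term of the entry at index `snap_index + 1 + i`.
    entry_terms: Vec<u64>,
    pub persisted: u64,
}

impl ToyLog {
    pub fn new() -> Self {
        ToyLog { snap_index: 0, snap_term: 0, entry_terms: Vec::new(), persisted: 0 }
    }

    pub fn first_index(&self) -> u64 {
        self.snap_index + 1
    }

    /// Term lookup fails for indices that a later snapshot has covered.
    pub fn term(&self, index: u64) -> Option<u64> {
        if index == self.snap_index {
            return Some(self.snap_term);
        }
        if index < self.first_index() {
            return None;
        }
        self.entry_terms.get((index - self.first_index()) as usize).copied()
    }

    /// Installing a snapshot discards the entries and moves first_index.
    pub fn restore(&mut self, index: u64, term: u64) {
        self.snap_index = index;
        self.snap_term = term;
        self.entry_terms.clear();
    }

    /// Term-checked update, as used for ordinary entries (hypothetical name).
    pub fn persist_with_term_check(&mut self, index: u64, term: u64) -> bool {
        if index > self.persisted && self.term(index) == Some(term) {
            self.persisted = index;
            return true;
        }
        false
    }

    /// Update without a term check, the idea behind `maybe_persist_snap`.
    pub fn persist_snap_unchecked(&mut self, index: u64) -> bool {
        if index > self.persisted {
            self.persisted = index;
            return true;
        }
        false
    }
}

/// Replays steps 1 to 4; returns (checked_ok, unchecked_ok, persisted).
pub fn replay_scenario() -> (bool, bool, u64) {
    let mut log = ToyLog::new();
    log.restore(10, 2); // step 1: snapshot with index 10
    let ready = (10u64, 2u64); // step 2: the ready remembers (index, term)
    log.restore(20, 3); // step 3: first_index becomes 21
    let checked_ok = log.persist_with_term_check(ready.0, ready.1); // step 4
    let unchecked_ok = log.persist_snap_unchecked(ready.0);
    (checked_ok, unchecked_ok, log.persisted)
}
```

Here the term-checked path fails because index 10 is below first_index 21 and no term can be looked up, while the unchecked path moves persisted to 10.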

gengliqi added 2 commits December 10, 2020 15:18

change snapshot persisted

b4d6adf

Signed-off-by: gengliqi <[email protected]>

Merge branch 'master' into change-snapshot-persisted

3139b2d

update persisted

9324e81

Signed-off-by: gengliqi <[email protected]>

gengliqi force-pushed the change-snapshot-persisted branch from bb7ca46 to 9324e81 Compare December 17, 2020 04:44

cargo format

b6a29d8

Signed-off-by: gengliqi <[email protected]>

BusyJay reviewed Dec 17, 2020

NingLin-P previously approved these changes Dec 17, 2020

address comments

148e516

Signed-off-by: gengliqi <[email protected]>

gengliqi dismissed NingLin-P’s stale review via 148e516 December 17, 2020 14:07

gengliqi added 3 commits December 17, 2020 22:35

add snapshot case

c26c060

Signed-off-by: gengliqi <[email protected]>

cargo fmt

12949ab

Signed-off-by: gengliqi <[email protected]>

address comment

147108d

Signed-off-by: gengliqi <[email protected]>

gengliqi force-pushed the change-snapshot-persisted branch from f5e2215 to 147108d Compare December 17, 2020 15:50

BusyJay approved these changes Dec 17, 2020

NingLin-P approved these changes Dec 17, 2020

BusyJay merged commit a45c4a3 into tikv:master Dec 18, 2020

gengliqi mentioned this pull request Jan 22, 2021

fix persisted index bug of snapshot #417

Merged

gengliqi mentioned this pull request Jun 16, 2021

*: bump 0.6.0 #443

Merged
