fix: ignore truncation target range as flush not operated on time #131

Merged
merged 3 commits into from
Jul 24, 2024

Conversation

krish-nr
Contributor

@krish-nr krish-nr commented Jul 24, 2024

Description

Ignore the truncation target range when the flush has not been performed in time.

Rationale

The bufferlist hits an update failure when an unclean shutdown happens, which leads to a panic. Skip the range check in this scenario.

Example

N/A

Changes

@krish-nr krish-nr marked this pull request as ready for review July 24, 2024 01:40
@github-actions github-actions bot requested review from bnoieh and redhdx July 24, 2024 01:40
@krish-nr krish-nr requested review from joeylichang and sysvm July 24, 2024 01:41
@will-2012
Contributor

		if _, ok := dl.buffer.(*nodebufferlist); ok {
			persistentID := rawdb.ReadPersistentStateID(dl.db.diskdb)
			if persistentID > limit {
				oldest = persistentID - limit + 1
				log.Info("Forcing prune ancient under nodebufferlist", "disk_persistent_state_id",
					persistentID, "truncate_tail", oldest)
			} else {
				log.Info("No prune ancient under nodebufferlist, less than db config state history limit")
				return ndl, nil
			}
		}

==>

		if _, ok := dl.buffer.(*nodebufferlist); ok {
			persistentID := rawdb.ReadPersistentStateID(dl.db.diskdb)
			// Nothing to prune while the persisted state ID is within the history limit.
			if limit >= persistentID {
				log.Info("No prune ancient under nodebufferlist, less than db config state history limit", "persistent_id", persistentID, "limit", limit)
				return ndl, nil
			}
			targetOldest := persistentID - limit + 1
			realOldest, err := dl.db.freezer.Tail()
			// Skip pruning when the target tail does not lie beyond the real freezer tail.
			if err == nil && targetOldest <= realOldest {
				log.Info("No prune ancient under nodebufferlist: target oldest is not greater than real oldest, which may happen after an abnormal restart",
					"target_oldest_id", targetOldest, "real_oldest_id", realOldest, "error", err)
				return ndl, nil
			}
			oldest = targetOldest
			log.Info("Forcing prune ancient under nodebufferlist", "disk_persistent_state_id",
				persistentID, "truncate_tail", oldest)
		}

may be better.

The problem probably occurs because there can be a gap between writing the WAL and writing the state ID. The WAL is written when a commit reaches the disklayer, whereas the state ID is written during the disklayer's background flush. As a result, the state ID returned by ReadPersistentStateID may be smaller than the actual WAL head, and stateid - limit may be smaller than the actual WAL tail.
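To make the guard easier to reason about, here is a minimal standalone sketch of the same decision, with the database and freezer reads replaced by plain arguments. The helper name `pruneTarget` and its parameters are hypothetical and only for illustration; they are not part of the codebase.

```go
package main

import "fmt"

// pruneTarget is a hypothetical helper that mirrors the checks suggested above.
// persistentID stands in for rawdb.ReadPersistentStateID, limit for the
// configured state history limit, and realTail for the freezer tail.
// It returns the new tail and true when pruning should proceed.
func pruneTarget(persistentID, limit, realTail uint64) (uint64, bool) {
	// Checking this first also prevents uint64 underflow in the subtraction below.
	if limit >= persistentID {
		return 0, false
	}
	target := persistentID - limit + 1
	// The persisted state ID can lag behind the WAL, so the computed target
	// may not lie beyond the real tail; skip pruning in that case.
	if target <= realTail {
		return 0, false
	}
	return target, true
}

func main() {
	tail, ok := pruneTarget(100, 10, 50)
	fmt.Println(tail, ok) // target 91 is beyond tail 50, so prune

	tail, ok = pruneTarget(100, 10, 95)
	fmt.Println(tail, ok) // target 91 is not beyond tail 95, so skip

	tail, ok = pruneTarget(5, 10, 0)
	fmt.Println(tail, ok) // fewer histories than the limit, so skip
}
```

Unlike the suggested code, this sketch does not model a failing Tail() call; there, a non-nil error falls through to the forced prune.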

will-2012 previously approved these changes Jul 24, 2024
@will-2012
Contributor

will-2012 commented Jul 24, 2024

Please refine the PR title and description; "as unclean shutdown happens" is misleading.

@krish-nr krish-nr merged commit 0c45fd0 into bnb-chain:develop Jul 24, 2024
1 check passed
@krish-nr krish-nr changed the title fix: ignore truncation target range as unclean shutdown happens fix: ignore truncation target range as flush not operated on time Jul 24, 2024