-
Notifications
You must be signed in to change notification settings - Fork 480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
internal/record: write an EOF trailer on LogWriter Close #871
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this approach. Two small simplification suggestions below.
Reviewable status: 0 of 5 files reviewed, 2 unresolved discussions (waiting on @jbowens)
open.go, line 271 at r1 (raw file):
// Invalid records are expected from recycled WALs, but only in the // most recent WAL. if err != nil && (!record.IsInvalidRecord(err) || i < len(logFiles)-1) {
I wonder if it would be cleaner to pass in lastWAL = (i == len(logFiles)-1))
as a parameter to replayWAL
. That way we'd localize the error handling logic rather than having it split between replayWAL
and Open
.
internal/record/record.go, line 248 at r1 (raw file):
// These chunks prevent us from misinterpreting garbage at the // tail of a WAL from recycling. if chunkType == recyclableEOFChunkType {
Rather than adding a new chunk type, I wonder if we can write a chunk that we know will be treated as io.EOF
. It looks like if we write a chunk with logNum != r.logNum
then we'd get this behavior. The small benefit to that is that we wouldn't be adding a new chunk type.
eb829fd
to
b7e7207
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 6 files reviewed, 2 unresolved discussions (waiting on @petermattis)
open.go, line 271 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
I wonder if it would be cleaner to pass in
lastWAL = (i == len(logFiles)-1))
as a parameter toreplayWAL
. That way we'd localize the error handling logic rather than having it split betweenreplayWAL
andOpen
.
Done.
internal/record/record.go, line 248 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Rather than adding a new chunk type, I wonder if we can write a chunk that we know will be treated as
io.EOF
. It looks like if we write a chunk withlogNum != r.logNum
then we'd get this behavior. The small benefit to that is that we wouldn't be adding a new chunk type.
I gave it a try, using zero for the log number since we start file numbers at 1. Many of tests used a zero log number, so I updated them to use a non-zero one. This approach does feels a little less obvious, but it is nice that it doesn't require a change to the log reader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 6 files reviewed, 2 unresolved discussions (waiting on @jbowens)
internal/record/log_writer.go, line 630 at r2 (raw file):
binary.LittleEndian.PutUint16(b.buf[i+4:i+6], 0) // Size b.buf[i+6] = recyclableFullChunkType binary.LittleEndian.PutUint32(b.buf[i+7:i+11], 0) // Log number
I think using zero here might be problematic. The issue is that the log number field is 32-bits. The 64-bit log number is truncated to 32-bits in NewLogWriter
:
// NB: we truncate the 64-bit log number to 32-bits. This is ok because a)
// we are very unlikely to reach a file number of 4 billion and b) the log
// number is used as a validation check and using only the low 32-bits is
// sufficient for that purpose.
While we are very unlikely to ever reach a log number of 4 billion, it isn't impossible if there is some sort of tight loop of filenum allocations due to some error condition. Rather than using zero, can we use w.logNum+1
or ^w.logNum
. I think any value that isn't w.logNum
works.
internal/record/record.go, line 248 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
I gave it a try, using zero for the log number since we start file numbers at 1. Many of tests used a zero log number, so I updated them to use a non-zero one. This approach does feels a little less obvious, but it is nice that it doesn't require a change to the log reader.
I think using zero might be problematic. See my other comment for why and an alternative. The alternative might also remove the need to change the tests.
I think we might want to add a test for using the log num 1 << 32
so we explicitly test the case I'm worried about.
b7e7207
to
c748598
Compare
While closing a LogWriter, write a special EOF record with an incremented log number signifying the end of the log. This EOF record will overwrite trailing garbage from the recycled log file, allowing us to detect corruption errors in any logs earlier than the most recent one. In Open return an error if any WAL other than the most recent one encounters an invalid chunk.
c748598
to
59020cb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 4 files reviewed, 2 unresolved discussions (waiting on @petermattis)
internal/record/log_writer.go, line 630 at r2 (raw file):
Previously, petermattis (Peter Mattis) wrote…
I think using zero here might be problematic. The issue is that the log number field is 32-bits. The 64-bit log number is truncated to 32-bits in
NewLogWriter
:// NB: we truncate the 64-bit log number to 32-bits. This is ok because a) // we are very unlikely to reach a file number of 4 billion and b) the log // number is used as a validation check and using only the low 32-bits is // sufficient for that purpose.
While we are very unlikely to ever reach a log number of 4 billion, it isn't impossible if there is some sort of tight loop of filenum allocations due to some error condition. Rather than using zero, can we use
w.logNum+1
or^w.logNum
. I think any value that isn'tw.logNum
works.
Thanks for catching that! Fixed.
internal/record/record.go, line 248 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
I think using zero might be problematic. See my other comment for why and an alternative. The alternative might also remove the need to change the tests.
I think we might want to add a test for using the log num
1 << 32
so we explicitly test the case I'm worried about.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 4 files reviewed, all discussions resolved
TFTR! |
In cockroachdb#871, Open started interpreting any WAL older than the most recent one more strictly, erroring if it encountered any invalid chunks while reading the log. This was enabled by a new EOF trailer chunk that's written when a WAL is closed, ensuring that previous WALs end cleanly and unambiguously. However, during a WAL rotation, a crash before the previous WAL was cleanly closed could leave the previous WAL with recycled garbage at its tail. This change updates WAL rotation to close the previous WAL first, then create/rename the next WAL.
In cockroachdb#871, Open started interpreting any WAL older than the most recent one more strictly, erroring if it encountered any invalid chunks while reading the log. This was enabled by a new EOF trailer chunk that's written when a WAL is closed, ensuring that previous WALs end cleanly and unambiguously. However, during a WAL rotation, a crash before the previous WAL was cleanly closed could leave the previous WAL with recycled garbage at its tail. This change updates WAL rotation to close the previous WAL first, then create/rename the next WAL.
In #871, Open started interpreting any WAL older than the most recent one more strictly, erroring if it encountered any invalid chunks while reading the log. This was enabled by a new EOF trailer chunk that's written when a WAL is closed, ensuring that previous WALs end cleanly and unambiguously. However, during a WAL rotation, a crash before the previous WAL was cleanly closed could leave the previous WAL with recycled garbage at its tail. This change updates WAL rotation to close the previous WAL first, then create/rename the next WAL.
In cockroachdb#871, Open started interpreting any WAL older than the most recent one more strictly, erroring if it encountered any invalid chunks while reading the log. This was enabled by a new EOF trailer chunk that's written when a WAL is closed, ensuring that previous WALs end cleanly and unambiguously. However, during a WAL rotation, a crash before the previous WAL was cleanly closed could leave the previous WAL with recycled garbage at its tail. This change updates WAL rotation to close the previous WAL first, then create/rename the next WAL.
In #871, Open started interpreting any WAL older than the most recent one more strictly, erroring if it encountered any invalid chunks while reading the log. This was enabled by a new EOF trailer chunk that's written when a WAL is closed, ensuring that previous WALs end cleanly and unambiguously. However, during a WAL rotation, a crash before the previous WAL was cleanly closed could leave the previous WAL with recycled garbage at its tail. This change updates WAL rotation to close the previous WAL first, then create/rename the next WAL.
While closing a LogWriter, write a special EOF record signifying the end
of the log. This EOF record will overwrite trailing garbage from the
recycled log file, allowing us to detect corruption errors in any WALs
earlier than the most recent one.
In Open return an error if any WAL other than the most recent one
encounters an invalid chunk.
Fix #864.