internal/record: write an EOF trailer on LogWriter Close #871

jbowens · 2020-09-01T17:45:33Z

While closing a LogWriter, write a special EOF record signifying the end
of the log. This EOF record will overwrite trailing garbage from the
recycled log file, allowing us to detect corruption errors in any WALs
earlier than the most recent one.

In Open return an error if any WAL other than the most recent one
encounters an invalid chunk.

Fix #864.

cockroach-teamcity · 2020-09-01T17:45:39Z

This change is

petermattis

I like this approach. Two small simplification suggestions below.

Reviewable status: 0 of 5 files reviewed, 2 unresolved discussions (waiting on @jbowens)

open.go, line 271 at r1 (raw file):

		// Invalid records are expected from recycled WALs, but only in the
		// most recent WAL.
		if err != nil && (!record.IsInvalidRecord(err) || i < len(logFiles)-1) {

I wonder if it would be cleaner to pass in lastWAL = (i == len(logFiles)-1)) as a parameter to replayWAL. That way we'd localize the error handling logic rather than having it split between replayWAL and Open.

internal/record/record.go, line 248 at r1 (raw file):

				// These chunks prevent us from misinterpreting garbage at the
				// tail of a WAL from recycling.
				if chunkType == recyclableEOFChunkType {

Rather than adding a new chunk type, I wonder if we can write a chunk that we know will be treated as io.EOF. It looks like if we write a chunk with logNum != r.logNum then we'd get this behavior. The small benefit to that is that we wouldn't be adding a new chunk type.

jbowens

Reviewable status: 0 of 6 files reviewed, 2 unresolved discussions (waiting on @petermattis)

open.go, line 271 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I wonder if it would be cleaner to pass in lastWAL = (i == len(logFiles)-1)) as a parameter to replayWAL. That way we'd localize the error handling logic rather than having it split between replayWAL and Open.

Done.

internal/record/record.go, line 248 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Rather than adding a new chunk type, I wonder if we can write a chunk that we know will be treated as io.EOF. It looks like if we write a chunk with logNum != r.logNum then we'd get this behavior. The small benefit to that is that we wouldn't be adding a new chunk type.

I gave it a try, using zero for the log number since we start file numbers at 1. Many of tests used a zero log number, so I updated them to use a non-zero one. This approach does feels a little less obvious, but it is nice that it doesn't require a change to the log reader.

petermattis

Reviewable status: 0 of 6 files reviewed, 2 unresolved discussions (waiting on @jbowens)

internal/record/log_writer.go, line 630 at r2 (raw file):

	binary.LittleEndian.PutUint16(b.buf[i+4:i+6], 0) // Size
	b.buf[i+6] = recyclableFullChunkType
	binary.LittleEndian.PutUint32(b.buf[i+7:i+11], 0) // Log number

I think using zero here might be problematic. The issue is that the log number field is 32-bits. The 64-bit log number is truncated to 32-bits in NewLogWriter:

		// NB: we truncate the 64-bit log number to 32-bits. This is ok because a)
		// we are very unlikely to reach a file number of 4 billion and b) the log
		// number is used as a validation check and using only the low 32-bits is
		// sufficient for that purpose.

While we are very unlikely to ever reach a log number of 4 billion, it isn't impossible if there is some sort of tight loop of filenum allocations due to some error condition. Rather than using zero, can we use w.logNum+1 or ^w.logNum. I think any value that isn't w.logNum works.

internal/record/record.go, line 248 at r1 (raw file):

Previously, jbowens (Jackson Owens) wrote…

I gave it a try, using zero for the log number since we start file numbers at 1. Many of tests used a zero log number, so I updated them to use a non-zero one. This approach does feels a little less obvious, but it is nice that it doesn't require a change to the log reader.

I think using zero might be problematic. See my other comment for why and an alternative. The alternative might also remove the need to change the tests.

I think we might want to add a test for using the log num 1 << 32 so we explicitly test the case I'm worried about.

While closing a LogWriter, write a special EOF record with an incremented log number signifying the end of the log. This EOF record will overwrite trailing garbage from the recycled log file, allowing us to detect corruption errors in any logs earlier than the most recent one. In Open return an error if any WAL other than the most recent one encounters an invalid chunk.

jbowens

Reviewable status: 0 of 4 files reviewed, 2 unresolved discussions (waiting on @petermattis)

internal/record/log_writer.go, line 630 at r2 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I think using zero here might be problematic. The issue is that the log number field is 32-bits. The 64-bit log number is truncated to 32-bits in NewLogWriter:
		// NB: we truncate the 64-bit log number to 32-bits. This is ok because a)
		// we are very unlikely to reach a file number of 4 billion and b) the log
		// number is used as a validation check and using only the low 32-bits is
		// sufficient for that purpose.
While we are very unlikely to ever reach a log number of 4 billion, it isn't impossible if there is some sort of tight loop of filenum allocations due to some error condition. Rather than using zero, can we use w.logNum+1 or ^w.logNum. I think any value that isn't w.logNum works.

Thanks for catching that! Fixed.

internal/record/record.go, line 248 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I think using zero might be problematic. See my other comment for why and an alternative. The alternative might also remove the need to change the tests.

I think we might want to add a test for using the log num 1 << 32 so we explicitly test the case I'm worried about.

Done.

petermattis

Reviewable status: 0 of 4 files reviewed, all discussions resolved

jbowens · 2020-09-03T18:38:27Z

TFTR!

In cockroachdb#871, Open started interpreting any WAL older than the most recent one more strictly, erroring if it encountered any invalid chunks while reading the log. This was enabled by a new EOF trailer chunk that's written when a WAL is closed, ensuring that previous WALs end cleanly and unambiguously. However, during a WAL rotation, a crash before the previous WAL was cleanly closed could leave the previous WAL with recycled garbage at its tail. This change updates WAL rotation to close the previous WAL first, then create/rename the next WAL.

In #871, Open started interpreting any WAL older than the most recent one more strictly, erroring if it encountered any invalid chunks while reading the log. This was enabled by a new EOF trailer chunk that's written when a WAL is closed, ensuring that previous WALs end cleanly and unambiguously. However, during a WAL rotation, a crash before the previous WAL was cleanly closed could leave the previous WAL with recycled garbage at its tail. This change updates WAL rotation to close the previous WAL first, then create/rename the next WAL.

In cockroachdb#871, Open started interpreting any WAL older than the most recent one more strictly, erroring if it encountered any invalid chunks while reading the log. This was enabled by a new EOF trailer chunk that's written when a WAL is closed, ensuring that previous WALs end cleanly and unambiguously. However, during a WAL rotation, a crash before the previous WAL was cleanly closed could leave the previous WAL with recycled garbage at its tail. This change updates WAL rotation to close the previous WAL first, then create/rename the next WAL.

In #871, Open started interpreting any WAL older than the most recent one more strictly, erroring if it encountered any invalid chunks while reading the log. This was enabled by a new EOF trailer chunk that's written when a WAL is closed, ensuring that previous WALs end cleanly and unambiguously. However, during a WAL rotation, a crash before the previous WAL was cleanly closed could leave the previous WAL with recycled garbage at its tail. This change updates WAL rotation to close the previous WAL first, then create/rename the next WAL.

petermattis reviewed Sep 1, 2020

View reviewed changes

jbowens force-pushed the jackson/eof-trailer branch from eb829fd to b7e7207 Compare September 1, 2020 19:55

jbowens commented Sep 1, 2020

View reviewed changes

petermattis reviewed Sep 2, 2020

View reviewed changes

jbowens force-pushed the jackson/eof-trailer branch from b7e7207 to c748598 Compare September 2, 2020 15:42

jbowens force-pushed the jackson/eof-trailer branch from c748598 to 59020cb Compare September 2, 2020 15:44

jbowens commented Sep 2, 2020

View reviewed changes

petermattis approved these changes Sep 3, 2020

View reviewed changes

jbowens merged commit c29cc16 into cockroachdb:master Sep 3, 2020

jbowens deleted the jackson/eof-trailer branch September 3, 2020 18:38

jbowens mentioned this pull request Sep 9, 2020

db: close previous WAL before linking next WAL #886

Merged

jbowens mentioned this pull request Sep 29, 2022

WAL corruptions are silently ignored #453

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

internal/record: write an EOF trailer on LogWriter Close #871

internal/record: write an EOF trailer on LogWriter Close #871

jbowens commented Sep 1, 2020 •

edited

Loading

cockroach-teamcity commented Sep 1, 2020

petermattis left a comment

jbowens left a comment

petermattis left a comment

jbowens left a comment

petermattis left a comment

jbowens commented Sep 3, 2020

internal/record: write an EOF trailer on LogWriter Close #871

internal/record: write an EOF trailer on LogWriter Close #871

Conversation

jbowens commented Sep 1, 2020 • edited Loading

cockroach-teamcity commented Sep 1, 2020

petermattis left a comment

Choose a reason for hiding this comment

jbowens left a comment

Choose a reason for hiding this comment

petermattis left a comment

Choose a reason for hiding this comment

jbowens left a comment

Choose a reason for hiding this comment

petermattis left a comment

Choose a reason for hiding this comment

jbowens commented Sep 3, 2020

jbowens commented Sep 1, 2020 •

edited

Loading