
changefeedccl: Emit span resolved event when end time reached #109439

Merged
5 commits merged into cockroachdb:master on Sep 11, 2023

Conversation

miretskiy (Contributor)

Changefeed supports a mode where the user wants to emit
all events that occurred since some time in the past (`cursor`),
and end the changefeed at a time in the near future (`end_time`).

In this mode, the rangefeed catchup scan starting from the `cursor`
position could take some time -- maybe even a lot of time --
and in that case, the very first checkpoint the kvfeed observes
will be after `end_time`. All events after `end_time`, including
checkpoints, are skipped, as they should be.

However, this meant that no changefeed checkpoint
records could be produced until the entire changefeed completed.
This PR ensures that once `end_time` is reached, we
emit one resolved event for that span, so that the changefeed
can produce a span-based checkpoint if needed.

Fixes #108464

Release note: None
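
As a rough sketch of the idea (all names below are illustrative rather than the actual kvfeed API, and the usual context, roachpb, and hlc imports are assumed; only the exclusive end_time semantics come from this PR):

	// eventSink is a hypothetical stand-in for the kvfeed event buffer.
	type eventSink interface {
		EmitResolved(ctx context.Context, span roachpb.Span, ts hlc.Timestamp) error
	}

	// maybeEmitFinalResolved sketches the fix: once a span's checkpoint
	// reaches end_time, emit one final resolved event at endTime.Prev()
	// so the changefeed frontier can advance and a span-based checkpoint
	// can be produced before the whole feed completes.
	func maybeEmitFinalResolved(
		ctx context.Context, sink eventSink, span roachpb.Span,
		checkpointTS, endTime hlc.Timestamp,
	) error {
		if endTime.IsEmpty() || checkpointTS.Less(endTime) {
			return nil // no end_time set, or not reached yet
		}
		// end_time is exclusive, so the final resolved timestamp is Prev().
		return sink.EmitResolved(ctx, span, endTime.Prev())
	}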

@miretskiy requested reviews from several teams as code owners on August 24, 2023 17:07
@miretskiy requested reviews from herkolategan, srosenberg, and Santamaura, and removed the request for a team, on August 24, 2023 17:07
@cockroach-teamcity (Member)

This change is Reviewable

@miretskiy (Contributor, Author)

@jayshrivastava -- ignore the first commit; it's being reviewed at #109346

@miretskiy requested a review from erikgrinaker and removed the request for a team, herkolategan, srosenberg, and Santamaura on August 24, 2023 17:08
@jayshrivastava (Contributor) left a comment

Nice!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker and @miretskiy)


-- commits line 40 at r3:
typo: improve


pkg/ccl/changefeedccl/kvevent/event.go line 100 at r3 (raw file):

	// number of event types.
	numEventTypes = 3

I think we should do TypeResolved + 1
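
Sketched out (names approximate the kvevent package; the exact enum may differ), the suggestion is to derive the count from the last enum value so it stays in sync when new event types are appended, unlike a hard-coded 3:

	type Type int

	const (
		// TypeUnknown is the zero value.
		TypeUnknown Type = iota
		TypeFlush
		TypeKV
		TypeResolved

		// Correct as long as TypeResolved remains the last value.
		numEventTypes = int(TypeResolved) + 1
	)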


pkg/ccl/changefeedccl/kvevent/event.go line 131 at r3 (raw file):

		return int(TypeKV)
	default:
		return int(TypeResolved)

Do you think we could add a case for Unknown? The blocking buffer should never see an event of type unknown because the kvfeed will check before adding, but maybe someone in the future will call this without doing the check first.
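
A minimal sketch of the defensive variant being asked for, reusing the Type constants sketched above (illustrative; log.Warningf is the stock cockroach logging call, everything else is assumed):

	func typeIndex(ctx context.Context, t Type) int {
		switch t {
		case TypeFlush, TypeKV, TypeResolved:
			return int(t)
		default:
			// The kvfeed validates events before adding them to the buffer,
			// so this branch should be unreachable; warn rather than
			// silently counting the event under TypeResolved.
			log.Warningf(ctx, "unexpected event type %v", t)
			return int(TypeUnknown)
		}
	}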


pkg/ccl/changefeedccl/changefeed_test.go line 6873 at r4 (raw file):

		// Don't care much about the values emitted (tested elsewhere) -- all
		// we want to make sure is that the feed terminates.

We should check that we get resolved timestamps at the boundary when end_time is reached. We can use the `FilterSpanWithMutation` testing knob.


pkg/ccl/changefeedccl/kvevent/blocking_buffer_test.go line 180 at r3 (raw file):

	require.NoError(t, buf.CloseWithReason(context.Background(), context.Canceled))

	require.EqualValues(t, 0, metrics.AllocatedMem.Value())

I think we should assert that the numKV, numResolved, and numFlush metrics are reset too. You probably need to store in-flight counters to do so.

@miretskiy (Contributor, Author) left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker and @jayshrivastava)


pkg/ccl/changefeedccl/kvevent/event.go line 100 at r3 (raw file):

Previously, jayshrivastava (Jayant) wrote…

I think we should do TypeResolved + 1

Done.


pkg/ccl/changefeedccl/kvevent/event.go line 131 at r3 (raw file):

Previously, jayshrivastava (Jayant) wrote…

Do you think we could add a case for Unknown? The blocking buffer should never see an event of type unknown because the kvfeed will check before adding, but maybe someone in the future will call this without doing the check first.

Made it more explicit, and added a log warning in case of an unknown event.


pkg/ccl/changefeedccl/changefeed_test.go line 6873 at r4 (raw file):

Previously, jayshrivastava (Jayant) wrote…

We should check that we get resolved timestamps at the boundary when end_time is reached. We can use the `FilterSpanWithMutation` testing knob.

Good idea -- done.


pkg/ccl/changefeedccl/kvevent/blocking_buffer_test.go line 180 at r3 (raw file):

Previously, jayshrivastava (Jayant) wrote…

I think we should assert that the numKV, numResolved, and numFlush metrics are reset too. You probably need to store in-flight counters to do so.

I'll pass for now. I tried that. First, numKV/flush/resolved are counters, not gauges.
And even the things that are counters (like the Acquired/Released counters) are updated only when
release is called. I realized that release is sometimes not called -- for example, when the buffer is closed without releasing all buffered events.
Plus, there is a fair bit of variability in how these flush events are generated.

The point is, this test was becoming more and more elaborate, and getting all the metrics correctly reset was proving to be more trouble than it was worth.
Instead, I focused on the allocated memory -- that one is guaranteed to be reset because, whether or not
I release events, the memory is returned when the quota is closed.

I suspect we'll have to revisit this, since I'm not 100% convinced that we're not generating an excessive number of flush requests.

@jayshrivastava (Contributor) left a comment

LGTM. Thank you for working on this.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker and @miretskiy)


pkg/ccl/changefeedccl/changefeed_test.go line 6920 at r5 (raw file):

		// After changefeed completes, verify we have seen all ranges emit resolved
		// event with end_time timestamp.  That is: verify frontier.Frontier() is at end_time.
		expectedFrontier := endTime.Prev()

For my own understanding, can you explain why it's Prev()?


pkg/ccl/changefeedccl/kvevent/blocking_buffer_test.go line 180 at r3 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

I'll pass for now. I tried that. First, numKV/flush/resolved are counters, not gauges.
And even the things that are counters (like the Acquired/Released counters) are updated only when
release is called. I realized that release is sometimes not called -- for example, when the buffer is closed without releasing all buffered events.
Plus, there is a fair bit of variability in how these flush events are generated.

The point is, this test was becoming more and more elaborate, and getting all the metrics correctly reset was proving to be more trouble than it was worth.
Instead, I focused on the allocated memory -- that one is guaranteed to be reset because, whether or not
I release events, the memory is returned when the quota is closed.

I suspect we'll have to revisit this, since I'm not 100% convinced that we're not generating an excessive number of flush requests.

Ack!

@miretskiy (Contributor, Author)

For my own understanding, can you explain why it's Prev()?

We have always treated end_time as exclusive. So, when the end_time boundary is reached, we emit the "EXIT" span boundary at boundary.Prev().
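
A worked example of what Prev() does to an hlc.Timestamp (assuming the standard hlc semantics: decrement the logical component, or step back one wall tick at logical zero):

	endTime := hlc.Timestamp{WallTime: 100, Logical: 0}
	prev := endTime.Prev()
	// prev == hlc.Timestamp{WallTime: 99, Logical: math.MaxInt32}
	// Every event strictly below endTime is at or below prev, so resolving
	// a span at prev means that span is complete for this changefeed.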

@erikgrinaker (Contributor) left a comment

Not too familiar with the CDC parts, so only skimmed this, but overall LGTM.

Reviewed 3 of 8 files at r1, 1 of 1 files at r2, 4 of 5 files at r3, 7 of 9 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava and @miretskiy)


pkg/cmd/roachtest/tests/cdc_bench.go line 201 at r5 (raw file):

	// (and release allocations), resulting in significant drop in completed ranges throughput.
	settings.ClusterSettings["kv.rangefeed.closed_timestamp_refresh_interval"] = "5s"
	settings.ClusterSettings["changefeed.memory.per_changefeed_limit"] = "4G"

Shouldn't we benchmark out-of-the-box performance, since that's what users will typically experience? If the defaults are bad, then we should fix the defaults, not fudge the benchmark.

If we really need this, I'd argue it should be a separate variant, so that we benchmark both default performance and tuned performance -- it's important that we have a realistic view of what the real-world performance is.


pkg/cmd/roachtest/tests/cdc_bench.go line 215 at r5 (raw file):

	// due to elevated goroutine scheduling latency.
	// Default is 2s; make it 3x.
	settings.Env = append(settings.Env, "COCKROACH_NETWORK_TIMEOUT=6s")

Fixed on master, see #109578.

@miretskiy (Contributor, Author)

miretskiy commented Sep 7, 2023

Shouldn't we benchmark out-of-the-box performance, since that's what users will typically experience? If the defaults are bad, then we should fix the defaults, not fudge the benchmark.

If we really need this, I'd argue it should be a separate variant, so that we benchmark both default performance and tuned performance -- it's important that we have a realistic view of what the real-world performance is.

Good points, @erikgrinaker. I will test whether this benchmark passes with the new closed timestamp
(3s vs. 200ms before). I will, however, keep the other setting. The argument here is that this is a
large-scale changefeed, so the fan-in is just pretty crazy, which is why you may want to bump up the
defaults -- and some of our clients do that.

	// due to elevated goroutine scheduling latency.
	// Default is 2s; make it 3x.
	settings.Env = append(settings.Env, "COCKROACH_NETWORK_TIMEOUT=6s")

Fixed on master, see #109578.

Nice to know; I'm going to see if 4s is sufficient.
I decided to keep the line and the comment for posterity, in case we see this benchmark flaking.

@miretskiy (Contributor, Author)

@erikgrinaker -- verified the benchmark still passes once I removed some of the settings you pointed out (a timeout of 4s is fine, and the 3s closed timestamp is also fine).

@erikgrinaker (Contributor) left a comment

The typical user isn't going to change any of these settings unless we tell them to. I think we should focus on the default case, and make the default case better if necessary. We can have separate tuned benchmark variants if we want to look at their effect. Your call though, these are CDC's benchmarks.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava and @miretskiy)

Yevgeniy Miretskiy added 5 commits September 8, 2023 16:00
Fix data race in management of `InCatchup` range
information.

Epic: None

Release note: None
Use `quotapool.Inf()` as the throttle limit
instead of max int64 to ensure that when the throttler
is disabled, attempts to acquire quota are roughly free.

Epic: None

Release note: None
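
A hedged sketch of the difference, assuming the pkg/util/quotapool rate-limiter API (the limiter name and surrounding usage here are illustrative):

	// With a finite limit -- even math.MaxInt64 -- every acquisition still
	// pays for token bookkeeping. Per this commit, an infinite limit makes
	// acquiring quota from a disabled throttler roughly free.
	limiter := quotapool.NewRateLimiter(
		"changefeed-throttle", // illustrative name
		quotapool.Inf(),       // throttler disabled
		0,                     // burst; moot at an infinite limit
	)
	if err := limiter.WaitN(ctx, 1); err != nil { // returns ~immediately
		return err
	}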
Improve blocking buffer observability by adding:
  * `changefeed.buffer_entries.allocated_memory` -- a gauge tracking
    the memory currently allocated for events added to the blocking buffer.
  * `changefeed.buffer_entries.<event_type>` -- counters tracking
    the number of events of each type (flush, KV, resolved) added.

Informs cockroachdb#108464

Release note: None
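
Sketching the gauge half with the pkg/util/metric conventions (only the metric name comes from this commit; the metadata fields, allocBytes, and call sites are assumptions):

	metaAllocatedMem := metric.Metadata{
		Name:        "changefeed.buffer_entries.allocated_memory",
		Help:        "Memory currently allocated for events in the blocking buffer",
		Measurement: "Bytes",
		Unit:        metric.Unit_BYTES,
	}
	allocatedMem := metric.NewGauge(metaAllocatedMem)
	allocatedMem.Inc(allocBytes) // when an event is added to the buffer
	allocatedMem.Dec(allocBytes) // when its allocation is released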
Changefeed supports a mode where the user wants to emit
all events that occurred since some time in the past (`cursor`),
and end the changefeed at a time in the near future (`end_time`).

In this mode, the rangefeed catchup scan starting from the `cursor`
position could take some time -- maybe even a lot of time --
and in that case, the very first checkpoint the kvfeed observes
will be after `end_time`.  All events after `end_time`, including
checkpoints, are skipped, as they should be.

However, this meant that no changefeed checkpoint
records could be produced until the entire changefeed completed.
This PR ensures that once `end_time` is reached, we
emit one resolved event for that span, so that the changefeed
can produce a span-based checkpoint if needed.

Fixes cockroachdb#108464

Release note: None
Re-enable regular rangefeed catchup benchmark over 100k ranges.
Adjust cdc bench configuration to ensure the benchmark completes
in reasonable time.

Fixes cockroachdb#108157

Release note: None
@miretskiy (Contributor, Author)

bors r+

@craig (Contributor)

craig bot commented Sep 11, 2023

Build succeeded:

@craig craig bot merged commit 0d073f4 into cockroachdb:master Sep 11, 2023
Successfully merging this pull request may close these issues.

cdc: Lagging span checkpoint with end_time option might be broken