server: use a common logic for system log gc #90789

Merged: 1 commit into cockroachdb:master on Nov 17, 2022

Conversation

@knz (Contributor) commented Oct 27, 2022

Fixes #90521.
Subsumes #90523.
Subsumes #88766.
Informs #89453.
Informs #90790.

Prior to this patch, we had two problems:

  • the code to GC system.eventlog was not running on secondary tenants.

    This was the original motivation for the work in this commit.

  • there were 2 separate implementations of GC, one for system.eventlog/rangelog and another one for system.web_sessions.

    This was unfortunate because the first one is generic and was meant to be reused. Its implementation is just better all around: it is actually able to remove all rows when there are more than 1000 entries to delete at a time (this was a known limitation of the web_session cleanup code), and is optimized to avoid hitting tombstones when iterating.

This commit improves the situation by using the common, better implementation for both cases.

We acknowledge that with the advent of row-level TTL in SQL, all this would be better served by a conversion of these tables to use row-level TTL. However, the fix here (run GC in secondary tenants) needs to be backported, so we are shy to rely on the row-level TTL which didn't work as well in previous versions. The remainder of this work is described in #89453.
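
For context, the row-level TTL conversion deferred to #89453 would look roughly like the following. This is a hypothetical sketch, not part of this PR: the table and interval are placeholders, and the system tables may need additional work before they can accept such a statement.

```sql
-- Hypothetical illustration only (deferred to #89453, not done in this PR):
-- let the storage layer expire rows instead of running a GC loop.
ALTER TABLE system.web_sessions SET (ttl_expire_after = '7 days');
```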

Release note (ops change): The cluster settings server.web_session.purge.period and server.web_session.purge.max_deletions_per_cycle, which were specific to the cleanup function for system.web_sessions, have been replaced by server.log_gc.period and server.log_gc.max_deletions_per_cycle which apply to the cleanup function for system.eventlog, system.rangelog and system.web_sessions equally.

Release note (ops change): The cluster setting server.web_session.auto_logout.timeout has been removed. It had never been effective.
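
For operators who had customized the old web_sessions settings, re-applying an equivalent configuration after upgrading would look roughly like this. The values are illustrative placeholders, not recommended defaults.

```sql
-- Illustrative only: re-apply a custom purge cadence via the new shared
-- settings; the old server.web_session.purge.* settings no longer have
-- any effect.
SET CLUSTER SETTING server.log_gc.period = '1h';
SET CLUSTER SETTING server.log_gc.max_deletions_per_cycle = 10000;
```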

@knz knz requested review from a team as code owners October 27, 2022 18:45
@knz knz requested a review from a team October 27, 2022 18:45
@knz knz requested review from a team as code owners October 27, 2022 18:45
@knz knz requested review from herkolategan and renatolabs and removed request for a team October 27, 2022 18:45
@cockroach-teamcity (Member)

This change is Reviewable

@knz knz changed the title server: use a commong logic for system log gc server: use a common logic for system log gc Oct 27, 2022
@knz (Contributor, Author) commented Oct 27, 2022

cc @rafiss

@knz knz force-pushed the 20221027-gc branch 3 times, most recently from a3e6360 to 327faf6 Compare October 27, 2022 19:12

Commit message:
Prior to this patch, we had two problems:

- the code to GC system.eventlog was not running on secondary tenants.

  This was the original motivation for the work in this commit.

- there were 2 separate implementations of GC, one for
  system.eventlog/rangelog and another one for system.web_sessions.

  This was unfortunate because the first one is generic and was meant
  to be reused. Its implementation is just better all around: it is
  actually able to remove all rows when there are more than 1000
  entries to delete at a time (this was a known limitation of the
  web_session cleanup code), and is optimized to avoid hitting
  tombstones when iterating.

This commit improves the situation by using the common, better
implementation for both cases, and enabling it for secondary tenants.

We acknowledge that with the advent of row-level TTL in SQL, all
this would be better served by a conversion of these tables to use
row-level TTL. However, the fix here (run GC in secondary tenants)
needs to be backported, so we are shy to rely on the row-level TTL
which didn't work as well in previous versions. The remainder of this
work is described in cockroachdb#89453.

Release note (ops change): The cluster settings
`server.web_session.purge.period` and
`server.web_session.purge.max_deletions_per_cycle`, which were
specific to the cleanup function for `system.web_sessions`, have been
replaced by `server.log_gc.period` and
`server.log_gc.max_deletions_per_cycle` which apply to the cleanup
function for `system.eventlog`, `system.rangelog` and
`system.web_sessions` equally.

Release note (ops change): The cluster setting
`server.web_session.auto_logout.timeout` has been removed. It had
never been effective.
@dhartunian (Collaborator) left a comment

Reviewed 6 of 11 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi, @abarganier, @herkolategan, @knz, and @renatolabs)


docs/generated/settings/settings.html line 81 at r2 (raw file):

<tr><td><code>server.consistency_check.max_rate</code></td><td>byte size</td><td><code>8.0 MiB</code></td><td>the rate limit (bytes/sec) to use for consistency checks; used in conjunction with server.consistency_check.interval to control the frequency of consistency checks. Note that setting this too high can negatively impact performance.</td></tr>
<tr><td><code>server.eventlog.enabled</code></td><td>boolean</td><td><code>true</code></td><td>if set, logged notable events are also stored in the table system.eventlog</td></tr>
<tr><td><code>server.eventlog.ttl</code></td><td>duration</td><td><code>2160h0m0s</code></td><td>if nonzero, entries in system.eventlog older than this duration are deleted every 10m0s. Should not be lowered below 24 hours.</td></tr>

why is the 24h directive no longer relevant?


pkg/settings/registry.go line 151 at r2 (raw file):

	// removed as of 22.2.1
	"sql.ttl.default_range_concurrency":                {},
	"server.web_session.purge.period":                  {},

I'm a bit worried about doing this because we just had an incident that was resolved by a customer setting this to a higher number than default. I'd hate for them to upgrade and have their problem resurface.

What are customers with custom values for these supposed to do? Is there any way to copy the value from a deprecated cluster setting?

@knz (Contributor, Author) left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi, @abarganier, @dhartunian, @herkolategan, and @renatolabs)


docs/generated/settings/settings.html line 81 at r2 (raw file):

Previously, dhartunian (David Hartunian) wrote…

why is the 24h directive no longer relevant?

It was never relevant. The sentiment behind it is true of many other cluster settings: don't make things poll too frequently, or performance may suffer. Yet we don't put that advice everywhere.


pkg/settings/registry.go line 151 at r2 (raw file):

we just had an incident

Do you have a link? I'd like to see the specifics. Maybe we'll find that the structure of the general-purpose GC code would avoid the original issue.

After all, nobody has complained to us about their system.eventlog GC process in a very long while.

Is there any way to copy the value from a deprecated cluster setting?

Yes, there's a way. We've done it before. But I'd rather avoid it if possible - the mechanism we have to do that is currently deprecated and we haven't yet designed the replacement.

@dhartunian (Collaborator) left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi, @abarganier, @herkolategan, @knz, and @renatolabs)


pkg/settings/registry.go line 151 at r2 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

we just had an incident

Do you have a link? I'd like to see the specifics. Maybe we'll find that the structure of the general-purpose GC code would avoid the original issue.

After all, nobody has complained to us about their system.eventlog GC process in a very long while.

Is there any way to copy the value from a deprecated cluster setting?

Yes, there's a way. We've done it before. But I'd rather avoid it if possible - the mechanism we have to do that is currently deprecated and we haven't yet designed the replacement.

Here's the issue https://github.com/cockroachlabs/support/issues/1812

Seems like we could support the custom LIMIT value for web_sessions, but it's more difficult to support a different tempo just for the web_sessions table.
What do you think is best here? A customer can always have a situation where they insert faster than our defaults are culling.

@knz (Contributor, Author) left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi, @abarganier, @dhartunian, @herkolategan, and @renatolabs)


pkg/settings/registry.go line 151 at r2 (raw file):

Previously, dhartunian (David Hartunian) wrote…

Here's the issue https://github.com/cockroachlabs/support/issues/1812

Seems like we could support the custom LIMIT value for web_sessions, but it's more difficult to support a different tempo just for the web_sessions table.
What do you think is best here? A customer can always have a situation where they insert faster than our defaults are culling.

I have investigated that support issue and I'm pretty confident that the code in this PR is fully immune to the original problem.

Recall that the original problem was a combination of 3 factors:

  • the query was using ORDER BY random(). This was the source of the CPU disruption.
  • the GC was only processing 10 rows at a time.
  • the GC was only running once per use of the login endpoint or per purge period.

The last two points together were the reason why the GC was never able to "catch up".

The code in this PR is significantly different:

  • it does not use ORDER BY - it uses the index on the timestamp column instead.
  • it loops! It processes at most 1000 rows per batch, and it keeps looping to process as many batches as necessary to advance the GC cursor to the most recent entries (see the sketch below).

This last aspect in particular makes me believe we wouldn't even need to make the purge period / max deletions per cycle configurable. In fact, prior to this PR, they were not configurable for the GC of system.eventlog, which on large clusters also gets quite a lot of traffic.

What do you think?
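
For illustration, here is a minimal sketch of the batched, index-driven deletion pattern described above. The exact query in the PR may differ; the $1/$2 placeholders and the 1000-row batch size are assumptions drawn from this comment.

```sql
-- One batch of the GC loop: delete up to 1000 expired rows, scanning the
-- timestamp index forward from a cursor instead of using ORDER BY random().
DELETE FROM system.eventlog
WHERE timestamp >= $1   -- cursor: largest timestamp returned by the previous batch
  AND timestamp < $2    -- cutoff: now() minus the configured TTL
ORDER BY timestamp
LIMIT 1000
RETURNING timestamp;
-- The caller loops, advancing the cursor, until a batch returns fewer than
-- 1000 rows; the advancing lower bound keeps later batches from re-scanning
-- the tombstones left by earlier ones.
```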

@dhartunian (Collaborator) left a comment

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aadityasondhi, @abarganier, @herkolategan, @knz, and @renatolabs)


pkg/settings/registry.go line 151 at r2 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

I have investigated that support issue and I'm pretty confident that the code in this PR is fully immune to the original problem.

Recall that the original problem was a combination of 3 factors:

  • the query was using ORDER BY random(). This was the source of the CPU disruption.
  • the GC was only processing 10 rows at a time.
  • the GC was only running once per use of the login endpoint or per purge period.

The last two points together were the reason why the GC was never able to "catch up".

The code in this PR is significantly different:

  • it does not use ORDER BY - it uses the index on the timestamp column instead.
  • it loops! It processes at most 1000 rows per batch, and it keeps looping to process as many batches as necessary to advance the GC cursor to the most recent entries.

This last aspect in particular makes me believe we wouldn't even need to make the purge period / max deletions per cycle configurable. In fact, prior to this PR, they were not configurable for the GC of system.eventlog, which on large clusters also gets quite a lot of traffic.

What do you think?

SGTM. Thanks for the clarification. I forgot that the ORDER BY was likely the culprit of elevated CPU.

@knz (Contributor, Author) commented Nov 17, 2022

Cheers!

bors r=dhartunian

@craig (bot) commented Nov 17, 2022

Build succeeded:

@craig craig bot merged commit e8d0366 into cockroachdb:master Nov 17, 2022
@blathers-crl (bot) commented Nov 17, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from c7ce656 to blathers/backport-release-22.1-90789: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.1.x failed. See errors above.


error creating merge commit from c7ce656 to blathers/backport-release-22.2-90789: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Successfully merging this pull request may close these issues.

server: old data in system.eventlog is not deleted for secondary tenants