sql: use row-level TTL for rangelog, eventlog and web_sessions #89453

aadityasondhi · 2022-10-05T23:06:32Z

The system.web_session table is periodically purged. This is handled in purge_auth_seession.go and is controlled using cluster settings. An alternative (possibly better) would be to have the table configured with row-level ttl and have that manage these web_sessions.

Things to consider

handling deleteOldExpiredSessionsStmt vs deleteOldRevokedSessionsStmt vs deleteSessionsAutoLogoutStmt
we want the TTL to be configurable by the user, without a schema change on the table. This means possibly that the TTL expression should be able to refer to a cluster setting.

related: #88766

Jira issue: CRDB-20259

Epic CRDB-21265

The text was updated successfully, but these errors were encountered:

rafiss · 2022-10-06T02:48:45Z

It would probably benefit from doing a bit of up-front investigation, since there might be some complexity with getting the jobs framework to operate on a system table.

Prior to this patch, we had two problems: - the code to GC system.eventlog was not running on secondary tenants. This was the original motivation for the work in this commit. - there were 2 separate implementations of GC, one for system.eventlog/rangelog and another one for system.web_sessions. This was unfortunate because the first one is generic and was meant to be reused. Its implementation is just better all around: it is actually able to remove all rows when there are more than 1000 entries to delete at a time (this was a known limitation of the web_session cleanup code), and is optimized to avoid hitting tombstones when iterating. This commit improves the situation by using the common, better implementation for both cases, and enabling it for secondary tenants. We acknowledge that with the advent of row-level TTL in SQL, all this would be better served by a conversion of these tables to use row-level TTL. However, the fix here (run GC in secondary tenants) needs to be backported, so we are shy to rely on the row-level TTL which didn't work as well in previous versions. The remainder of this work is described in cockroachdb#89453. Release note (ops change): The cluster settings `server.web_session.purge.period` and `server.web_session.purge.max_deletions_per_cycle`, which were specific to the cleanup function for `system.web_sessions`, have been replaced by `server.log_gc.period` and `server.log_gc.max_deletions_per_cycle` which apply to the cleanup function for `system.eventlog`, `system.rangelog` and `system.web_sessions` equally. Release note (ops change): The cluster setting `server.web_session.auto_logout.timeout` has been removed. It had never been effective.

90789: server: use a common logic for system log gc r=dhartunian a=knz Fixes #90521. Subsumes #90523. Subsumes #88766. Informs #89453. Informs #90790. Prior to this patch, we had two problems: - the code to GC system.eventlog was not running on secondary tenants. This was the original motivation for the work in this commit. - there were 2 separate implementations of GC, one for system.eventlog/rangelog and another one for system.web_sessions. This was unfortunate because the first one is generic and was meant to be reused. Its implementation is just better all around: it is actually able to remove all rows when there are more than 1000 entries to delete at a time (this was a known limitation of the web_session cleanup code), and is optimized to avoid hitting tombstones when iterating. This commit improves the situation by using the common, better implementation for both cases. We acknowledge that with the advent of row-level TTL in SQL, all this would be better served by a conversion of these tables to use row-level TTL. However, the fix here (run GC in secondary tenants) needs to be backported, so we are shy to rely on the row-level TTL which didn't work as well in previous versions. The remainder of this work is described in #89453. Release note (ops change): The cluster settings `server.web_session.purge.period` and `server.web_session.purge.max_deletions_per_cycle`, which were specific to the cleanup function for `system.web_sessions`, have been replaced by `server.log_gc.period` and `server.log_gc.max_deletions_per_cycle` which apply to the cleanup function for `system.eventlog`, `system.rangelog` and `system.web_sessions` equally. Release note (ops change): The cluster setting `server.web_session.auto_logout.timeout` has been removed. It had never been effective. Co-authored-by: Raphael 'kena' Poss <[email protected]>

aadityasondhi added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-observability-inf labels Oct 5, 2022

blathers-crl bot added the A-observability-inf label Oct 5, 2022

aadityasondhi mentioned this issue Oct 5, 2022

server: fix bottlenecks when purging system.web_sessions #88766

Merged

rafiss mentioned this issue Oct 24, 2022

server: start purging web_sessions in secondary tenants #90523

Merged

aadityasondhi mentioned this issue Oct 27, 2022

sql: use row-level TTL for rangelog, eventlog and web_sessions #90787

Closed

knz changed the title ~~server: consider row-level ttl for system.web_sessions table~~ sql: use row-level TTL for rangelog, eventlog and web_sessions Oct 27, 2022

knz mentioned this issue Oct 27, 2022

server: use a common logic for system log gc #90789

Merged

exalate-issue-sync bot added T-observability and removed T-observability-inf labels Mar 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: use row-level TTL for rangelog, eventlog and web_sessions #89453

sql: use row-level TTL for rangelog, eventlog and web_sessions #89453

aadityasondhi commented Oct 5, 2022 •

edited by exalate-issue-sync bot

Loading

rafiss commented Oct 6, 2022

sql: use row-level TTL for rangelog, eventlog and web_sessions #89453

sql: use row-level TTL for rangelog, eventlog and web_sessions #89453

Comments

aadityasondhi commented Oct 5, 2022 • edited by exalate-issue-sync bot Loading

rafiss commented Oct 6, 2022

aadityasondhi commented Oct 5, 2022 •

edited by exalate-issue-sync bot

Loading