sql: add SHOW FINGERPRINT TABLE with exclude_columns option #129497

msbutler · 2024-08-22T18:03:19Z

To fingerprint LDR tables, we need to exclude the hidden mvcc_origin_timestamp column. To do so, we ought to add a new option to the sql level fingerprint command that will exclude user provided columns from the fingerprint calculation. Essentially, we need to hydrate the ignoredColumns in the BuildFingerprintQueryForIndex function call.

We should also get started with calling BuildFingerprintQueryForIndex in the LDR roachtests as well. the fingerprints should match once the workload has stopped on both clusters, the LDR checkpoints are past time.now, and there's nothing in the DLQ.

Jira issue: CRDB-41568

The text was updated successfully, but these errors were encountered:

blathers-crl · 2024-08-22T18:03:20Z

cc @cockroachdb/disaster-recovery

…NGERPRINT Currently tables used for logical data replication can't be fingerprinted as they contain a column crdb_replication_origin_timestamp which will vary based on the internal MVCC timestamp. Adding an option to SHOW EXPERIMENTAL FINGERPRINT will allow LDR tables to be fingerprinted by excluding the replication column. A follow up will use this to verify tables in roachtests Release note: none Fixes: cockroachdb#129497

129766: logical: Add the EXCLUDE COLUMNS option to SHOW EXPERIMENTAL_FI… r=msbutler a=navsetlur …NGERPRINT Currently tables used for logical data replication can't be fingerprinted as they contain a column crdb_replication_origin_timestamp which will vary based on the internal MVCC timestamp. Adding an option to SHOW EXPERIMENTAL FINGERPRINT will allow LDR tables to be fingerprinted by excluding the replication column. A follow up will use this to verify tables in LDR roachtests Release note: none Fixes: #129497 129844: roachtest: remove COCKROACH_DISABLE_LEADER_FOLLOWS_LEASEHOLDER from `failover` tests r=nvanbenschoten a=nvanbenschoten Fixes #125259. This commit removes the use of COCKROACH_DISABLE_LEADER_FOLLOWS_LEASEHOLDER from the `failover/partial/lease-leader` tests. Instead of using the cluster setting to disable leader follows leaseholder, the tests now creates the kind of situation where leader and leaseholder could be split indefinitely when using epoch-based leases. This is more realistic and allows us to test lease+leader failover with leader leases when the leaseholder gets partitioned from its followers but remains connected to the liveness range. I confirmed that by default, both the epoch-based and expiration-based lease variants of this test still fail at 60s due to stuck requests. This is what we see in roachperf on master as well: - https://roachperf.crdb.dev/?filter=leader&view=failover%2Fpartial%2Flease-leader&tab=gce - https://roachperf.crdb.dev/?filter=leader&view=failover%2Fpartial%2Flease-leader%2Flease%3Dexpiration&tab=gce However, notice that the expiration-based variant temporarily improved when we enabled DistSender circuit breakers, which add a timeout to all KV requests. This is because the expiration-based lease setup can actually recover range availability, we just don't currently unwedge requests that were stuck before the recovery. This is of arguable importance, as most real-world users run with application side timeouts. Regardless, I see the same improvement if I re-enable DistSender circuit breakers. With them, `failover/partial/lease-leader` stays at 60s (infinite) recovery time while `failover/partial/lease-leader/lease=expiration` drops to recovering in ~20s. Release note: None Co-authored-by: Naveen Setlur <[email protected]> Co-authored-by: Nathan VanBenschoten <[email protected]>

msbutler added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-disaster-recovery labels Aug 22, 2024

msbutler assigned navsetlur Aug 22, 2024

blathers-crl bot added the A-disaster-recovery label Aug 22, 2024

exalate-issue-sync bot added the P-2 Issues/test failures with a fix SLA of 3 months label Aug 26, 2024

github-project-automation bot added this to Disaster Recovery Backlog Aug 28, 2024

github-project-automation bot moved this to Backlog in Disaster Recovery Backlog Aug 28, 2024

navsetlur mentioned this issue Aug 28, 2024

logical: Add the EXCLUDE COLUMNS option to SHOW EXPERIMENTAL_FI… #129766

Merged

craig bot closed this as completed in 84f5cec Aug 29, 2024

github-project-automation bot moved this from Backlog to Done in Disaster Recovery Backlog Aug 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: add SHOW FINGERPRINT TABLE with exclude_columns option #129497

sql: add SHOW FINGERPRINT TABLE with exclude_columns option #129497

msbutler commented Aug 22, 2024 •

edited

Loading

blathers-crl bot commented Aug 22, 2024

sql: add SHOW FINGERPRINT TABLE with exclude_columns option #129497

sql: add SHOW FINGERPRINT TABLE with exclude_columns option #129497

Comments

msbutler commented Aug 22, 2024 • edited Loading

blathers-crl bot commented Aug 22, 2024

msbutler commented Aug 22, 2024 •

edited

Loading