Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backupccl: write table stats in order in backup metadata sst #86806

Closed
msbutler opened this issue Aug 24, 2022 · 2 comments · Fixed by #94992
Closed

backupccl: write table stats in order in backup metadata sst #86806

msbutler opened this issue Aug 24, 2022 · 2 comments · Fixed by #94992
Assignees
Labels
A-disaster-recovery C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-disaster-recovery

Comments

@msbutler
Copy link
Collaborator

msbutler commented Aug 24, 2022

#86078 caused the BackupMetadataSST writer to write out of order kvs related to table stats, causing the pebble writer to error, as seen in #86289. One working theory for why this occurs:

Backup currently pulls the stats out of the stats cache and then constructs keys to write into the SST. After #86078, the cache now returns stats per-column stats; however, the keys backup generates look like they are just based on the table id and the stats id. Now, perhaps each key is no longer unique when you put the per-columns stat forecasts in the mix.

Jira issue: CRDB-18943

@msbutler msbutler added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Aug 24, 2022
@blathers-crl
Copy link

blathers-crl bot commented Aug 24, 2022

cc @cockroachdb/bulk-io

msbutler added a commit to msbutler/cockroach that referenced this issue Aug 24, 2022
This patch sets write_metadata_sst cluster setting to false in prep for the
22.2 branch cut, as there's additional worked required before this feature gets
used in production. Unit tests may continue to write the MetadataSST because of
a new MetamorphicTestBool.

Setting this to false will also stop the roachtest in cockroachdb#86289 from consistently
failing due to cockroachdb#86806.

Fixes cockroachdb#86289

Release note: none

Release justification: prevents using an experimental feature by default
msbutler added a commit to msbutler/cockroach that referenced this issue Aug 25, 2022
This patch sets write_metadata_sst cluster setting to false in prep for the
22.2 branch cut, as there's additional work required before this feature gets
used in production. Unit tests may continue to write the MetadataSST because of
a new MetamorphicTestBool.

Setting this to false will also stop the roachtest in cockroachdb#86289 from consistently
failing due to cockroachdb#86806.

Fixes cockroachdb#86289

Release note: none

Release justification: prevents using an experimental feature by default
craig bot pushed a commit that referenced this issue Aug 25, 2022
86608: batcheval: add latch key protecting range key stats update r=erikgrinaker a=aliher1911

Previously GC needed to get a read latch with max timestamp to
ensure that range tombstones are not modified during GC. This
is causing all writers to get stuck in queue while GC is validating
request and removing range tombstone.
This commit adds a dedicated latch key
LocalRangeRangeTombstoneStatsUpdateLockSuffix to address the problem.
All range tombstone writers obtain this read latch on top of the write
latches for the ranges they are interested to update.
GC on the other hand will obtain write latch on that key. This
approach allows point writers to proceed during GC, but will block new
range tombstones from being written. Non conflicting writes of range
tombstones could still proceed since their write latch ranges don't
overlap.

Release justification: this is a safe change as range tombstone
behaviour is not enabled yet and the change is needed to address
potential performance regressions.

Release note: None

86645: kvserver: log when raft send/recv queue fills up r=pavelkalinnikov a=tbg

Inspired by https://github.com/cockroachlabs/support/issues/1770.

If either the raft send or receive queue fills up, wide-spread outages
can occur as replication progress stalls. We have metrics that can
indicate this, but straightforward logging is also appropriate to direct
attention to the fact, which this commit achieves.

Touches #79755

Release justification: important logging improvement
Release note: None


86679: server,ui: show internal stats with new cluster setting r=maryliag a=maryliag

Previously, we were not showing internal results on
fingerprint options on SQL Activity.
A new cluster setting created `sql.stats.response.show_internal`
can be set to `true` and internal statistics will be
displayed on SQL Activity page.

Fixes #79547

https://www.loom.com/share/1b89ba99a7c247edadb5c8b0d127755c

Release justification: low risk, high benefit change
Release note (sql change): New cluster setting
`sql.stats.response.show_internal` with default value of `false`
can be set to true, to display information about internal stats
on SQL Activity page, with fingerprint option.

86748: storage: rename `MVCCRangeKeyStack.FirstAbove/Below` r=tbg a=erikgrinaker

This patch renames `FirstAbove/Below` to `FirstAtOrAbove/Below`, for
clarity.

Release justification: bug fixes and low-risk updates to new functionality

Release note: None

86809: backupccl: set kv.bulkio.write_metadata_sst.enabled to default false r=stevendanna a=msbutler

This patch sets write_metadata_sst cluster setting to false in prep for the
22.2 branch cut, as there's additional worked required before this feature gets
used in production.

Setting this to false will also stop the roachtest in #86289 from consistently
failing due to #86806.

Fixes #86289

Release note: none

Release justification: prevents using an experimental feature by default

Co-authored-by: Oleg Afanasyev <[email protected]>
Co-authored-by: Tobias Grieger <[email protected]>
Co-authored-by: Marylia Gutierrez <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
@ajwerner
Copy link
Contributor

This lead to a bors failure here.

msbutler added a commit to msbutler/cockroach that referenced this issue Dec 16, 2022
This test occasionally flakes due to cockroachdb#86806. To prevent the flakiness, this
patch manually sets the kv.bulkio.write_metadata_sst.enabled cluster setting to
false. When cockroachdb#86806  gets addressed, this patch should be reverted.

Epic: None

Release note: None
craig bot pushed a commit that referenced this issue Dec 19, 2022
93837: backupccl: deflake TestClusterRestoreFailCleanup r=stevendanna a=msbutler

This test occasionally flakes due to #86806. To prevent the flakiness, this patch manually sets the kv.bulkio.write_metadata_sst.enabled cluster setting to false. When #86806  gets addressed, this patch should be reverted.

Epic: None

Release note: None

93897: kvnemesis: ignore `SysBytes:10` MVCC stats discrepancy r=erikgrinaker a=erikgrinaker

Resolves #93890.
Touches #93896.
Touches #93312.
Touches #86542.

Epic: none
Release note: None

Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
tbg pushed a commit that referenced this issue Dec 19, 2022
This test occasionally flakes due to #86806. To prevent the flakiness, this
patch manually sets the kv.bulkio.write_metadata_sst.enabled cluster setting to
false. When #86806  gets addressed, this patch should be reverted.

Epic: None

Release note: None
stevendanna added a commit to stevendanna/cockroach that referenced this issue Jan 10, 2023
Statistics forecasts are not persisted and are reconstructed on demand
from the persisted statistics. As such, they don't need to be included
in a backup or restored.

This has the additional effect of solving a bug in which we would fail
to write statistics into our metadata SST if it included more than 1
forecast since the in-memory forecasts have a 0 statistic_id.

Fixes cockroachdb#86806

Epic: none

Release note: None
stevendanna added a commit to stevendanna/cockroach that referenced this issue Jan 11, 2023
Statistics forecasts are not persisted and are reconstructed on demand
from the persisted statistics. As such, they don't need to be included
in a backup or restored.

This has the additional effect of solving a bug in which we would fail
to write statistics into our metadata SST if it included more than 1
forecast since the in-memory forecasts have a 0 statistic_id.

Fixes cockroachdb#86806

Epic: none

Release note: None
stevendanna added a commit to stevendanna/cockroach that referenced this issue Jan 11, 2023
Statistics forecasts are not persisted and are reconstructed on demand
from the persisted statistics. As such, they don't need to be included
in a backup or restored.

This has the additional effect of solving a bug in which we would fail
to write statistics into our metadata SST if it included more than 1
forecast since the in-memory forecasts have a 0 statistic_id.

Fixes cockroachdb#86806

Epic: none

Release note: None
craig bot pushed a commit that referenced this issue Jan 13, 2023
94992: backupccl: do not backup or restore stat forecasts r=michae2 a=stevendanna

Statistics forecasts are not persisted and are reconstructed on demand from the persisted statistics. As such, they don't need to be included in a backup or restored.

This has the additional effect of solving a bug in which we would fail to write statistics into our metadata SST if it included more than 1 forecast since the in-memory forecasts have a 0 statistic_id.

Fixes #86806

Epic: none

Release note: None

Co-authored-by: Steven Danna <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-disaster-recovery C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-disaster-recovery
Projects
No open projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants