roachpb: investigate disabling roachpb.Value checksum computation #21435
Comments
I did some experimentation with this, both by simply disabling checksum computation and keeping the first 4 bytes in each value empty in nvanbenschoten/someChecksum and by removing the entire 4-byte prefix of each value entirely in nvanbenschoten/noChecksum. I then ran
While leaving the checksum field in On the other hand, removing the entire 4-byte checksum field from
This seems like a pretty big change and undoubtedly has some nasty subtleties, so I don't think there's much to do here for 2.0.
Heh, I was about to take a look at this. I think it makes sense to disable checksum computation completely, but leave the 4-byte checksum field blank. Note that the 0-value for checksum indicates "checksum uninitialized", so I don't think it will be problematic to turn on later. Also, we're still verifying the checksums when a
If this doesn't complicate turning checksum verification back on then I'm fine with disabling the checksum computation. As you mentioned, the only complication here would be if we created
The below Raft thing might be a deal breaker. I think you should try to hook into the below-raft protos test to really be sure we're never calling
A grep shows that the only call to
This also means that the change won't even require a new cluster version, since any uninitialized checksums are simply ignored.
We don't marshal
Yep.
Good point. Hooking into The first caller of The second caller of
…State key

Fixes cockroachdb#13392. Unblocks cockroachdb#21435.

This change creates a new replicated range-local key called the RangeAppliedState key. The key holds a combination of the Raft applied index, the lease applied index, and the MVCC stats. Each of these pieces of metadata is written on every single Raft application, so combining them into a single RocksDB key reduces the number of keys touched on each Raft application from 3 to 1.

The applied indices and the stats used to be stored separately in different keys. We now deem those keys to be "legacy" because they have been replaced by the range applied state key. However, in order to maintain compatibility with existing clusters without requiring a migration, we allow them to exist side-by-side with the new range applied state key. Because we need to support an upgrade path from these legacy keys to the single new key, we can continue to write to the legacy keys as long as the range applied state key does not yet exist. We take advantage of this in two ways:

1. The existence of these legacy keys, even if they are out of date, turns out to be useful in cases where we need to synthesize their up-to-date values (snapshots, stats computations, and consistency checks). The reason for this is that the legacy keys can serve as markers for where these legacy values need to be synthesized and injected during MVCC iteration. This is all handled by a new iterator type called MigrationIter, which translates new keys that may not exist on all replicas in a mixed-version cluster into their corresponding legacy representations.
2. The existence of the keys also helps keep the MVCCStats consistent across pre- and post-"range applied state" versions of Cockroach.

For these reasons, we still require that these legacy keys be written during the initial bootstrapping of each range. We then wait for the first Raft application to "upgrade" from the use of these keys to the use of the new range applied state key.

Preliminary benchmarking of this change confirms the expected performance improvement. When running `./kv --read-percent=0` locally I see a 5-8% improvement in throughput and a 4-6% reduction in avg latency.

The change is still a WIP because the changes made to ComputeStatsGo need to be ported over to C++. This shouldn't be particularly challenging because unlike in Go, where we needed to synthesize the legacy keys, in C++ we just need to make sure we account for their impact on the stats. It also needs more mixed-version cluster testing like that done in cockroachdb#21120.

Release note (performance improvement): Fewer disk writes are required for each database write, increasing write throughput and reducing write latency.
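The "3 keys down to 1" effect described in this commit message can be pictured with a toy model. Everything below is hypothetical: the `store` type, the key strings, and the `upgrade` flag (which stands in for whatever triggers the switch on a Raft application) are made up for illustration and do not reflect the real key encoding or the MigrationIter logic.

```go
package main

import "fmt"

// mvccStats is a tiny stand-in for the real MVCC stats.
type mvccStats struct {
	LiveBytes int64
	KeyCount  int64
}

// rangeAppliedState combines the three pieces of metadata that are
// updated on every Raft application.
type rangeAppliedState struct {
	RaftAppliedIndex  uint64
	LeaseAppliedIndex uint64
	Stats             mvccStats
}

// Hypothetical key names; the real keys are range-local and encoded
// differently.
const (
	keyLegacyRaftApplied  = "legacy/raft-applied-index"
	keyLegacyLeaseApplied = "legacy/lease-applied-index"
	keyLegacyStats        = "legacy/mvcc-stats"
	keyAppliedState       = "range-applied-state"
)

// store is a toy key-value map that counts writes.
type store struct {
	data   map[string]interface{}
	writes int
}

func newStore() *store {
	return &store{data: make(map[string]interface{})}
}

func (s *store) put(k string, v interface{}) {
	s.data[k] = v
	s.writes++
}

// apply records the outcome of one Raft application. Once the combined
// key exists (or upgrade is requested), only that single key is
// written; until then the three legacy keys are written.
func (s *store) apply(st rangeAppliedState, upgrade bool) {
	if _, ok := s.data[keyAppliedState]; ok || upgrade {
		s.put(keyAppliedState, st)
		return
	}
	s.put(keyLegacyRaftApplied, st.RaftAppliedIndex)
	s.put(keyLegacyLeaseApplied, st.LeaseAppliedIndex)
	s.put(keyLegacyStats, st.Stats)
}

func main() {
	s := newStore()
	s.apply(rangeAppliedState{RaftAppliedIndex: 10, LeaseAppliedIndex: 5}, false)
	fmt.Println("writes after legacy application:", s.writes)
	s.apply(rangeAppliedState{RaftAppliedIndex: 11, LeaseAppliedIndex: 6}, true)
	fmt.Println("writes after upgrading application:", s.writes)
	s.apply(rangeAppliedState{RaftAppliedIndex: 12, LeaseAppliedIndex: 7}, false)
	fmt.Println("writes after a later application:", s.writes)
}
```

In this model the legacy keys stay in place after the upgrade, mirroring the commit message's point that stale legacy keys remain useful as markers.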
Fixes cockroachdb#21435. This change disables roachpb.Value checksum computation, leaving the checksum field blank on roachpb.Values. See cockroachdb#21435 (comment) for details on the measured performance improvement from this change. Release note (performance improvement): Unnecessary value checksums are no longer computed, speeding up database writes.
We have marked this issue as stale because it has been inactive for
#21395 disabled verification of the `roachpb.Value` checksums, providing a 15-20% speed boost to large MVCC scans. Checksums are still computed when values are written; they are just not verified. The justification for removing the verification is that TLS and RocksDB checksums already provide adequate coverage. RocksDB checksums have caught disk corruption in the wild. The `roachpb.Value` checksums never have.

Computation of the checksums was left in because disabling it may be problematic. We'd need to verify that checksums are never computed below Raft or perform a migration. But before doing any of that work, we should verify that disabling the checksum provides a speed boost. Workloads to investigate are simple KV operations and bulk insert operations.
@nvanbenschoten for triage.
Jira issue: CRDB-5886