kvserver: properly handle DeleteRange in batches that hit WriteTooOld errors #71236
@AlexTalks and I discussed what to do about this. KV batches are never used in a way that hits this currently, and the error already means that this won't introduce silent corruption, so there's not a lot of urgency to support this. Instead, we decided to update the error message to point at this issue as a possible suspect and then place the resolution of this on hold. We also decided that when updating the error message, we'll update the errors in |
If the plan is to update an error message, can we pick that up quickly and close this out?
Enhance the error message on sequence number errors when replaying a transactional batch with a link to the possible cause, cockroachdb#71236, stemming from an issue where a `DelRange` operation finds new keys to delete upon replay. This also changes the error from a generic error to an `AssertionFailed` error. Release note: None
The update to the error message is in #73496, however I believe that @nvanbenschoten and I discussed leaving this issue open as a reference to the actual issue here (which is that |
73496: storage: add issue for sequence number errors on replaying `DelRange` r=AlexTalks a=AlexTalks

Enhance the error message on sequence number errors when replaying a transactional batch with a link to the possible cause, #71236, stemming from an issue where a `DelRange` operation finds new keys to delete upon replay. This also changes the error from a generic error to an `AssertionFailed` error.

Release note: None

73578: storage: remove leftover logic related to interleaved intents r=sumeerbhola a=nvanbenschoten

This commit is a follow-up to #72536. It addresses a few of the remaining items left over from removing the bulk of the interleaved intent logic. Specifically, it removes:
- the `PrecedingIntentState` type
- the `PrecedingIntentState` parameter in `Writer.ClearIntent`
- the `Writer.OverrideTxnDidNotUpdateMetaToFalse` method
- the `txnDidNotUpdateMetaHelper` type

The commit does not include any behavioral changes.

73591: ui: show per-node series for "Read Amplification" and "SSTables" graphs r=dhartunian a=nvanbenschoten

This commit addresses a longstanding usability issue with the Storage dashboard. Previously, the dashboard would show the average read amplification and the average sstable count across the cluster. When looking at these metrics, we are specifically interested in the outliers, so this made little sense. As a result, a few of our runbooks (e.g. [RocksDB inverted LSM](https://cockroachlabs.atlassian.net/wiki/spaces/TS/pages/1157890147/RocksDB+inverted+LSM)) require operators to grab custom graphs with the "Per Node" option.

This commit fixes this by splitting these graphs out to show per-node series.

_Example:_ <img width="1132" alt="Screen Shot 2021-12-07 at 10 24 14 PM" src="https://user-images.githubusercontent.com/5438456/145142909-0babdd04-54a6-46d3-9d4e-002a2d375811.png">

Co-authored-by: Alex Sarkesian <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
This shouldn't have an effect because we still zero out `config.Ops.Batch` owing to cockroachdb#46081. See also cockroachdb#71236. Release note: None
We have marked this issue as stale because it has been inactive for |
Currently, batches that have `DeleteRange` may read a different set of keys to delete if they have subsequent blind writes that hit a `WriteTooOld` error (and leave intents behind). If this happens and one of the keys that need to be deleted at the refreshed timestamp already has an intent on it, we won't be able to replay the batch as an intent that should be generated by the `DeleteRange` won't be present in the intent history.

This manifests in errors such as:

An example batch that could hit this error is below, assuming the `Put` operation hits a `WriteTooOld` error:

This was discovered upon adding `DeleteRange` to kvnemesis in #68003.

Jira issue: CRDB-10452
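The failure mode in the description above can be illustrated with a small toy model. This is only a sketch under loud assumptions: `toyStore`, `write`, `deleteRange`, `put`, and `runBatch` are hypothetical names, the timestamps and sequence numbers are made up, and none of this is CockroachDB's actual evaluation code or the batch from the original report. It only mimics the idea that a write arriving at a sequence number at or below the key's existing intent is treated as a replay and must be found in the intent history.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

var errWriteTooOld = errors.New("WriteTooOld")

// toyStore is a hypothetical, heavily simplified MVCC store seen by one txn.
type toyStore struct {
	committed map[string]int   // key -> timestamp of newest committed value
	intents   map[string][]int // key -> sequence numbers in our intent history
}

// write lays down an intent for key at sequence seq, or validates a replay.
func (s *toyStore) write(key string, seq int) error {
	hist := s.intents[key]
	if n := len(hist); n > 0 && hist[n-1] >= seq {
		// An intent at an equal or higher sequence already exists, so this
		// write is treated as a replay and must be in the intent history.
		for _, h := range hist {
			if h == seq {
				return nil
			}
		}
		return fmt.Errorf("key %q: sequence %d missing from intent history %v", key, seq, hist)
	}
	s.intents[key] = append(hist, seq)
	return nil
}

// deleteRange deletes every key in [start, end) with a committed value visible at ts.
func (s *toyStore) deleteRange(start, end string, ts, seq int) error {
	var keys []string
	for k, cts := range s.committed {
		if k >= start && k < end && cts <= ts {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	for _, k := range keys {
		if err := s.write(k, seq); err != nil {
			return err
		}
	}
	return nil
}

// put is a blind write: it leaves an intent behind even when it hits WriteTooOld.
func (s *toyStore) put(key string, ts, seq int) error {
	if err := s.write(key, seq); err != nil {
		return err
	}
	if s.committed[key] > ts {
		return errWriteTooOld
	}
	return nil
}

// runBatch evaluates the toy batch {DeleteRange [a, c) at seq 1, Put b at seq 2}.
func runBatch(s *toyStore, ts int) error {
	if err := s.deleteRange("a", "c", ts, 1); err != nil {
		return err
	}
	return s.put("b", ts, 2)
}

func main() {
	s := &toyStore{
		committed: map[string]int{"a": 5, "b": 15},
		intents:   map[string][]int{},
	}
	fmt.Println("first attempt at ts=10:", runBatch(s, 10))
	fmt.Println("replay at ts=16:", runBatch(s, 16))
}
```

In this sketch the first attempt at timestamp 10 only sees key `a` in the range, and the `Put` on `b` leaves an intent at sequence 2 before reporting `WriteTooOld`. When the batch is replayed at timestamp 16, the `DeleteRange` now also finds `b`, whose intent history contains sequence 2 but not sequence 1, so the replay fails on that key.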