-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stop using protos as the expected value in CPut #38308
Labels
A-kv-client
Relating to the KV client and the KV interface.
C-cleanup
Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.
Comments
danhhz
added
C-cleanup
Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.
A-kv-client
Relating to the KV client and the KV interface.
labels
Jun 19, 2019
danhhz
added a commit
to danhhz/cockroach
that referenced
this issue
Jun 19, 2019
In cockroachdb#38147, a new non-nullable field was added to the ReplicaDescriptor proto that is serialized inside RangeDescriptors. RangeDescriptors are updated using CPut to detect races. This means that if a RangeDescriptor had been written by an old version of cockroach and we then attempt to update it, the CPut will fail because the encoded version of it is different (non-nullable proto2 fields are always included). A similar issue was introduced in cockroachdb#38004 which made the StickyBit field on RangeDescriptor non-nullable. We could keep the fields as nullable, but this has allocation costs (and is also more annoying to work with). Worse, the proto spec is pretty explicit about not relying on serialization of a given message to always produce exactly the same bytes: From https://developers.google.com/protocol-buffers/docs/encoding#implications Do not assume the byte output of a serialized message is stable. So instead, we've decided to stop CPut-ing protos and to change the interface of CPut to take the expected value as bytes. To CPut a proto, the encoded value will be read from kv, passed around with the decoded struct, and used in the eventual CPut. This work is upcoming, but the above two PRs are real breakages, so this commit fixes those first. Neither of these PRs made the last alpha so no release note. Touches cockroachdb#38308 Release note: None
danhhz
added a commit
to danhhz/cockroach
that referenced
this issue
Jun 20, 2019
In cockroachdb#38147, a new non-nullable field was added to the ReplicaDescriptor proto that is serialized inside RangeDescriptors. RangeDescriptors are updated using CPut to detect races. This means that if a RangeDescriptor had been written by an old version of cockroach and we then attempt to update it, the CPut will fail because the encoded version of it is different (non-nullable proto2 fields are always included). A similar issue was introduced in cockroachdb#38004 which made the StickyBit field on RangeDescriptor non-nullable. We could keep the fields as nullable, but this has allocation costs (and is also more annoying to work with). Worse, the proto spec is pretty explicit about not relying on serialization of a given message to always produce exactly the same bytes: From https://developers.google.com/protocol-buffers/docs/encoding#implications Do not assume the byte output of a serialized message is stable. So instead, we've decided to stop CPut-ing protos and to change the interface of CPut to take the expected value as bytes. To CPut a proto, the encoded value will be read from kv, passed around with the decoded struct, and used in the eventual CPut. This work is upcoming, but the above two PRs are real breakages, so this commit fixes those first. Neither of these PRs made the last alpha so no release note. Touches cockroachdb#38308 Release note: None
craig bot
pushed a commit
that referenced
this issue
Jun 20, 2019
38302: storage: CPut bytes instead of a proto when updating RangeDescriptors r=tbg a=danhhz In #38147, a new non-nullable field was added to the ReplicaDescriptor proto that is serialized inside RangeDescriptors. RangeDescriptors are updated using CPut to detect races. This means that if a RangeDescriptor had been written by an old version of cockroach and we then attempt to update it, the CPut will fail because the encoded version of it is different (non-nullable proto2 fields are always included). A similar issue was introduced in #38004 which made the StickyBit field on RangeDescriptor non-nullable. We could keep the fields as nullable, but this has allocation costs (and is also more annoying to work with). Worse, the proto spec is pretty explicit about not relying on serialization of a given message to always produce exactly the same bytes: From https://developers.google.com/protocol-buffers/docs/encoding#implications Do not assume the byte output of a serialized message is stable. So instead, we've decided to stop CPut-ing protos and to change the interface of CPut to take the expected value as bytes. To CPut a proto, the encoded value will be read from kv, passed around with the decoded struct, and used in the eventual CPut. This work is upcoming, but the above two PRs are real breakages, so this commit fixes those first. Neither of these PRs made the last alpha so no release note. Touches #38308 Closes #38183 Release note: None Co-authored-by: Daniel Harrison <[email protected]>
danhhz
added a commit
to danhhz/cockroach
that referenced
this issue
Aug 31, 2019
The signature of CPut was changed so the expected value is now *roachpb.Value instead of interface{}. This was primarily motivated by protos because 1) the protobuf spec specifically mentions not to rely on the serialized version being stable and 2) we had actual issues with backward compatibility when trying to add new non-nullable fields as described in cockroachdb#38308. Instead, we now use the roachpb.Value bytes actually fetched from kv. Touches cockroachdb#38308 Release note: None
danhhz
added a commit
to danhhz/cockroach
that referenced
this issue
Aug 31, 2019
The signature of CPut was changed so the expected value is now *roachpb.Value instead of interface{}. This was primarily motivated by protos because 1) the protobuf spec specifically mentions not to rely on the serialized version being stable and 2) we had actual issues with backward compatibility when trying to add new non-nullable fields as described in cockroachdb#38308. Instead, we now use the roachpb.Value bytes actually fetched from kv. Touches cockroachdb#38308 Release note: None
danhhz
added a commit
to danhhz/cockroach
that referenced
this issue
Sep 4, 2019
The signature of CPut was changed so the expected value is now *roachpb.Value instead of interface{}. This was primarily motivated by protos because 1) the protobuf spec specifically mentions not to rely on the serialized version being stable and 2) we had actual issues with backward compatibility when trying to add new non-nullable fields as described in cockroachdb#38308. Instead, we now use the roachpb.Value bytes actually fetched from kv. Touches cockroachdb#38308 Release note: None
danhhz
added a commit
to danhhz/cockroach
that referenced
this issue
Sep 4, 2019
The signature of CPut was changed so the expected value is now *roachpb.Value instead of interface{}. This was primarily motivated by protos because 1) the protobuf spec specifically mentions not to rely on the serialized version being stable and 2) we had actual issues with backward compatibility when trying to add new non-nullable fields as described in cockroachdb#38308. Instead, we now use the roachpb.Value bytes actually fetched from kv. Touches cockroachdb#38308 Release note: None
craig bot
pushed a commit
that referenced
this issue
Sep 4, 2019
40357: storage/engine: fix unnecessary C.ulonglong conversion r=nvanbenschoten a=nvanbenschoten Something recently shook up the build, causing the following lint warnings to start firing: ``` --- FAIL: TestLint/TestRoachLint (56.77s) lint_test.go:77: /Users/nathan/Go/src/github.com/cockroachdb/cockroach/pkg/storage/engine/rocksdb.go:2216:35: unnecessary conversion lint_test.go:77: /Users/nathan/Go/src/github.com/cockroachdb/cockroach/pkg/storage/engine/rocksdb.go:2217:35: unnecessary conversion ``` Release note: None 40397: *: migrate more uses of CPutDeprecated r=tbg,dt a=danhhz The signature of CPut was changed so the expected value is now *roachpb.Value instead of interface{}. This was primarily motivated by protos because 1) the protobuf spec specifically mentions not to rely on the serialized version being stable and 2) we had actual issues with backward compatibility when trying to add new non-nullable fields as described in #38308. Instead, we now use the roachpb.Value bytes actually fetched from kv. Touches #38308 Release note: None Co-authored-by: Nathan VanBenschoten <[email protected]> Co-authored-by: Daniel Harrison <[email protected]>
The problem has been fixed for |
andreimatei
added a commit
to andreimatei/cockroach
that referenced
this issue
May 15, 2020
Fixes cockroachdb#38308 Updates to the liveness records are done as CPuts. Before this patch, the CPuts' expected value was the re-marshaled proto with the previous version. As cockroachdb#38308 explains, that's a bad practice since it prevents the proto's encoding to change in any way (e.g. fields can't be removed). Instead, this patch moves to keeping track of the raw bytes that have been read from the DB, besides the unmarshalled liveness protos. The raw bytes are used as the expected values. Release note: None
andreimatei
added a commit
to andreimatei/cockroach
that referenced
this issue
May 19, 2020
Fixes cockroachdb#38308 Updates to the liveness records are done as CPuts. Before this patch, the CPuts' expected value was the re-marshaled proto with the previous version. As cockroachdb#38308 explains, that's a bad practice since it prevents the proto's encoding to change in any way (e.g. fields can't be removed). Instead, this patch moves to keeping track of the raw bytes that have been read from the DB, besides the unmarshalled liveness protos. The raw bytes are used as the expected values. Release note: None
andreimatei
added a commit
to andreimatei/cockroach
that referenced
this issue
May 21, 2020
Fixes cockroachdb#38308 Updates to the liveness records are done as CPuts. Before this patch, the CPuts' expected value was the re-marshaled proto with the previous version. As cockroachdb#38308 explains, that's a bad practice since it prevents the proto's encoding to change in any way (e.g. fields can't be removed). Instead, this patch moves to keeping track of the raw bytes that have been read from the DB, besides the unmarshalled liveness protos. The raw bytes are used as the expected values. Release note: None
irfansharif
pushed a commit
to irfansharif/cockroach
that referenced
this issue
Jun 12, 2020
Fixes cockroachdb#38308 Updates to the liveness records are done as CPuts. Before this patch, the CPuts' expected value was the re-marshaled proto with the previous version. As cockroachdb#38308 explains, that's a bad practice since it prevents the proto's encoding to change in any way (e.g. fields can't be removed). Instead, this patch moves to keeping track of the raw bytes that have been read from the DB, besides the unmarshalled liveness protos. The raw bytes are used as the expected values. Release note: None
irfansharif
pushed a commit
to irfansharif/cockroach
that referenced
this issue
Jun 17, 2020
Fixes cockroachdb#38308 Updates to the liveness records are done as CPuts. Before this patch, the CPuts' expected value was the re-marshaled proto with the previous version. As cockroachdb#38308 explains, that's a bad practice since it prevents the proto's encoding to change in any way (e.g. fields can't be removed). Instead, this patch moves to keeping track of the raw bytes that have been read from the DB, besides the unmarshalled liveness protos. The raw bytes are used as the expected values. Release note: None
andreimatei
added a commit
to andreimatei/cockroach
that referenced
this issue
Jun 26, 2020
Fixes cockroachdb#38308 Updates to the liveness records are done as CPuts. Before this patch, the CPuts' expected value was the re-marshaled proto with the previous version. As cockroachdb#38308 explains, that's a bad practice since it prevents the proto's encoding to change in any way (e.g. fields can't be removed). Instead, this patch moves to keeping track of the raw bytes that have been read from the DB, besides the unmarshalled liveness protos. The raw bytes are used as the expected values. Release note: None
craig bot
pushed a commit
that referenced
this issue
Jun 29, 2020
49135: kvserver: don't use marshaled proto for liveness CPut r=andreimatei a=andreimatei Fixes #38308 Updates to the liveness records are done as CPuts. Before this patch, the CPuts' expected value was the re-marshaled proto with the previous version. As #38308 explains, that's a bad practice since it prevents the proto's encoding to change in any way (e.g. fields can't be removed). Instead, this patch moves to keeping track of the raw bytes that have been read from the DB, besides the unmarshalled liveness protos. The raw bytes are used as the expected values. Release note: None 50658: geopb: rename Shape to ShapeType r=sumeerbhola a=otan Rename Shape to ShapeType, in preparation for a new Shape protobuf. Release note: None 50671: build: upgrade to Go 1.14 r=petermattis a=otan Minor fixes required: * Upgrade go.mod to prevent build.Import errors. * Updated instructions to upgrade Go. * Updated .pb.go files - these are tied with the gofmt of the environment and we cannot fix this. * Update storage use cast DBSlice to DBString or it fails S1016 of staticcheck. * Make lint fail if the import directory is not found. * Fix acceptance test to include GOROOT or else build.Import fails from within the acceptance tests. The acceptance test is weirdly set up to run a Go binary from the outside container. Checklist: * [x] Adjust version in Docker image ([source](./builder/Dockerfile#L199-L200)). * [x] Rebuild the Docker image and bump the `version` in `builder.sh` accordingly ([source](./builder.sh#L6)). * [x] Bump the version in `go-version-check.sh` ([source](./go-version-check.sh)), unless bumping to a new patch release. * [x] Bump the default installed version of Go in `bootstrap-debian.sh` ([source](./bootstrap/bootstrap-debian.sh#L40-42)). * [ ] Update the `builder.dockerImage` parameter in the TeamCity [`Cockroach`](https://teamcity.cockroachdb.com/admin/editProject.html?projectId=Cockroach&tab=projectParams) and [`Internal`](https://teamcity.cockroachdb.com/admin/editProject.html?projectId=Internal&tab=projectParams) projects. (I will do the last step after this has merged) Resolves #48737. Release note (general change): Bump the Go version to 1.14. 50675: httputil: remove req_go112.go r=knz a=otan Release note: None Co-authored-by: Andrei Matei <[email protected]> Co-authored-by: Oliver Tan <[email protected]>
craig bot
pushed a commit
that referenced
this issue
Aug 1, 2022
84420: sql/logictests: don't fail parent test when using retry r=knz a=stevendanna testutils.SucceedsSoon calls Fatal() on the passed testing.T. Here, we were calling SucceedsSoon with the root T, which in the case of a subtest resulted in this error showing up in our logs testing.go:1169: test executed panic(nil) or runtime.Goexit: subtest may have called FailNow on a parent test This moves to using SucceedsSoonError so that we can process errors using the same formatting that we use elsewhere. Release note: None 85120: roachpb: make range/replica descriptor fields non-nullable r=pavelkalinnikov a=erikgrinaker This patch makes all fields in range/replica descriptors non-nullable, fixing a long-standing TODO. Additionally, this fixes a set of bugs where code would use regular comparison operators (e.g. `==`) to compare replica descriptors, which with nullable pointer fields would compare the memory locations rather than the values. One practical consequence of this was that the DistSender would fail to use a new leaseholder with a non-`VOTER` type (e.g. `VOTER_INCOMING`), instead continuing to try other replicas before backing off. However, there are further issues surrounding this bug and it will be addressed separately in a way that is more amenable to backporting. The preparatory work for this was done in ea720e3. Touches #85060. Touches #38308. Touches #38465. Release note: None 85352: opt: revert data race fix r=mgartner a=mgartner This commit reverts #37972. We no longer lazily build filter props and share them across multiple threads. Release note: None Co-authored-by: Steven Danna <[email protected]> Co-authored-by: Erik Grinaker <[email protected]> Co-authored-by: Marcus Gartner <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-kv-client
Relating to the KV client and the KV interface.
C-cleanup
Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.
#38147 and #38004 each introduced a regression that broke RangeDescriptor updates (replica changes, splits, merges) after a cluster was upgraded from a prior version because of a bad interaction between CPut and protos.
Both bugs where the same general idea: A non-nullable field was added to RangeDescriptor (or ReplicaDescriptor, which is stored on RangeDescriptor). If a RangeDescriptor was written by an old version of cockroach, it wouldn't have the new field. Whenever we update RangeDescriptor, we do it with CPut to catch races. So a new binary would read the old serialized descriptor from kv (which doesn't have the new field), parse it into a RangeDescriptor, compute the update, then try to CPut the new RangeDescriptor. Proto2 includes all fields in the serialization, including ones with the default value, so it would include the new field. This wouldn't match what was in kv and the CPut would return a ConditionFailedError.
We could work around these two bugs by adding the fields as nullable, but that has an allocation cost. Worse though, the proto spec is pretty explicit about not relying on serialization of a given message to always produce exactly the same bytes. From https://developers.google.com/protocol-buffers/docs/encoding#implications
Instead, what we should do is fetch the raw roachpb.Value from kv for the old RangeDescriptor, parse it into a RangeDescriptor and use that to compute the update, but also pass along the actual roachpb.Value we got. Then the CPut will use the roachpb.Value, which is guaranteed to match. If we need to do any equality of protos, we should be doing it using the Equal method on the parsed versions.
Further, there are more places where we CPut using a parsed proto message as the expected value, which are also footguns waiting to happen. We'll fix all of them and prevent future occurrences by changing the signature of CPut to accept a
*roachpb.Value
instead ofinterface{}
for the expected value.The text was updated successfully, but these errors were encountered: