Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util/encoding: types JSON and SENTINEL_TYPE have the same value #25759

Closed
rjnn opened this issue May 21, 2018 · 1 comment
Closed

util/encoding: types JSON and SENTINEL_TYPE have the same value #25759

rjnn opened this issue May 21, 2018 · 1 comment
Labels
A-sql-encoding Relating to the SQL/KV encoding. C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.

Comments

@rjnn
Copy link
Contributor

rjnn commented May 21, 2018

This seems... like a bug? We have an iota enum for all the types, but we special case SENTINEL_TYPE because Do not change SentinelType from 15. This value is specifically used for bit manipulation in EncodeValueTag.

But then we define JSON afterwards, it also has value 15.

Here is the output of the following code:

	fmt.Printf("%d, ", Unknown)
	fmt.Printf("%d, ", Null)
	fmt.Printf("%d, ", NotNull)
	fmt.Printf("%d, ", Int)
	fmt.Printf("%d, ", Float)
	fmt.Printf("%d, ", Decimal)
	fmt.Printf("%d, ", Bytes)
	fmt.Printf("%d, ", BytesDesc)
	fmt.Printf("%d, ", Time)
	fmt.Printf("%d, ", Duration)
	fmt.Printf("%d, ", True)
	fmt.Printf("%d, ", False)
	fmt.Printf("%d, ", UUID)
	fmt.Printf("%d, ", Array)
	fmt.Printf("%d, ", IPAddr)
	fmt.Printf("%d, ", SentinelType)
	fmt.Printf("%d, ", JSON)
 ---
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 15

There is a helpful // TODO(dan): Make this into a proto enum. above the const declaration, which is possibly worth investing in. A proto definition would have avoided these problems.

I would appreciate some commentary on what's going on here. Looking at the code paths, there isn't any overlap so there's no extant bug based on this shadowing, but we should probably bump the JSON type value up to 16,or take the opportunity to proto-ify this code.

cc @knz

@rjnn rjnn added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-sql-encoding Relating to the SQL/KV encoding. labels May 21, 2018
@knz
Copy link
Contributor

knz commented May 21, 2018

I think the code is correct. The problem is the explanation in the comment to SentinelType.

Namely:

  • SentinelType is not meant to be a special value reserved away from the list of valid type IDs
  • SentinelType does not describe a type enum value (see below)
  • what matters is that all the type enum values are distinct from each other, and that seems to still be the case (and that's why the code is correct overall)

What's going on here is that the protobuf encoding has special casing for values up to and including 15. Namely all those values are encoded in a single byte. Values above that are encoded in multiple bytes.

The constant SentinelType is meant as a conditional to mark the cut-off value that changes the protobuf encoding path.

I think the name "SentinelType" is not sufficiently descriptive, and also the comment next to it is not describing what's going on sufficiently. I have not yet thought enough about this to make an educated proposal.

@rjnn rjnn added C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. and removed C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. labels May 21, 2018
rjnn pushed a commit to rjnn/cockroach that referenced this issue Jun 5, 2018
The existence of the SentinelType, which overlaps in value with other
values in the enum confuses readers.

Fixes cockroachdb#25759.

Release note: None
rjnn pushed a commit to rjnn/cockroach that referenced this issue Jun 6, 2018
The existence of the SentinelType, which overlaps in value with other
values in the enum confuses readers.

Fixes cockroachdb#25759.

Release note: None
craig bot pushed a commit that referenced this issue Jun 6, 2018
26430: util: explicitly enumerate encoding types r=arjunravinarayan a=arjunravinarayan

The existence of the SentinelType, which overlaps in value with other
values in the enum confuses readers. Also improve the comment on SentinelType.

Fixes #25759.


26466: distsql: beef up lookup join tests r=solongordon a=solongordon

I improved our lookup join tests by adding unit tests for left outer
joins and distributing the logic tests across multiple nodes.

Fixes #25862

Release note: None

Co-authored-by: Arjun Narayan <[email protected]>
Co-authored-by: Solon Gordon <[email protected]>
@craig craig bot closed this as completed in #26430 Jun 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-encoding Relating to the SQL/KV encoding. C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.
Projects
None yet
Development

No branches or pull requests

2 participants