SELECT from a table with hash sharded PK has unexpected index join in query plan #67170
Comments
I'm pretty sure that the problem is that we treat that bytes->string cast as not immutable. I'm not totally sure why this is -- I think it's to deal with cases where the string type has a width. That check is here:
then cockroach/pkg/sql/sem/tree/casts.go Line 176 in 9a11535
The options are: rework the generated expression so that it hashes the bytes directly instead of casting them (seems good), or special-case the cast logic to make the cast immutable if the string has infinite width (also seems good IIUC). The former seems safer and easier.
Yeah so seems like this is a problem for the following types: |
It's because it depends on a knob ( Similarly, for FLOAT there is some precision setting ( TIMESTAMPTZ::STRING contains the current timezone. It can be fixed by doing By the way, the way to find out these things is to find the case in String conversion is a bad idea in other cases too: you can have decimals 1.0 and 1.00 and their strings are different but they are equal. Similar issue with collated strings. Maybe we should extend |
Or, perhaps more elegant - have a |
Alternatively we could expose a function that turns any type into bytes using its key encoding. |
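To make the concern above concrete, here is a rough Go sketch of why hashing a display string is fragile for decimals, and why hashing a canonical encoding avoids the problem. Everything in it is an illustration rather than CockroachDB code: `canonicalBytes` is a hypothetical stand-in for a key-encoding function, `math/big.Rat` stands in for a decimal type, and 32-bit FNV-1a with 8 buckets is only assumed to resemble the shard expression.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/big"
)

// shardCount mirrors the 8 buckets used in the thread's examples (assumption).
const shardCount = 8

// shardOf hashes raw bytes with 32-bit FNV-1a and reduces the result to a bucket.
func shardOf(b []byte) uint32 {
	h := fnv.New32a()
	h.Write(b)
	return h.Sum32() % shardCount
}

// canonicalBytes is a hypothetical stand-in for a key encoding: equal values
// always yield identical bytes, regardless of how the literal was written.
func canonicalBytes(literal string) ([]byte, bool) {
	r, ok := new(big.Rat).SetString(literal)
	if !ok {
		return nil, false
	}
	// Rat.String renders the reduced fraction, e.g. "1/1" for both 1.0 and 1.00.
	return []byte(r.String()), true
}

func main() {
	for _, lit := range []string{"1.0", "1.00"} {
		enc, ok := canonicalBytes(lit)
		if !ok {
			fmt.Println("could not parse", lit)
			continue
		}
		// The display string differs between the two literals; the canonical
		// bytes (and therefore the canonical shard) do not.
		fmt.Printf("%-5s display=%q display-shard=%d canonical=%q canonical-shard=%d\n",
			lit, lit, shardOf([]byte(lit)), enc, shardOf(enc))
	}
}
```

The point of the design is that the shard is a function of the value rather than of its presentation, so two rows that compare equal can never disagree about their shard.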
Jinx
Great minds think alike (and apparently rarely sleep).
Assigning this to schema for triage -- let us know if you need any help from SQL Queries here.
The function we should use to encode the bytes is this one: cockroach/pkg/sql/rowenc/column_type_encoding.go Lines 42 to 52 in 2a91e74
|
I took a stab at this by first adding the function and then trying to hook it up. What I discovered when testing is that we still don't properly propagate the functional dependencies for a bunch of types. The crux of the problem, I think can be summarized with the following example:
CREATE TABLE intkey (
    k INT4,
    shard INT4 AS (mod(fnv32((k::INTERVAL::STRING)), 8)) STORED,
    PRIMARY KEY (shard, k),
    CONSTRAINT c CHECK (shard IN (0, 1, 2, 3, 4, 5, 6, 7))
);
CREATE TABLE floatkey (
    k FLOAT4,
    shard INT4 AS (mod(fnv32((k::INTERVAL::STRING)), 8)) STORED,
    PRIMARY KEY (shard, k),
    CONSTRAINT c CHECK (shard IN (0, 1, 2, 3, 4, 5, 6, 7))
);
We use
Best guess of the moment is that it's something in cockroach/pkg/sql/opt/xform/select_funcs.go Line 393 in b4fad78
cockroach/pkg/sql/opt/xform/general_funcs.go Line 240 in b4fad78
cc @mgartner |
What is the problem in that example? Do you have an example of a bad query plan?
A bit off-topic, but this might change in a future release (we're trying to add support for some "interval style" setting, see #67792). Shouldn't we go down the |
Oh, is the problem that the int version works as intended and float doesn't? I think it may have to do with float being a "composite type". This is so silly and annoying that it hurts me to type it. Composite types are types where you can have non-identical values that are equal. Collated strings are the canonical example but easier to see are decimals (1.0 = 1.00 but they present differently). Float is composite because apparently there is such a thing as |
Yes, I only posted the
Yes |
The optimizer has the concept of whether an expression is "composite-sensitive" or not. We could special case the |
It seems like the
If so, maybe we can adjust our composite-sensitive logic to understand that certain operators are not composite-sensitive even if their input is. |
Yeah, for sure, that was a distraction. I have what I need and am making good progress. Thanks for all the help! |
Note that this only applies when the input is just a variable. In general, the input expression might truly take on unequal values (e.g. |
Isn't this example only composite sensitive because As a comparison, consider |
I've added this logic and some plumbing to the relevant check and now my expression is no longer "sensitive". It still isn't getting used properly.
// Some functions are insensitive to the composite nature of their
// arguments. In this case, we want to allow composite variable references
// to be treated insensitively but all other exprs should get the usual
// treatment.
//
// This will deal with cases like:
//
//   crdb_internal.key_encode(f4)                           -- not sensitive
//   crdb_internal.key_encode(IF(f4::STRING = '-0', 0, 1))  -- sensitive
//
if funcExpr, ok := e.(*FunctionExpr); ok && funcExpr.Properties.CompositeInsensitive {
	args := funcExpr.Args
	for i, n := 0, args.ChildCount(); i < n; i++ {
		if _, isVariable := args.Child(i).(*VariableExpr); isVariable {
			continue
		}
		if canBeSensitive(args.Child(i)) {
			return true
		}
	}
	return false
}
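A standalone toy version of the rule in that snippet may help readers who don't have the optimizer code open. It is only a sketch under loud assumptions: the `expr`, `variable`, `castToString`, and `funcCall` types are hypothetical and far simpler than the optimizer's real expression types, and the toy conservatively flags every reference to a composite-typed variable unless it is a direct argument of a function marked composite-insensitive.

```go
package main

import "fmt"

// expr is a hypothetical, heavily simplified expression node.
type expr interface{ children() []expr }

// variable is a column reference; composite marks types such as FLOAT or DECIMAL.
type variable struct{ composite bool }

// castToString models an expression like f4::STRING.
type castToString struct{ input expr }

// funcCall models a function application.
type funcCall struct {
	name                 string
	compositeInsensitive bool
	args                 []expr
}

func (variable) children() []expr       { return nil }
func (c castToString) children() []expr { return []expr{c.input} }
func (f funcCall) children() []expr     { return f.args }

// canBeSensitive reports whether e might produce different results for inputs
// that are equal but not identical (toy rule, see the assumptions above).
func canBeSensitive(e expr) bool {
	switch t := e.(type) {
	case variable:
		return t.composite
	case funcCall:
		if t.compositeInsensitive {
			for _, arg := range t.args {
				// A bare variable argument is fine for an insensitive function.
				if _, isVar := arg.(variable); isVar {
					continue
				}
				// Anything else gets the usual treatment.
				if canBeSensitive(arg) {
					return true
				}
			}
			return false
		}
	}
	for _, c := range e.children() {
		if canBeSensitive(c) {
			return true
		}
	}
	return false
}

func main() {
	f4 := variable{composite: true}
	direct := funcCall{name: "key_encode", compositeInsensitive: true, args: []expr{f4}}
	viaCast := funcCall{name: "key_encode", compositeInsensitive: true, args: []expr{castToString{f4}}}
	fmt.Println(canBeSensitive(direct))  // false: bare variable argument is skipped
	fmt.Println(canBeSensitive(viaCast)) // true: the cast can observe the difference
}
```

Note that the skip applies only to direct variable arguments, which matches the caveat raised earlier in the thread: a more complex argument expression may itself take on unequal values for equal inputs.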
With @RaduBerinde's help narrowed it down to cockroach/pkg/sql/opt/xform/general_funcs.go Line 317 in 6cdb10c
Will clean up my PR a bit. The root of the problem is that we never fold constant expressions for types that might be composite. Consider:
vs.
|
What I'm saying is that |
I didn't know how to deal with that one without more machinery. |
Suppose I have a table with a hash-sharded PK and a secondary index as follows:
The plan for the following query on table t contains an unexpected index join:
However, if the column type for k2 is changed from BYTES to STRING, then the query plan only scans the primary index:
Epic: CRDB-7363