Feature Request: Ignore all write_only vindexes when query planning #7336

Open
jmoldow opened this issue Jan 21, 2021 · 1 comment

@jmoldow
Contributor

jmoldow commented Jan 21, 2021

Feature Description

Right now, the Vitess query planner / EXPLAIN information has no way to distinguish between a vindex that, for a given parametrized query:

  • Happens to be hitting all shards for the requested parameter values and for the current data in the lookup tables; vs.
  • Is always guaranteed to scatter to all shards, no matter the parameter values and the data in the lookup tables

This explains the behavior observed in #7328:

  • EXPLAIN FORMAT=vitess reported SelectEqualUnique because, as far as it knew, a unique vindex was being used on a single value.
  • vtexplain reported that the query would go to all shards, because Map for the write_only lookup vindex returned a list containing all the shards.
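The mismatch between the two tools can be sketched with a toy model. This is only an illustration under stated assumptions, not Vitess code: `LookupUnique`, `plan_opcode`, the shard names, and the lookup rows are all hypothetical stand-ins. It shows a plan-time view that looks only at "unique vindex, single value", next to a run-time `map` that returns every shard while the vindex is write_only.

```python
from dataclasses import dataclass, field

# Hypothetical 4-shard keyspace, used only for this illustration.
ALL_SHARDS = ["-40", "40-80", "80-c0", "c0-"]


@dataclass
class LookupUnique:
    """Toy stand-in for a unique lookup vindex; not the Vitess implementation."""
    write_only: bool = False
    rows: dict = field(default_factory=dict)  # toy lookup table: value -> shard

    def map(self, value):
        # As described in the issue: while write_only, the lookup table
        # cannot be trusted yet, so every shard is returned.
        if self.write_only:
            return list(ALL_SHARDS)
        shard = self.rows.get(value)
        return [shard] if shard is not None else []


def plan_opcode(vindex):
    # The plan-time view modeled here never inspects write_only, so both
    # vindexes below are reported with the same opcode.
    return "SelectEqualUnique"


ready = LookupUnique(rows={42: "40-80"})
backfilling = LookupUnique(write_only=True)

print(plan_opcode(ready), ready.map(42))
print(plan_opcode(backfilling), backfilling.map(42))
```

Both calls to `plan_opcode` agree, while the two `map` results differ: one shard for the populated vindex, all shards for the write_only one.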

If the Vitess query planner instead had special support for understanding and handling write_only, such vindexes could be ignored entirely during query planning. Rather than being indistinguishable from the two cases above, a write_only vindex would fall into a third category, and these problematic queries would fall through to all-shards scatter queries, which would be reported as such in EXPLAIN FORMAT=vitess, vtexplain, and /debug/scatter_stats.

As a bonus, EXPLAIN FORMAT=vitess, vtexplain, and /debug/scatter_stats should let you know that a candidate vindex is available but still in write_only mode. That would make the situation even clearer.
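A rough sketch of what this request asks for, assuming a much-simplified planner: the `Vindex` class, the `plan` function, and the shape of its return value are hypothetical, and only the opcode names mirror the ones mentioned in this thread. Write_only vindexes are filtered out before a route is chosen, and their names are returned so that a tool could surface them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vindex:
    # Hypothetical, simplified description of a vindex on one column.
    name: str
    column: str
    unique: bool
    write_only: bool = False


def plan(vindexes, where_column):
    """Return (opcode, chosen vindex name or None, skipped write_only names)."""
    candidates = [v for v in vindexes if v.column == where_column]
    skipped = [v.name for v in candidates if v.write_only]
    usable = [v for v in candidates if not v.write_only]
    if not usable:
        # Third category: nothing usable, so the plan is an explicit scatter.
        return "SelectScatter", None, skipped
    # Prefer a unique vindex when one is usable (False sorts before True).
    chosen = sorted(usable, key=lambda v: not v.unique)[0]
    opcode = "SelectEqualUnique" if chosen.unique else "SelectEqual"
    return opcode, chosen.name, skipped


new_lookup = Vindex("b_lookup", "b", unique=True, write_only=True)
print(plan([new_lookup], "b"))
```

With only a write_only vindex on the filtered column, the sketch reports a scatter and lists `b_lookup` as skipped, which is the kind of hint the tools could display.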

Use Case(s)

Fix for #7328.

@systay self-assigned this Jan 21, 2021
@jmoldow
Contributor Author

jmoldow commented Jan 21, 2021

One other interesting thing here.

Suppose there is a query with WHERE a=? AND b=?. Suppose this query currently utilizes a Lookup NonUnique vindex on column a.

If one runs CreateLookupVindex to create a Lookup Unique vindex on column b, then the new vindex will have a lower cost (https://vitess.io/docs/reference/features/vindexes/#cost) and will therefore be chosen.

But because the new vindex is write_only, its Map function will return all shards, leading to an all-shards scatter. Had the original Lookup NonUnique vindex on column a been chosen, the query wouldn't necessarily scatter to all shards.

If write_only vindexes were ignored by the planner, the original Lookup NonUnique vindex would continue to be used until ExternalizeVindex is run on the new vindex.

I haven't tried to repro this, but it makes sense, based on what I know, that this bug would manifest as described.
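The scenario can be modeled in a few lines, with the caveat that this is a toy model and not a repro against Vitess. The `Vindex` class and `choose` function are hypothetical, and the cost values (10 for Lookup Unique, 20 for Lookup NonUnique) are what I believe the linked cost table lists, so treat them as illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vindex:
    # Hypothetical, simplified description of a vindex on one column.
    name: str
    column: str
    cost: int
    write_only: bool = False


def choose(vindexes, where_columns, skip_write_only):
    """Pick the lowest-cost vindex covering one of the WHERE columns."""
    candidates = [
        v for v in vindexes
        if v.column in where_columns and not (skip_write_only and v.write_only)
    ]
    return min(candidates, key=lambda v: v.cost, default=None)


# Assumed costs: Lookup NonUnique = 20, Lookup Unique = 10.
a_lookup = Vindex("a_lookup", "a", cost=20)                   # existing Lookup NonUnique
b_lookup = Vindex("b_lookup", "b", cost=10, write_only=True)  # new Lookup Unique

current = choose([a_lookup, b_lookup], {"a", "b"}, skip_write_only=False)
proposed = choose([a_lookup, b_lookup], {"a", "b"}, skip_write_only=True)
print(current.name, proposed.name)
```

In this model the cost-only choice picks `b_lookup`, whose Map would scatter while it is write_only, whereas skipping write_only vindexes keeps `a_lookup` in use.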
