Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query fails to encode on literal UUID #4289

Closed
josevalim opened this issue Sep 30, 2023 · 7 comments · Fixed by #4290
Closed

Query fails to encode on literal UUID #4289

josevalim opened this issue Sep 30, 2023 · 7 comments · Fixed by #4290
Labels

Comments

@josevalim
Copy link
Member

Elixir version

All

Database and Version

PostgreSQL (all)

Ecto Versions

All

Database Adapter and Versions (postgrex, myxql, etc)

postgrex

Current behavior

A query on a Ecto.UUID field, where the UUID is literal, emits SQL with invalid bytes:

where([s], s.uuid == "017f65d1-80bd-152d-f997-afa6dd33a00f")

The reason why this happens is because the Planner calls Ecto.UUID.dump, which returns a binary, which we then promptly inject into the query.

Ecto actually has a mechanism to deal with these, called Ecto.Query.Tagged, and the original parameter is wrapped in such a struct, but the struct is removed during the Planner.prewalk time. The only reason why this doesn't happen to binaries is because we explicit keep the tagged struct:

  defp prewalk(%Ecto.Query.Tagged{value: v, type: type} = tagged, kind, query, expr, acc, adapter) do
    if Ecto.Type.base?(type) do
      {tagged, acc}
    else
      {dump_param(kind, query, expr, v, type, adapter), acc}
    end
  end

Expected behavior

We should probably dump the param and check its underlying type, if the value is a string and the underlying type is not a string (or any), then we should probably keep the tagged struct. Note we would still need to support Tagged UUIDs in EctoSQL.

@josevalim
Copy link
Member Author

I think I was able to tackle this within EctoSQL: elixir-ecto/ecto_sql#562

@josevalim
Copy link
Member Author

Nah, the PR doesn't work, because the dump logic is used both for queries and parameters, and now we have %Ecto.Query.Tagged{...} inside parameters. We could pre-process the parameters but those can get quite expensive because they may be nested inside lists too.

My suggestion above of always tagging in the planner is also bad for similar reasons. First, we don't know which tag to use (saying it has type uuid is not quite correct - it is a binary). It seems that we'd actually need to pass more information to the adapters, so we say the value will be embedded as part of a query.

The other option is to say this is just not supported. Ecto will emit a invalid query and Postgrex should better handle the case invalid queries are given in its exceptions.

@greg-rychlewski
Copy link
Member

greg-rychlewski commented Sep 30, 2023

This looks like it is the same underlying issue as the one reported here: elixir-ecto/ecto_sql#520

I was thinking of using tagging too but when I started thinking about it more there are some pretty difficult complications:

  1. It's not just Ecto.UUID but actually any type where the dump is not compatible with the text representation I was able to reproduce it for :binary_id and I believe any custom type that has the same properties will do the same.
  2. It is actually adapter dependent. Postgrex will complain about the current behaviour but MySQL won't.

I'm not sure what the exact answer is but it seems it should be some combination of what you said here:

we'd actually need to pass more information to the adapters, so we say the value will be embedded as part of a query`

And also possibly the type needs to be able to say something about its own behaviour. So that we don't limit ourselves to only catching the types that are native to Ecto.

@greg-rychlewski
Copy link
Member

greg-rychlewski commented Oct 1, 2023

I have one thought. You could do something like dumpers/3 where the third argument says whether this is for a query parameter or for query text.

The first level control is on the adapter to say if they want to treat the type differently. For example, if they want to keep UUID as a string. And if it's a custom type that the adapter has no clue about then there is an optional callback on the type to say how it should be treated so they have a second level of control.

Something that wouldn't be covered is if the adapters have no clue about the type and the type needs to be treated differently across adapters. Then I don't know what the best option would be. Maybe for their callback to raise and ask them to interpolate. This would probably be a very rare circumstance.

Also, calling it dumpers for this purpose feels a bit off to me. But the alternative is some brand new concept that would involve changing the planner, the type module and all the adapters that doesn't do much other than fall back to dumping for most cases. I also think this might open us up to surprises in the planner. This is where we catch the scenario:

defp prewalk(%Ecto.Query.Tagged{value: v, type: type} = tagged, kind, query, expr, acc, adapter) do
  if Ecto.Type.base?(type) do
    {tagged, acc}
  else
    {dump_param(kind, query, expr, v, type, adapter), acc}
  end
end

If we change dump_param to some new thing and someone finds another reason for tagging in the future there might be an unexpected surprise. So neither way seems perfect to me.

@greg-rychlewski
Copy link
Member

I forgot to mention, an issue with tagging UUID as :binary in Postgrex is that it eventually gets turned to bytea. And Postgres will say bytea can't be compared with uuid. So you will get a readable error instead of an encoding issue but the error would be unexpected since the user is passing in a valid text representation of a uuid.

@josevalim
Copy link
Member Author

This looks like it is the same underlying issue as the one reported here: elixir-ecto/ecto_sql#520

Ah, good call.

What if we change the Postgres layer for EctoSQL so it doesn't convert Ecto.UUID to binaries? 🤔

@josevalim
Copy link
Member Author

What if we change the Postgres layer for EctoSQL so it doesn't convert Ecto.UUID to binaries? 🤔

Yes, this does not work. We indeed need to know if we are encoding text/query or a parameter. The only option forward is to indeed pass this information somehow to adapters. Either by adding a new dumpers function or have an additional parameter in dumpers. Or, alternatively, raise if we are trying to dump a tagged value and it comes back as binary/uuid.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants