Improve boolean support in query #515

Closed
3 tasks done
josevalim opened this issue Feb 21, 2023 · 6 comments

@josevalim
Member

josevalim commented Feb 21, 2023

The goal is to support this:

DF.mutate(foo: select(a == b, c, 0))

As well as multi-clause:

DF.mutate(foo: select do
  a == b -> c
  d < e -> f
  true -> 0
end)

We chose select do instead of cond because, unlike cond, which only evaluates the clauses that match, queries always evaluate all clauses in order to build the expression.
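To make the difference from cond concrete, here is a minimal sketch that uses plain lists as stand-ins for series. SelectSketch is a hypothetical module, not part of Explorer, and lowering the multi-clause form into nested select calls is an assumption about one plausible expansion, not a description of the actual macro:

```elixir
defmodule SelectSketch do
  # Element-wise choice over plain lists (stand-ins for series).
  # Both branch lists are fully built before this function is called,
  # which is why every clause gets evaluated, unlike cond.
  def select(predicate, on_true, on_false) do
    Enum.zip_with([predicate, on_true, on_false], fn [p, t, f] ->
      if p, do: t, else: f
    end)
  end
end

a = [1, 2, 3]
b = [1, 0, 0]
c = [10, 20, 30]
d = [5, 1, 9]
e = [0, 2, 3]
f = [7, 8, 9]
zeros = [0, 0, 0]

eq = Enum.zip_with(a, b, &(&1 == &2))
lt = Enum.zip_with(d, e, &(&1 < &2))

# a == b -> c; d < e -> f; true -> 0, written as nested selects
SelectSketch.select(eq, c, SelectSketch.select(lt, f, zeros))
# [10, 8, 0]
```

The inner select is computed for every row, including rows where the first clause already matched.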

In order to support this, we will need:

  1. To allow any value as the second and third arguments of select (the first argument will always be a list). In this case, it should be enough to call from_list([...]) on arguments that are not lists
  2. Provide and/2, or/2, and not/1
  3. Implement and document the select/1 macro defined above
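The first item can be sketched as follows, again with plain lists standing in for series. ArgSketch, wrap/1, and broadcast/2 are hypothetical names; in Explorer the wrapping step would be the from_list([...]) call mentioned above, and this sketch assumes a one-element series is then broadcast to the length of the predicate:

```elixir
defmodule ArgSketch do
  # Hypothetical helper: leave lists alone, wrap any other value
  # in a one-element list (the role from_list([...]) plays in the issue).
  def wrap(value) when is_list(value), do: value
  def wrap(value), do: [value]

  # Stretch a one-element list to `size`; longer lists must already match.
  def broadcast([single], size), do: List.duplicate(single, size)
  def broadcast(list, size) when length(list) == size, do: list

  def select(predicate, on_true, on_false) do
    size = length(predicate)
    t = on_true |> wrap() |> broadcast(size)
    f = on_false |> wrap() |> broadcast(size)

    Enum.zip_with([predicate, t, f], fn [p, x, y] ->
      if p, do: x, else: y
    end)
  end
end

ArgSketch.select([true, false, true], [10, 20, 30], 0)
# [10, 0, 30]
```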

The reason we decided to handle this at the query level is that Series.and(series, false) does not really have a purpose, so and/2 in Explorer.Series should remain strict about expecting series. We may want to lift this restriction in the future but, at first glance, it sounds reasonable.

@josevalim
Member Author

@sasikumar87 is it correct to say your PR does the third item in the list above? I also think the first item is already done (but not the second).

@sasikumar87
Contributor

@josevalim yes, third item in the list above.

@josevalim
Member Author

josevalim commented Sep 15, 2023

@sasikumar87 do you want to send a PR for the second item?

The idea is to introduce functions for and, or, and not. and/2 would look like this:

def left and right when Kernel.and(is_boolean(left), is_boolean(right)), do: Kernel.and(left, right)
def left and right, do: Explorer.Series.and(boolean!(left), boolean!(right))

defp boolean!(%Series{dtype: :boolean} = series) do
  series
end

defp boolean!(other) do
  raise ArgumentError, "boolean operators require either a boolean (true/false) or a boolean series, got: #{inspect(other)}"
end

It is untested but that's the general idea. :)

@sasikumar87
Contributor

@josevalim will do

@sasikumar87
Contributor

@josevalim to confirm: for and/2, we should also support one of them being a boolean and the other being a series, right?

@josevalim
Member Author

Ah yes, I forgot a clause in boolean! that accepts a boolean and converts it to a series!
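Putting the thread together, the dispatch with the extra boolean! clause might look like the sketch below. It is illustrative only and runs without Explorer: BoolSketch.Series is a hypothetical stand-in struct for Explorer.Series, series_and/2 stands in for Explorer.Series.and/2 and assumes one-element series are broadcast, bool_series/1 is a made-up constructor, and the function is named query_and/2 instead of and/2 so that the script does not have to shadow Kernel.and/2.

```elixir
defmodule BoolSketch do
  defmodule Series do
    # Hypothetical stand-in for Explorer.Series: a dtype plus a list of values.
    defstruct [:dtype, :data]
  end

  # Made-up constructor for the stand-in struct.
  def bool_series(list), do: %Series{dtype: :boolean, data: list}

  # Two plain booleans: regular boolean logic.
  def query_and(left, right) when is_boolean(left) and is_boolean(right),
    do: left and right

  # Otherwise coerce both sides to boolean series first.
  def query_and(left, right), do: series_and(boolean!(left), boolean!(right))

  defp boolean!(%Series{dtype: :boolean} = series), do: series

  # The clause missing from the original snippet: a boolean becomes a one-element series.
  defp boolean!(bool) when is_boolean(bool), do: bool_series([bool])

  defp boolean!(other) do
    raise ArgumentError,
          "boolean operators require either a boolean (true/false) or a boolean series, got: #{inspect(other)}"
  end

  # Element-wise AND; a one-element series is broadcast against the other side.
  defp series_and(%Series{data: [x]}, %Series{data: ys}),
    do: bool_series(Enum.map(ys, &(x and &1)))

  defp series_and(%Series{data: xs}, %Series{data: [y]}),
    do: bool_series(Enum.map(xs, &(&1 and y)))

  defp series_and(%Series{data: xs}, %Series{data: ys}),
    do: bool_series(Enum.zip_with(xs, ys, &(&1 and &2)))
end

BoolSketch.query_and(true, false)
# false

BoolSketch.query_and(BoolSketch.bool_series([true, false]), true).data
# [true, false]
```

Passing anything that is neither a boolean nor a boolean series, for example query_and(1, true), falls through to the last boolean! clause and raises ArgumentError.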
