add support for map as expression #855

Merged
merged 1 commit into from
Feb 14, 2024

Conversation

Member

@lkarthee lkarthee commented Feb 11, 2024

Add support for maps as expressions in mutate. A map is converted to a struct column:

iex(2)> df = DF.new(%{a: [1, nil, 3], b: ["a", "b", nil]})
#Explorer.DataFrame<
  Polars[3 x 2]
  a s64 [1, nil, 3]
  b string ["a", "b", nil]
>
iex(3)> DF.mutate(df, c: %{a: a, b: b, lit: 1, null: is_nil(a)})
#Explorer.DataFrame<
  Polars[3 x 3]
  a s64 [1, nil, 3]
  b string ["a", "b", nil]
  c struct[4] [
    %{"a" => 1, "b" => "a", "lit" => 1, "null" => false},
    %{"a" => nil, "b" => "b", "lit" => 1, ...},
    %{"a" => 3, "b" => nil, ...}
  ]
>

Note: The original text below is hidden because it is stale, as per #855 (comment).

Original text

Add `struct` expression.
df = DF.new(%{a: [1, 2, 3], b: ["a", "b", "c"]})
#Explorer.DataFrame<
  Polars[3 x 2]
  a s64 [1, 2, 3]
  b string ["a", "b", "c"]
>
DF.mutate(df, c: struct([a: a, b: b]))
#Explorer.DataFrame<
  Polars[3 x 3]
  a s64 [1, 2, 3]
  b string ["a", "b", "c"]
  c struct[2] [
    %{"a" => 1, "b" => "a"},
    %{"a" => 2, "b" => "b"},
    %{"a" => 3, "b" => "c"}
  ]
>

Explorer.Series.struct(a: df["a"], b: df["b"])
#Explorer.Series<
  Polars[3]
  struct[2] [
    %{"a" => 1, "b" => "a"},
    %{"a" => 2, "b" => "b"},
    %{"a" => 3, "b" => "c"}
  ]
>

@lkarthee lkarthee force-pushed the struct_expr branch 2 times, most recently from e4f1814 to 6b4d645 Compare February 11, 2024 07:58
lib/explorer/series.ex (outdated, resolved)
@josevalim
Member

Could we implement this without adding a struct function? Could we automatically convert maps to structs instead?

@lkarthee lkarthee changed the title from "add struct expression" to "add support for map as expression" Feb 11, 2024
@josevalim josevalim requested a review from philss February 11, 2024 15:56
@lkarthee
Member Author

lkarthee commented Feb 12, 2024

@josevalim currently %{} works in mutate as a top-level expression, but fails when it is passed as input to any series function.

DF.mutate(df, c: %{a: a, b: b}) # works
DF.mutate(df, c: %{a: is_nil(a), b: is_nil(b)}) # works
DF.mutate(df, c: is_nil(%{a: a, b: b})) # fails


** (ArgumentError) expected a series as argument for is_nil, got: %{a: #Explorer.Series<
    LazySeries[???]
    s64 (column("a"))
  >, b: #Explorer.Series<
    LazySeries[???]
    s64 (column("b"))
  >}
    (explorer 0.9.0-dev) lib/explorer/series.ex:6127: Explorer.Series.apply_series/3

How should we tackle this?
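
One possible workaround, sketched here rather than taken from the PR: materialize the struct column with a top-level map first, then apply the series function to that column in a second mutate. The snippet assumes an Explorer release that includes this change (the `~> 0.9` requirement is an assumption) and that `is_nil` accepts a struct column. The column names `c` and `d` are only illustrative.

```elixir
# Assumption: any Explorer release that includes map-as-expression support.
Mix.install([{:explorer, "~> 0.9"}])

require Explorer.DataFrame, as: DF

df = DF.new(%{a: [1, nil, 3], b: ["a", "b", nil]})

result =
  df
  # Step 1: the map is a top-level expression, so it becomes a struct column.
  |> DF.mutate(c: %{a: a, b: b})
  # Step 2: `c` is now an ordinary column and can be given to a series function.
  |> DF.mutate(d: is_nil(c))

IO.inspect(DF.names(result))
```

This keeps the map at the top level of the first mutate, which is the case reported above as working, and never hands a map directly to a series function.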

@josevalim
Member

I am on my phone, but somewhere in lazy series we handle all literals, so we should probably add map handling there. The code will probably be pretty similar to what you added to the data frame, so we should find a way of sharing it as well.

@josevalim
Member

I took a quick look and I was wrong. We only allow casting in specific operations in series.ex. For example, we could begin supporting maps in the comparison operators, if comparison is supported between structs. Outside of that, we most likely won't support passing maps. There may be an argument that we should allow literals (such as integers and maps) on is_nil, but that's probably not the case today.

@lkarthee
Member Author

lkarthee commented Feb 12, 2024

There are some convenient use cases for structs: https://docs.pola.rs/user-guide/expressions/structs/#practical-use-cases-of-struct-columns

Should we support passing a struct to series functions? That would add value for mutating, filtering without mutating, etc.

@josevalim
Member

The question is: which operations should we support it on? For example, it doesn't make sense to support maps on add or multiply. So I'd go operation by operation, at least initially.
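
To illustrate what an operation-by-operation approach could look like, here is a minimal, dependency-free sketch. It is not Explorer's code: the module name `MapArgSketch`, the function `check_arg/2`, and the allow-list contents are all hypothetical. The only idea taken from the thread is that comparisons might accept maps while arithmetic would not.

```elixir
defmodule MapArgSketch do
  # Hypothetical allow-list: the thread only names comparisons as candidates.
  @ops_accepting_maps [:equal, :not_equal]

  # Plain maps (not structs) are accepted only by allow-listed operations.
  def check_arg(op, arg) when is_map(arg) and not is_struct(arg) do
    if op in @ops_accepting_maps do
      :ok
    else
      {:error, "#{op} does not accept a map literal"}
    end
  end

  # Anything that is not a plain map is left to the operation's usual checks.
  def check_arg(_op, _arg), do: :ok
end

IO.inspect(MapArgSketch.check_arg(:equal, %{a: 1}))
IO.inspect(MapArgSketch.check_arg(:add, %{a: 1}))
IO.inspect(MapArgSketch.check_arg(:add, 1))
```

Starting from an explicit allow-list means each new operation has to be opted in deliberately, which matches the cautious rollout suggested above.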

@lkarthee
Member Author

Ok, let me explore more on this question and come back later.

I think this PR is complete for now.

lib/explorer/data_frame.ex (resolved)
lib/explorer/data_frame.ex (outdated, resolved)
@lkarthee lkarthee force-pushed the struct_expr branch 2 times, most recently from 3c20a80 to 49847b9 Compare February 14, 2024 09:21
@josevalim josevalim merged commit 2dc0062 into elixir-explorer:main Feb 14, 2024
4 checks passed
@josevalim
Member

💚 💙 💜 💛 ❤️
