WIP: initial work to add support for list dtypes #401
This is a work in progress that adds the basis for "list" dtypes. It also adds the `to_list/1` lazy operation, which takes a series and creates a list series with the elements of that series. This is useful for aggregations, when you want to capture the elements of a given group. E.g.:

```elixir
df = Explorer.DataFrame.new(a: [1, 1, 2, 2], b: [9, 8, 7, 6])
grouped = Explorer.DataFrame.group_by(df, :a)

Explorer.DataFrame.summarise_with(grouped, fn df ->
  [b_merged: Explorer.Series.to_list(df["b"])]
end)
```

The result is going to be something like this:

    #Explorer.DataFrame<
      Polars[2 x 2]
      a integer [1, 2]
      b_merged list(integer) [[ 9, 8 ], [ 7, 6 ]]
    >

Related to:

- elixir-explorer#296
- elixir-explorer#400
I haven't thought much about how list series should be inspected, so suggestions are welcome 😃
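For readers who want to see the intended aggregation semantics without Explorer, here is a minimal sketch in plain Elixir. It only mirrors the grouping described above on ordinary lists; it does not use or represent the `to_list/1` implementation in this PR, and the variable names are illustrative.

```elixir
# Plain-Elixir analogue of "collect the values of b for each group of a".
# This uses only the standard library, not Explorer.
a = [1, 1, 2, 2]
b = [9, 8, 7, 6]

merged =
  a
  |> Enum.zip(b)
  |> Enum.group_by(fn {key, _value} -> key end, fn {_key, value} -> value end)

# `charlists: :as_lists` keeps small integers from being printed as a charlist.
IO.inspect(merged, charlists: :as_lists)
```

Each key of `a` ends up paired with the list of `b` values from its group, in their original order, which is the shape the `b_merged` column is meant to hold.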
    @@ -27,7 +27,7 @@ defmodule Explorer.Series do

       @valid_dtypes Explorer.Shared.dtypes()

    -  @type dtype :: :integer | :float | :boolean | :string | :date | :datetime
    +  @type dtype :: :integer | :float | :boolean | :string | :date | :datetime | {:list, :integer}
We probably want to make it recursive further on:
    -  @type dtype :: :integer | :float | :boolean | :string | :date | :datetime | {:list, :integer}
    +  @type dtype :: :integer | :float | :boolean | :string | :date | :datetime | {:list, dtype}
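To illustrate what a recursive `dtype` would allow, here is a small sketch. `DtypeSketch`, `valid?/1`, and `format/1` are hypothetical names made up for this example; they are not part of Explorer and are not how this PR validates or inspects dtypes.

```elixir
defmodule DtypeSketch do
  # Hypothetical module for illustration only; it is not part of Explorer.
  @primitive [:integer, :float, :boolean, :string, :date, :datetime]

  # Recursive type: a list may contain any dtype, including another list.
  @type dtype ::
          :integer | :float | :boolean | :string | :date | :datetime | {:list, dtype}

  # A list dtype is valid when its inner dtype is valid.
  @spec valid?(term()) :: boolean()
  def valid?({:list, inner}), do: valid?(inner)
  def valid?(dtype), do: dtype in @primitive

  # Renders a dtype in the "list(integer)" style used in the inspect output.
  @spec format(dtype()) :: String.t()
  def format({:list, inner}), do: "list(" <> format(inner) <> ")"
  def format(dtype) when is_atom(dtype), do: Atom.to_string(dtype)
end

IO.puts(DtypeSketch.format({:list, {:list, :integer}}))
```

With the non-recursive `{:list, :integer}` type, a nested dtype such as `{:list, {:list, :integer}}` would fall outside the typespec; the recursive form covers any nesting depth.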
I'm closing this for now in order to focus on the next release. But anyone who wants to give it a try, feel free to cherry-pick this work :)
Hey @philss, how much work do you think it would take to make it work in 0.7?
@lambdaofgod I think the representation of lists is the easiest part, and that is what I did in this PR. The real problem is supporting operations on lists. We have been postponing this since last year because of the unknown complexity of maintaining list dtypes. But I think we are close to looking into this again.
Short answer is: we don't know yet 😅