Split off list constructor logic from pl.concat_list into pl.list #8510
Comments
Fully agree. Will add this to the milestones for the next breaking release. |
Hi @stinodego - I just read your comment regarding horizontal sum which made me think of this issue. After the concat_list/namespace rename
Would become
Your comment made me wonder if it would be possible to provide a syntax sugar version:
sugar:
no-sugar:
|
Quick thoughts on this: I like the idea of aligning the method names between structs and lists. What about keeping concat_list so we have
And then rename struct to concat_struct so that we have
I believe that the word "concat" makes it clearer that we are concatenating multiple rows together. Thoughts? |
I am also a bit on the fence about this. On the Rust side, |
Yes, I agree that we are missing a word (I don't know the English name for this, adjective? verb?) that says "what" we are going to do with the noun "list". We already have this in many places.
But the noun |
We have a set of functions that take one or more expressions and combine them to generate a new column of a certain datatype. We should rename these to reflect that they are essentially doing the same thing. I have given it a lot of thought and I think the
* Convenience wrapper around
I plan to do the following:
|
Hi @stinodego, would it make sense to add an 'as_array' method? Same logic as 'as_list', but with the result being an array. |
Yes, it's already up on my whiteboard 😄 |
I use With a seemingly small number of assumptions, this operation will always result in a fixed size list. It would be very convenient to be able to have the resulting concatenation be an array. |
That won't be possible, as
import polars as pl
df = pl.DataFrame({"a": [[1, 2], [3], [3, 4]], "b": [1, 2, 3]})
result = df.with_columns(pl.concat_list("a", "b").alias("concat"))
We will add an |
We have discussed this, and there is a distinction to be made between the functions listed. There are "constructor" functions:
Those are fine as-is. Perhaps a prefix like
Then there are the "concatenate" functions:
These names are clear and we don't want to change them. The real issue here is that
pl.select(pl.concat_list(1, 2, 3))
pl.select(pl.concat_list([1], [2], [3]))
The second one is clearly concatenating the inputs, while the first one is constructing a new list and taking the inputs as elements to that list. We should add a new |
Great! Let me try to implement the proposal above. |
This comment was marked as off-topic.
Now that the supertype issue has been fixed, I no longer think we really need to split up
This is perfectly usable and I don't think splitting up the function is going to really help anyone. It is technically a bit overloaded but that's more of a technicality, in my opinion. I added some doc examples to clarify this behavior. With that, this can be closed, in my opinion. I will consider re-opening this if I hear some strong arguments for another approach here. |
I want to argue that we want separate functions for
In ibis, we transpile ibis expressions into polars expressions. One of our rules is to turn an ibis.Array into a polars.List:
@translate.register(ops.Array)
def array_column(op, **kw):
    cols = [translate(col, **kw) for col in op.exprs]
    if op.dtype.is_array():
        cols = [c.implode() for c in cols]
    return pl.concat_list(cols)
In an effort very similar to this issue, I am trying to overhaul array construction in ibis-project/ibis#9473. The API that I am pushing for there is
I advocate for polars to create a |
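Why a separate constructor matters for nested inputs can be modeled on a single row in plain Python. This is only a sketch of the semantics under discussion, not Polars or ibis code. The names `make_list`, `concat_lists`, and `overloaded` are hypothetical and exist only for illustration.

```python
from typing import Any


def make_list(*elements: Any) -> list[Any]:
    """Hypothetical constructor: every input becomes exactly one element."""
    return list(elements)


def concat_lists(*lists: list[Any]) -> list[Any]:
    """Hypothetical concatenation: every input must already be a list."""
    out: list[Any] = []
    for item in lists:
        if not isinstance(item, list):
            raise TypeError("concat_lists expects list inputs")
        out.extend(item)
    return out


def overloaded(*inputs: Any) -> list[Any]:
    """Model of one function doing both jobs: wrap non-lists, then concatenate."""
    return concat_lists(*(x if isinstance(x, list) else [x] for x in inputs))


# With scalars, the constructor and the overloaded function agree.
print(make_list(1, 2, 3), overloaded(1, 2, 3))

# With list inputs they diverge: the constructor nests, the overloaded one flattens.
print(make_list([1], [2], [3]), overloaded([1], [2], [3]))
```

In this model, a caller who wants a list of lists from list inputs has to add an extra wrapping step before calling the overloaded function, which is the role `implode()` plays in the translation rule above.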
Just added a workaround for this in ibis: ibis-project/ibis#9484 |
Comment by @mcrumiller in the duplicated issue is relevant: Perhaps there is some merit to splitting the two after all. |
If you do decide to go this route, the following seems simple and obvious from a user perspective:
|
@mcrumiller, will any of these functions allow us to combine two (or more) |
I agree with @mcrumiller 's suggestions. |
@mcrumiller's API seems good to me. Reiterating that I think they should all accept |
@stinodego I just stumbled on this and saw that you said you'd make it a milestone, but I think that was before you had milestones set up in GitHub, so I made it a 2.0.0 milestone. |
I have added a draft PR (#19079) to discuss the design of a |
Problem description
Discussing #8503 sparked the thought - would it make sense to "rename" pl.concat_list to pl.list?
pl.list is currently marked for deprecation as part of the .arr to .list rename.
Similar to how there is pl.struct and the .struct namespace:
If pl.list were repurposed it would be: