-
Notifications
You must be signed in to change notification settings - Fork 370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DataFrame constructors #1599
Comments
Makes sense, but couldn't the Tables.jl default constructor handle this automatically? Cc: @quinnj |
We could add appropriate methods following Tables.jl interface to handle them. The only place where this is not possible is |
Given a discussion in #1646 (comment) maybe we should actually have the following list of possible additional constructors of
@nalimilan - this is what I have on a long-list of possible new constructors. None of them is crucial (so I am OK with not adding any of them - then I would close this issue), but please comment on all of them so that we can stabilize the API here. |
1 and 2 sound OK. Though Tables.jl should ideally support a collection of I'm more hesitant about 3 since that creates yet another kind of named collection, but why not. |
Regardind 1. This is something that would be left to decide if we want to go on with this (given the comments below). The problem with this I see now is that Tables.jl by default assumes that iterable is accessed row-wise, not column wise, and in general it is impossible to tell without consuming it if it contains only vectors. So I am not sure and maybe we should drop this and stay only with vector of vectors that we have now? Regarding 2. Actually we have it (for free) via Tables.jl. So this is a non-issue Regarding 3. Actually now I have noticed we have a problem here:
This behavior is inherited from Queryverse. So I would drop it. |
By:
I mean that I would leave it as is for now. |
I think that is a Tables.jl thing, not Queryverse.jl/TableTraits.jl. TableTraits.jl only considers iterators of named tuples to be tables, so it wouldn't treat an array of |
@davidanthoff - thank you for the explanation. Summing up this discussion + #1646 (comment): the only thing I will add is a constructor taking a tuple of vectors to create a data frame (this currently throws an error so it will be non-breaking). |
Case 3 is indeed annoying if we wanted to use it for column-wise operations later. Indeed it's very similar to the equivalent varargs constructor, which is column-wise. Maybe better support it to avoid the inconsistency (or at least have it throw an error). Basically we just need to decide on a list of exceptions to the fallback Tables.jl constructor, which treats everything as row-wise. Are there any other cases which are not covered here? |
Regarding Case 1 I have opened #1717. Regarding Case 3: This would be breaking (that is why I hesitated), but for sure what we have now makes no sense, and we should handle |
Actually rule 3 kicks in in a natural operation:
while
|
I am closing this as we have all I have requested now via Tables.jl:
@quinnj (if I am missing something here please reopen otherwise please comment if it is working as expected - thank you!) |
I am planning a small cleanup of
DataFrame
constructors and one significant change I am considering is to add a constructor forDataFrame(::AbstractVector{Pair{Symbol, T}};makeunique) where T<:AbstractVector
orDataFrame(::AbstractVector{Pair{Symbol}};makeunique)
(I am not 100% sure which would be better - in the second we would cast the second element of pair toAbstractVector
inside the constructor).This change would complement the existing constructors:
DataFrame(pairs::Pair{Symbol,<:Any}...; makeunique)
DataFrame(; kwargs...)
(actually this constructor would be made more efficient as we would avoid splatting inside it)and would work nicely with the fact that now
eachcol
returns such anAbstractVector
.Similarly - do we thing that a constructor accepting
AbstractVector{<:DataFrameRow}
would be good? (this would complementeachrow
and is similar to anAbstractVector
ofNamedTuple
s)CC @nalimilan
The text was updated successfully, but these errors were encountered: