Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BREAKING] Make DataFrameColumns stop being an AbstractVector #2291

Merged
merged 13 commits into from
Jun 24, 2020

Conversation

bkamins
Copy link
Member

@bkamins bkamins commented Jun 16, 2020

Fixes #2229
I have added functionalities for DataFrameColumns that I find useful, but please comment if you would like to see more.

@bkamins bkamins changed the title Make DataFrameColumns stop being an AbstractVector [BREAKING] Make DataFrameColumns stop being an AbstractVector Jun 16, 2020
@bkamins bkamins added breaking The proposed change is breaking. feature labels Jun 16, 2020
@bkamins bkamins added this to the 1.0 milestone Jun 16, 2020
@bkamins
Copy link
Member Author

bkamins commented Jun 16, 2020

@nalimilan - it should be good to have a look at.

@bkamins
Copy link
Member Author

bkamins commented Jun 16, 2020

As usual some corner cases of the changes popped up.
Good change:
DataFrame(eachcol(df)) is now no-op (except copying of columns), previously it dropped column names.
Bad change (but maybe not that bad): eachcol(df)[1,1] now errors as it would produce duplicate column name.

test/iteration.jl Outdated Show resolved Hide resolved
Copy link
Member

@quinnj quinnj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; just the use of [begin] causing 1.0 test failures

@bkamins
Copy link
Member Author

bkamins commented Jun 16, 2020

Thank you for having a look at it!

@rasmushenningsson
Copy link

This looks great!

src/abstractdataframe/iteration.jl Outdated Show resolved Hide resolved
src/abstractdataframe/iteration.jl Outdated Show resolved Hide resolved
src/abstractdataframe/iteration.jl Show resolved Hide resolved
docs/src/lib/types.md Outdated Show resolved Hide resolved
src/abstractdataframe/iteration.jl Outdated Show resolved Hide resolved
src/abstractdataframe/iteration.jl Outdated Show resolved Hide resolved
src/abstractdataframe/iteration.jl Show resolved Hide resolved
src/subdataframe/subdataframe.jl Show resolved Hide resolved
src/abstractdataframe/iteration.jl Show resolved Hide resolved
test/select.jl Show resolved Hide resolved
Co-authored-by: Milan Bouchet-Valat <[email protected]>
@bkamins
Copy link
Member Author

bkamins commented Jun 21, 2020

@nalimilan - your questions were exactly the things I had to check against Base when implementing it. These interfaces are really non-obvious.

src/abstractdataframe/iteration.jl Outdated Show resolved Hide resolved
src/abstractdataframe/iteration.jl Outdated Show resolved Hide resolved
src/abstractdataframe/iteration.jl Outdated Show resolved Hide resolved
src/abstractdataframe/iteration.jl Outdated Show resolved Hide resolved
src/abstractdataframe/iteration.jl Outdated Show resolved Hide resolved
src/abstractdataframe/iteration.jl Outdated Show resolved Hide resolved
test/select.jl Show resolved Hide resolved
@bkamins
Copy link
Member Author

bkamins commented Jun 24, 2020

So the major point of discussion is this case:

julia> x = Any[[1,2,3], [1,2,3]]
2-element Array{Any,1}:
 [1, 2, 3]
 [1, 2, 3]

julia> collect(x)
2-element Array{Any,1}:
 [1, 2, 3]
 [1, 2, 3]

As you can see - I have implemented methods to follow what happens in Base with vectors. We can change it to follow your proposal though. What do you think?

Co-authored-by: Milan Bouchet-Valat <[email protected]>
@nalimilan
Copy link
Member

OK, makes sense. Then the only remaining issue to settle is whether to create a SubDataFrame parent with multi-indexing.

@bkamins
Copy link
Member Author

bkamins commented Jun 24, 2020

You convinced me to return a DataFrame :). I am working on it now and on more flexible firstindex per your comment

src/abstractdataframe/iteration.jl Outdated Show resolved Hide resolved
src/abstractdataframe/iteration.jl Outdated Show resolved Hide resolved
src/abstractdataframe/iteration.jl Outdated Show resolved Hide resolved
Co-authored-by: Milan Bouchet-Valat <[email protected]>
@bkamins bkamins merged commit 0412291 into JuliaData:master Jun 24, 2020
@bkamins bkamins deleted the eachcol_not_vector branch June 24, 2020 20:38
@bkamins
Copy link
Member Author

bkamins commented Jun 24, 2020

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking The proposed change is breaking. feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

findfirst/findlast/nextind/prevind not working for eachcol in v0.21.0
4 participants