Column manipulation API cleanup #239
Comments
+1 for a cleanup common to JuliaDB and DataFrames (as much as possible at least). Regarding |
+💯 for We might want to do away with any kind of drop column operation and let people rely on
That's something that didn't exist when |
So as a first deprecation PR I could go for:
I hadn't thought about Question: for EDIT: actually Then we could have a separate PR with possible renamings of |
I have a couple of API doubts:
Wow that was fast! |
Discussing the DataFrames API over at JuliaData/DataFrames.jl#1695 (comment), I got the impression that our column manipulation developed organically over time and may need a little bit of clean-up.
Specifically:
Problem: `pushcol` and `setcol` seem largely overlapping (if the name is new in `setcol`, the column gets added). The methods `setcol(table, name, col)` and `setcol(table, name => col)` are redundant, and the multi-argument method `setcol(table, pair1, pair2, pair3)` can be a bit confusing combined with `setcol(table, name, col)`. The name is a bit puzzling, as `setcol` is singular but can be applied to many columns: the preference in Julia in this case seems to be to go with the plural (say `sum(v, dims = i)`). Also, it's inconsistent that the getter / setter for columns are `column` / `setcol`.

Proposed solution: Deprecate both in favor of `mutate(table, changes::Pair...)`. This is consistent with the `replace(v, old, new)` to `replace(v, old => new)` deprecation in Julia Base and with the intended DataFrames `mutate!` syntax. The corresponding JuliaDBMeta macro, called `@transform`, should probably become `@mutate`. Related to this, we could also consider adding a `mergecols` to merge columns of two tables.

Problem:
`popcol` is a poor name for various reasons. Unlike `pop!`, it does not return the item it takes out of the collection; the singular is odd, as it allows removing many columns; and, unlike `pop!`, one can specify which column to remove.

Proposed solution: rename to `deletecols` (consistent with DataFrames: JuliaData/DataFrames.jl#1772) and stop defaulting to the last column. `deletecols(t, ncols(t))` is not that bad, and it's not clear to me why it should be more common to delete the last column instead of any other. If we want special syntax for that, we may add a special selector `Last()` (along the lines of `Keys()`), but I don't think that's necessary.

`renamecol`
also has similar issues as above, and I think the correct signature should be `renamecols(table, changes::Pair...)` (DataFrames uses `rename`, but I think it's nice to be consistent and use `...cols` for everything).

With
`insertcol` and friends, I would also rename to `insertcols` and friends (similar to `DataFrames.insertcols!`) and again only support the `name => col` interface (like DataFrames does). For example, `insertcols(table, ind, name => v)` or `insertcolsbefore(table, col, name => v)`.

cc: @bkamins and @nalimilan
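To make the proposed signatures concrete, here is a minimal sketch of how `mutate`, `deletecols`, `renamecols` and `insertcols` could behave under the `name => col` interface. It uses a plain `NamedTuple` of vectors as a toy stand-in for a table, which is an assumption made only for illustration: it is not how JuliaDB or IndexedTables store data, and the function bodies below are hypothetical. Only the names and argument shapes come from the proposal above.

```julia
# Toy stand-in for a table: a NamedTuple of equal-length column vectors.
# Hypothetical implementations; only the signatures follow the proposal.

# Add or replace columns, one `name => column` pair per change.
# Existing names keep their position, new names are appended.
mutate(t::NamedTuple, changes::Pair{Symbol}...) = merge(t, (; changes...))

# Remove any number of columns by name (no "last column" default).
function deletecols(t::NamedTuple, names::Symbol...)
    kept = filter(n -> !(n in names), keys(t))
    return NamedTuple{kept}(t)
end

# Rename columns, one `old => new` pair per change.
function renamecols(t::NamedTuple, changes::Pair{Symbol,Symbol}...)
    renames = Dict{Symbol,Symbol}(changes...)
    newnames = map(n -> get(renames, n, n), keys(t))
    return NamedTuple{newnames}(values(t))
end

# Insert a single `name => column` pair at position `ind`.
function insertcols(t::NamedTuple, ind::Integer, change::Pair{Symbol})
    names, cols = keys(t), values(t)
    newnames = (names[1:ind-1]..., change.first, names[ind:end]...)
    newcols = (cols[1:ind-1]..., change.second, cols[ind:end]...)
    return NamedTuple{newnames}(newcols)
end

t  = (x = [1, 2, 3], y = [4.0, 5.0, 6.0])
t2 = mutate(t, :y => [0.0, 0.0, 0.0], :z => t.x .* 2)  # columns: x, y, z
t3 = deletecols(t2, :y)                                # columns: x, z
t4 = renamecols(t3, :x => :a)                          # columns: a, z
t5 = insertcols(t4, 1, :id => ["a", "b", "c"])         # columns: id, a, z
```

With pairs everywhere, a single method covers both the "add" and the "replace" case, which is the overlap between `pushcol` and `setcol` described above. None of these functions mutate their input: `t` still has only the columns `x` and `y` afterwards.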