Deprecate head and tail in favor of first and last #1607
Now that we have decided that data frames are collections of rows (#1514), there is no point in defining separate `head` and `tail` functions, which are a legacy from the R API. Do not define one-argument methods defaulting to `n=6` since other `first` and `last` methods in Base default to `n=1`, which should return a single row (as a `NamedTuple` or `DataFrameRow`).
There are two points to discuss: Issue 1. Do we really want to get rid of Issue 2. Actually if we want to do this change we should also define EDIT especially as in Base |
Also note that in In stata when you first get a dataset you write But removing |
Good catch about
That's not generally an argument we have considered as legitimate on its own, as consistency with other languages would make Julia inconsistent with itself. When several options are mostly equivalent, taking the one which is consistent with other languages is a good idea, but when something already exists in Julia duplicating features just to help users of language X makes the API bigger, and what happens when somebody coming from language Y wants to add another name for that function? |
agreed |
I left some comments that I would prefer to have resolved before accepting. Also, when you open an issue/PR against Base regarding `first`/`last` with an `n` parameter, could you please add a reference here so I can track it?
**Arguments**
Get the first row of `df` as a `DataFrameRow`.
"""
Base.first(df::AbstractDataFrame) = df[1, :]
This will throw an error on a 0-row data frame. Maybe a better error message would be in order. Note that `first(df, 1)` will work on a 0-row data frame. Same with `last`.
I've followed what happens with vectors. I think that makes sense. The error message isn't too bad, it indicates you tried to access row 1 in a 0-row data frame.
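The vector behaviour being mirrored here can be checked with Base alone. This is a minimal sketch, assuming Julia 1.6 or later (where a two-argument `first(itr, n)` exists in Base); the variable names are illustrative only.

```julia
# Empty vector standing in for a 0-row data frame (illustrative only).
v = Int[]

# One-argument `first` has no element to return, so it throws.
threw = try
    first(v)
    false
catch
    true
end

# Two-argument `first` takes at most `n` elements, so it returns an empty vector.
taken = first(v, 1)

println("first(v) threw: ", threw)
println("length(first(v, 1)): ", length(taken))
```

The same asymmetry is what the review comment points out for data frames: asking for "the first row" of nothing is an error, while asking for "up to one row" is not.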
**Result**
Get a data frame with the `n` first rows of `df`.
"""
Base.first(df::AbstractDataFrame, n::Integer) = df[1:min(n,nrow(df)), :]
Maybe it would be better to be a bit breaking and return a `SubDataFrame`? This would be consistent with `first(df)`, which also returns a view now. I do not have a very strong preference here over `DataFrame`, so @nalimilan - please decide.
Good question. `first(df)` returns a view only because `df[1, :]` does, so I'd say it's consistent to also use `df[1:i, :]` here, and therefore return a copy.
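The view-versus-copy distinction can be seen directly from indexing. This is a minimal sketch, assuming a DataFrames.jl version where `df[1, :]` returns a `DataFrameRow` view and `df[1:2, :]` returns a new `DataFrame`, as described in this thread; the column name `a` is made up for illustration.

```julia
using DataFrames

df = DataFrame(a = [1, 2, 3])

row = df[1, :]      # single-row indexing: a view into df
sub = df[1:2, :]    # range indexing: an independent copy

df.a[1] = 100       # mutate the parent after taking both

println(row.a)      # reflects the change made to the parent
println(sub.a[1])   # still holds the value from before the change
```

Under that assumption, defining `first(df)` and `first(df, n)` through plain indexing simply inherits whatever indexing already does, rather than introducing a separate rule.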
Yeah, that's a good point. It's important not to keep too many legacy APIs for sure. It's also worth noting that I've never found |
I think we can merge this |