Allow multicolumn transformations for AbstractDataFrame #2461
Thanks, that's impressive. I haven't looked at the tests in detail yet, feel free to point me at interesting cases that I may have missed.
Could you please clarify what |
It is a type. Types in Julia are also values, and we use this fact here.
The first is an instance of a type; the second is a type (
What do you mean by "pattern"? In general the transformation mini-language is DataFrames.jl specific. What is important is that
We do not dispatch on type (actually if you look at the implementation there is a problem with this - we have to dynamically check for
We could use
In summary: it is not a common pattern, but do you have a better proposal for what to use instead? The benefit of this approach is:
(and just to stress - we do not dispatch on This situation is kind-of similar to |
As an afterthought: we could use
and it would be a valid transformation specification (note that there is no need for parens for trailing |
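The point that types are also values, and that the check is done dynamically rather than by dispatch, can be illustrated in plain Julia. This is only a minimal sketch: `Marker` and `role` are hypothetical names standing in for the identifiers elided above, not DataFrames.jl definitions.

```julia
# `Marker` is a hypothetical stand-in, not a DataFrames.jl definition.
struct Marker
    cols::Vector{Symbol}
end

# Decide the role of `x` with a run-time check instead of method dispatch.
function role(x)
    if x === Marker          # the type itself, used as a value
        return :type
    elseif x isa Marker      # an instance constructed from the type
        return :instance
    else
        return :other
    end
end

# An instance on the left of the pair, the type itself on the right.
spec = Marker([:a, :b]) => Marker
roles = (role(first(spec)), role(last(spec)))
```

Because `Marker` is an ordinary value of type `DataType`, it can sit on either side of a `Pair` without any special syntax.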
Thank you for the clarification. I'm glad I now understand your thought process here. I think |
@nalimilan - what do you think? I dislike EDIT: sorry, actually |
I prefer |
Let us keep |
Sounds good. Perhaps the best mental model is for it to be a |
Co-authored-by: Milan Bouchet-Valat <[email protected]>
Is the following expected behavior?
I would have thought with the |
Additionally, should the following work?
No - this would be
What is
This is also expected - and follows your request to disallow In general |
Thanks, this is all very clear.
Yes, that is expected. This is really impressive work! I really appreciate it and the thought you've put into this. |
I think a |
Why allow returning matrices at all if we are deprecating the |
Only for backward compatibility reasons. Note that we will not disallow returning them. The only question is what happens with them and we have two options:
I was thinking about which behavior the user would prefer when returning a matrix, and I thought that the second is more natural. Would you prefer the first? In general - under current rules the only case when we throw an error is
Note that this is a different case from what we discuss with @nalimilan, as he has raised a case when |
The first option reminds me of |
So option 2 is what we currently have 😄. |
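For readers following the matrix discussion, here is a minimal sketch of a transformation that returns a matrix. It assumes a DataFrames.jl 1.x release, where the columns of a returned matrix are spread over explicitly listed target names. The data and the names `df`, `:a` and `:b` are illustrative only.

```julia
using DataFrames

df = DataFrame(x = [1, 2, 3])

# The transformation returns a 3×2 matrix; its two columns are assigned
# to the two target names given on the right-hand side.
out = select(df, :x => (x -> hcat(x, 2 .* x)) => [:a, :b])
```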
I have updated the documentation (so essentially, once we accept this, it should be ready to merge). @nalimilan - as usual - feel free to rewrite the docstrings 😄 (and sorry for mistakes, as for sure there will be some). |
Thank you for all the comments. If there are no more issues with this proposal I will merge the PR tomorrow and follow up with a small |
Thank you! |
This PR partially addresses #2410 and #2457.
It covers `select` etc. for `AbstractDataFrame`. If we are OK with the functionality I will update the documentation.
TODO:
- `select` etc. for `GroupedDataFrame` (this will be a separate PR to keep PRs more atomic)
- `ByRow` with no columns passed to `filter` (also a separate PR)

CC @nalimilan @pdeffebach @matthieugomez - this is a rather complex PR so independent testing (especially for corner cases) would be welcome (if you would have suggestions for types of tests to add please comment and I will add them).