Advanced transformation examples #3433
@nalimilan - what do you think (CI errors are unrelated)?
The reason why this is needed is that instead `combine` iterates the contents of the value returned
by the operation specification function and tries to expand it, which in our case is a tuple of numbers,
so one gets an error:
Without `Ref`, `combine` iterates the contents of the value returned by the operation specification function, which in our case is a tuple of numbers, and tries to expand it, so one gets an error:
Can you remind me why this isn't allowed? AFAICT it would be possible to destructure the tuple and use its values to fill the columns? BTW, is "expand" the right word here?
The reason is related to the issue we recently discussed with the Tables.jl conflict against Julia 1.11. DataFrames.jl does not recognize `Tuple` as a valid input, so it sends it to the Tables.jl `columntable` function. So essentially, the error occurs because we fall back to Tables.jl in cases that we do not explicitly handle as special cases (like vectors, matrices, `DataFrame`s etc.).
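To make the fallback concrete, here is a minimal sketch that calls `Tables.columntable` directly on the two kinds of tuples discussed in this thread. It assumes Tables.jl is available in the active environment, and it only illustrates the Tables.jl side of the behavior, not the exact code path inside DataFrames.jl.

```julia
using Tables  # assumption: Tables.jl is installed in the active environment

# A tuple of NamedTuples is a valid iterator of rows, so the fallback succeeds
# and each element contributes one entry to column `a`.
cols = Tables.columntable(((a=1,), (a=2,)))

# A tuple of plain numbers iterates `Int` values, which are not rows,
# so the same call throws.
failed = try
    Tables.columntable((1, 2))
    false
catch
    true
end
```

Here `cols` is a `NamedTuple` of column vectors, while the second call fails because the iterated values do not satisfy the row interface.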
Your suggestion is applied in the commit I pushed (I just had to re-word it a bit).
OK, thanks. But then, when `combine` iterates, it expects each value to represent a column, not a row, and my suggestion is incorrect, right?
I have just written something more precise, as I think your suggestion was not precise enough. What happens is, I think, best seen here:
julia> df = DataFrame(x=1:2)
2×1 DataFrame
 Row │ x
     │ Int64
─────┼───────
   1 │     1
   2 │     2

julia> combine(df, :x => (x -> ((a=x,), (a=2x,), (a=3x,))) => AsTable)
3×1 DataFrame
 Row │ a
     │ Array…
─────┼────────
   1 │ [1, 2]
   2 │ [2, 4]
   3 │ [3, 6]
The returned tuple `((a=x,), (a=2x,), (a=3x,))` has three elements so it produces three rows. Things work because each element is a `NamedTuple`, which provides the names that are used to generate column names with `AsTable`.
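For contrast, a small sketch of the two cases side by side may help. The column names `lo`, `hi`, `stat` and `value` are made up for illustration; the first call returns a single `NamedTuple`, which DataFrames.jl handles directly as one row, and the second returns a tuple of `NamedTuple`s, which is iterated so that each element becomes a row.

```julia
using DataFrames

df = DataFrame(x=1:2)

# A single NamedTuple is a special case handled by DataFrames.jl itself:
# it produces one row with one column per field.
one_row = combine(df, :x => (x -> (lo=minimum(x), hi=maximum(x))) => AsTable)

# A tuple of NamedTuples is not a special case, so it is iterated:
# each element is treated as a row and its field names become column names.
two_rows = combine(df,
    :x => (x -> ((stat="min", value=minimum(x)),
                 (stat="max", value=maximum(x)))) => AsTable)
```

So the number of elements in the returned tuple determines the number of rows, and the fields of each element determine the columns.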
Is it clearer now?
@nalimilan - I applied all your suggestions and expanded the examples even more to be more explicit.
Co-authored-by: Milan Bouchet-Valat <[email protected]>
@nalimilan - can you please have a look at it? If it is OK we could merge it. Thank you!
Co-authored-by: Milan Bouchet-Valat <[email protected]>
Thank you!
Fixes #3430