Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explain column-independent operations #3225

Merged
merged 13 commits into from
Dec 1, 2022
Merged

Explain column-independent operations #3225

merged 13 commits into from
Dec 1, 2022

Conversation

bkamins
Copy link
Member

@bkamins bkamins commented Nov 16, 2022

This PR adds additional explanation of context dependent expressions in operation specification syntax to make sure users know exactly how they work.

CC @yjunechoe @pdeffebach

@bkamins bkamins requested a review from nalimilan November 16, 2022 18:53
@bkamins bkamins added the doc label Nov 16, 2022
@bkamins bkamins added this to the patch milestone Nov 16, 2022
@bkamins
Copy link
Member Author

bkamins commented Nov 17, 2022

I additionally expanded a section on indexing and iterating a grouped data frame.

Copy link
Member

@nalimilan nalimilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I've made many small comments.

I wonder whether "context-dependent" is explicit enough. The operation specification syntax is already context-dependent as it doesn't work outside of combine/select/transform. And any function call can be thought as context-dependent as the result can vary a lot depending on what columns are present in the data, on whether a DataFrame or a GroupedDataFrame is passed... Not sure what a better term could be: maybe something which stresses that these look like function calls but are actually not? "pseudo-functions"? "dummy functions"? mock-functions"? Maybe there's already a term for this kind of pattern in other languages?

docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
src/abstractdataframe/selection.jl Outdated Show resolved Hide resolved
src/abstractdataframe/selection.jl Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
@bkamins
Copy link
Member Author

bkamins commented Nov 21, 2022

I wonder whether "context-dependent" is explicit enough.

This term is used in dplyr https://dplyr.tidyverse.org/reference/context.html.

I tried to think about a better name, but I really did not find a great proposal. I will ask on Slack for comments.

@bkamins
Copy link
Member Author

bkamins commented Nov 22, 2022

The alternative names proposed on Slack are:

  • column-independent operations
  • column-independent expressions
  • context-aware operations
  • non-column sources (or non-column operations)

@bkamins bkamins changed the title Explain context dependent expressions Explain column-independent operations Nov 23, 2022
@bkamins
Copy link
Member Author

bkamins commented Nov 23, 2022

The PR is updated to use "column-independent operations" as it seems the least controversial option.

@bkamins
Copy link
Member Author

bkamins commented Nov 26, 2022

@nalimilan - OK to merge this PR? Thank you!

docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
Comment on lines 638 to 642
julia> [nrow(sdf) for sdf in gd]
3-element Vector{Int64}:
50
50
50
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still relevant.

docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
Co-authored-by: Milan Bouchet-Valat <[email protected]>
@bkamins
Copy link
Member Author

bkamins commented Nov 29, 2022

Still relevant.

@nalimilan - what do you mean here? Could you please clarify?

(GitHub does not show me what you refer to). You wanted me to put together all examples with printing number of rows for all groups in one place and I did this.

docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
docs/src/man/split_apply_combine.md Outdated Show resolved Hide resolved
Co-authored-by: Milan Bouchet-Valat <[email protected]>
@bkamins bkamins merged commit 6fab523 into main Dec 1, 2022
@bkamins bkamins deleted the bk/man_update branch December 1, 2022 12:07
@bkamins
Copy link
Member Author

bkamins commented Dec 1, 2022

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants