Mention DataFrameMacros.jl in the docs #3195

Merged
merged 7 commits into from
Oct 20, 2022
Changes from 1 commit
2 changes: 2 additions & 0 deletions docs/Project.toml
@@ -1,6 +1,8 @@
[deps]
CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
Chain = "8be319e6-bccf-4806-a6f7-6fae938471bc"
DataFrameMacros = "75880514-38bc-4a95-a458-c2aea5a3a702"
DataFramesMeta = "1313f7d8-7da2-5740-9ea0-a2ca25f37964"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
Missings = "e1d29d7a-bbdc-5cf2-9ac0-f12de2c33e28"
5 changes: 4 additions & 1 deletion docs/src/index.md
@@ -120,7 +120,10 @@ integrated they are with DataFrames.jl.
DataFrames.jl, other tabular data libraries (more on those below), and even
non-tabular data. Provides many convenience functions analogous to those in
dplyr in R or [LINQ](https://en.wikipedia.org/wiki/Language_Integrated_Query).
- You can find more on both of these packages in the
- [DataFrameMacros.jl](https://github.com/jkrumbiegel/DataFrameMacros.jl):
Provides macro versions of the common DataFrames.jl functions, similar to DataFramesMeta.jl,
with convenient syntax for the manipulation of multiple columns at once.
- You can find more information on these packages in the
[Data manipulation frameworks](@ref) section of this manual.
- **And More!**
- [Graphs.jl](https://github.com/JuliaGraphs/Graphs.jl): A pure-Julia,
82 changes: 80 additions & 2 deletions docs/src/man/querying_frameworks.md
@@ -1,7 +1,7 @@
# Data manipulation frameworks

Two popular frameworks provide convenience methods to manipulate `DataFrame`s:
DataFramesMeta.jl and Query.jl. They implement a functionality similar to
Three frameworks provide convenience methods to manipulate `DataFrame`s:
DataFramesMeta.jl, Query.jl and DataFrameMacros.jl. They implement functionality similar to
[dplyr](https://dplyr.tidyverse.org/) or
[LINQ](https://en.wikipedia.org/wiki/Language_Integrated_Query).

@@ -247,3 +247,81 @@ These examples only scratch the surface of what one can do with
referred to the [Query.jl
documentation](http://www.queryverse.org/Query.jl/stable/) for more
information.

## DataFrameMacros.jl

[DataFrameMacros.jl](https://github.com/jkrumbiegel/DataFrameMacros.jl) is
an alternative to `DataFramesMeta.jl` with an additional focus on convenient
solutions for the transformation of multiple columns at once.
The instructions below are for version 0.3 of DataFrameMacros.jl.

First, install the DataFrameMacros.jl package:

```julia
using Pkg
Pkg.add("DataFrameMacros")
```

In DataFrameMacros.jl, all macros except `@combine` operate row-wise by default.
There is also a `@groupby` macro that works like `@transform` followed by `groupby`,
so you can group by newly created columns without writing them out twice.

In the example below, you can also see some of DataFrameMacros.jl's multi-column
features, where `mean` is applied to both age columns at once by selecting
them with the `r"age"` regex. The new column names are then derived using the
`"{}"` shortcut which splices the transformed column names into a string.

```jldoctest dataframemacros
julia> using DataFrames, DataFrameMacros, Chain, Statistics

julia> df = DataFrame(name=["John", "Sally", "Roger"],
                      age=[54.0, 34.0, 79.0],
                      children=[0, 2, 4])
3×3 DataFrame
 Row │ name    age      children
     │ String  Float64  Int64
─────┼───────────────────────────
   1 │ John       54.0         0
   2 │ Sally      34.0         2
   3 │ Roger      79.0         4

julia> @chain df begin
           @transform :age_months = :age * 12
           @groupby :has_child = :children > 0
           @combine "mean_{}" = mean({r"age"})
       end
2×3 DataFrame
 Row │ has_child  mean_age  mean_age_months
     │ Bool       Float64   Float64
─────┼──────────────────────────────────────
   1 │     false      54.0            648.0
   2 │      true      56.5            678.0
```
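
For comparison, the pipeline above can be approximated with plain DataFrames.jl functions. This is only a rough sketch of the steps as described in this section (row-wise `@transform`, `@groupby` as a transform followed by `groupby`, column-wise `@combine`); it is not a statement of how DataFrameMacros.jl is implemented, and the intermediate names `with_months`, `with_flag`, `gdf` and `result` are made up for illustration.

```julia
using DataFrames, Statistics

df = DataFrame(name=["John", "Sally", "Roger"],
               age=[54.0, 34.0, 79.0],
               children=[0, 2, 4])

# Row-wise transformation: ByRow applies the function to each element of :age
with_months = transform(df, :age => ByRow(a -> a * 12) => :age_months)

# Create the grouping column first, then group by it
# (the two steps the text describes @groupby as combining)
with_flag = transform(with_months, :children => ByRow(>(0)) => :has_child)
gdf = groupby(with_flag, :has_child)

# Column-wise aggregation: mean receives whole columns within each group;
# the output names are spelled out by hand here
result = combine(gdf, [:age, :age_months] .=> mean .=> [:mean_age, :mean_age_months])
```

The grouping expression `:children > 0` has to be materialized as a column before `groupby` can use it, which is the duplication that `@groupby` is meant to remove.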

The `{{ }}` syntax lets you reference a group of columns as a single unit,
for example to run aggregations across them.
In the following example, the first quarter is compared to the maximum of the other three:

```jldoctest dataframemacros
julia> df = DataFrame(
           q1 = [12.0, 0.4, 42.7],
           q2 = [6.4, 2.3, 40.9],
           q3 = [9.5, 0.2, 13.6],
           q4 = [6.3, 5.4, 39.3])
3×4 DataFrame
 Row │ q1       q2       q3       q4
     │ Float64  Float64  Float64  Float64
─────┼────────────────────────────────────
   1 │    12.0      6.4      9.5      6.3
   2 │     0.4      2.3      0.2      5.4
   3 │    42.7     40.9     13.6     39.3

julia> @transform df :q1_best = :q1 > maximum({{Not(:q1)}})
3×5 DataFrame
 Row │ q1       q2       q3       q4       q1_best
     │ Float64  Float64  Float64  Float64  Bool
─────┼─────────────────────────────────────────────
   1 │    12.0      6.4      9.5      6.3     true
   2 │     0.4      2.3      0.2      5.4    false
   3 │    42.7     40.9     13.6     39.3     true
```
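
A comparable result can be obtained with plain DataFrames.jl by using `AsTable`, which passes the selected columns of each row to the function as a `NamedTuple`. The sketch below is an approximation for comparison only and does not claim to show what `{{ }}` expands to; the helper column `rest_max` and the variable names are hypothetical.

```julia
using DataFrames

df = DataFrame(q1 = [12.0, 0.4, 42.7],
               q2 = [6.4, 2.3, 40.9],
               q3 = [9.5, 0.2, 13.6],
               q4 = [6.3, 5.4, 39.3])

# Row-wise maximum over every column except :q1
with_max = transform(df, AsTable(Not(:q1)) => ByRow(maximum) => :rest_max)

# Compare :q1 against that maximum row by row and drop the helper column
result = select(with_max, Not(:rest_max),
                [:q1, :rest_max] => ByRow(>) => :q1_best)
```

This version needs a helper column and two passes over the data frame, whereas the macro form expresses the same comparison in a single expression.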
3 changes: 3 additions & 0 deletions docs/src/man/working_with_dataframes.md
@@ -738,6 +738,9 @@ operations:
- the [DataFramesMeta.jl](https://github.com/JuliaStats/DataFramesMeta.jl)
package provides interfaces similar to LINQ and
[dplyr](https://dplyr.tidyverse.org)
- the [DataFrameMacros.jl](https://github.com/jkrumbiegel/DataFrameMacros.jl)
package provides macros for most standard functions from DataFrames.jl,
with convenient syntax for the manipulation of multiple columns at once.

See the [Data manipulation frameworks](@ref) section for more information.
