Mention DataFrameMacros.jl in the docs #3195

Merged 7 commits on Oct 20, 2022
2 changes: 2 additions & 0 deletions docs/Project.toml
@@ -1,6 +1,8 @@
[deps]
CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
Chain = "8be319e6-bccf-4806-a6f7-6fae938471bc"
DataFrameMacros = "75880514-38bc-4a95-a458-c2aea5a3a702"
DataFramesMeta = "1313f7d8-7da2-5740-9ea0-a2ca25f37964"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
Missings = "e1d29d7a-bbdc-5cf2-9ac0-f12de2c33e28"
5 changes: 4 additions & 1 deletion docs/src/index.md
@@ -115,12 +115,15 @@ integrated they are with DataFrames.jl.
A range of convenience functions for DataFrames.jl that augment `select` and
`transform` to provide a user experience similar to that provided by
[dplyr](https://dplyr.tidyverse.org/) in R.
- [DataFrameMacros.jl](https://github.com/jkrumbiegel/DataFrameMacros.jl):
Provides macro versions of the common DataFrames.jl functions similar to DataFramesMeta.jl,
with convenient syntax for the manipulation of multiple columns at once.
- [Query.jl](https://github.com/queryverse/Query.jl): Query.jl provides a single
framework for data wrangling that works with a range of libraries, including
DataFrames.jl, other tabular data libraries (more on those below), and even
non-tabular data. Provides many convenience functions analogous to those in
dplyr in R or [LINQ](https://en.wikipedia.org/wiki/Language_Integrated_Query).
-  - You can find more on both of these packages in the
+  - You can find more information on these packages in the
[Data manipulation frameworks](@ref) section of this manual.
- **And More!**
- [Graphs.jl](https://github.com/JuliaGraphs/Graphs.jl): A pure-Julia,
82 changes: 80 additions & 2 deletions docs/src/man/querying_frameworks.md
@@ -1,7 +1,7 @@
# Data manipulation frameworks

-Two popular frameworks provide convenience methods to manipulate `DataFrame`s:
-DataFramesMeta.jl and Query.jl. They implement a functionality similar to
+Three frameworks provide convenience methods to manipulate `DataFrame`s:
+DataFramesMeta.jl, DataFrameMacros.jl and Query.jl. They implement a functionality similar to
[dplyr](https://dplyr.tidyverse.org/) or
[LINQ](https://en.wikipedia.org/wiki/Language_Integrated_Query).

@@ -117,6 +117,84 @@ julia> @chain df begin
You can find more details about how this package can be used on the
[DataFramesMeta.jl GitHub page](https://github.com/JuliaData/DataFramesMeta.jl).

## DataFrameMacros.jl

[DataFrameMacros.jl](https://github.com/jkrumbiegel/DataFrameMacros.jl) is
an alternative to DataFramesMeta.jl with an additional focus on convenient
solutions for the transformation of multiple columns at once.
The instructions below are for version 0.3 of DataFrameMacros.jl.

First, install the DataFrameMacros.jl package:

```julia
using Pkg
Pkg.add("DataFrameMacros")
```

In DataFrameMacros.jl, all macros except `@combine` operate row-wise by default.
There is also a `@groupby` macro, which creates grouping columns on the fly
using the same syntax as `@transform`, so you can group by new columns
without writing them out twice.

In the example below, you can also see some of DataFrameMacros.jl's multi-column
features, where `mean` is applied to both age columns at once by selecting
them with the `r"age"` regex. The new column names are then derived using the
`"{}"` shortcut which splices the transformed column names into a string.

```jldoctest dataframemacros
julia> using DataFrames, DataFrameMacros, Chain, Statistics

julia> df = DataFrame(name=["John", "Sally", "Roger"],
                      age=[54.0, 34.0, 79.0],
                      children=[0, 2, 4])
3×3 DataFrame
 Row │ name    age      children
     │ String  Float64  Int64
─────┼───────────────────────────
   1 │ John       54.0         0
   2 │ Sally      34.0         2
   3 │ Roger      79.0         4

julia> @chain df begin
           @transform :age_months = :age * 12
           @groupby :has_child = :children > 0
           @combine "mean_{}" = mean({r"age"})
       end
2×3 DataFrame
 Row │ has_child  mean_age  mean_age_months
     │ Bool       Float64   Float64
─────┼──────────────────────────────────────
   1 │     false      54.0            648.0
   2 │      true      56.5            678.0
```
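
For comparison, the same pipeline can be approximated with plain DataFrames.jl
functions. This is only a sketch of what the macros save you from writing, not a
description of how DataFrameMacros.jl works internally; the variable names
`with_months`, `with_flag` and `age_cols` are chosen here for illustration.

```julia
using DataFrames, Statistics

df = DataFrame(name=["John", "Sally", "Roger"],
               age=[54.0, 34.0, 79.0],
               children=[0, 2, 4])

# Row-wise transformations have to be wrapped in ByRow explicitly.
with_months = transform(df, :age => ByRow(a -> a * 12) => :age_months)
with_flag = transform(with_months, :children => ByRow(c -> c > 0) => :has_child)

# Select every column whose name matches the regex, then broadcast
# `mean` and the derived output names over that selection.
age_cols = names(with_flag, r"age")
result = combine(groupby(with_flag, :has_child),
                 age_cols .=> mean .=> "mean_" .* age_cols)
```

Here the grouping column has to be materialized before `groupby`, and the output
names are built by hand, which is the bookkeeping that `@groupby` and the `"{}"`
shortcut take over.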

There's also the capability to reference a group of multiple columns as a single unit,
for example to run aggregations over them, with the `{{ }}` syntax.
In the following example, the first quarter is compared to the maximum of the other three:

```jldoctest dataframemacros
julia> df = DataFrame(q1 = [12.0, 0.4, 42.7],
                      q2 = [6.4, 2.3, 40.9],
                      q3 = [9.5, 0.2, 13.6],
                      q4 = [6.3, 5.4, 39.3])
3×4 DataFrame
 Row │ q1       q2       q3       q4
     │ Float64  Float64  Float64  Float64
─────┼────────────────────────────────────
   1 │    12.0      6.4      9.5      6.3
   2 │     0.4      2.3      0.2      5.4
   3 │    42.7     40.9     13.6     39.3

julia> @transform df :q1_best = :q1 > maximum({{Not(:q1)}})
3×5 DataFrame
 Row │ q1       q2       q3       q4       q1_best
     │ Float64  Float64  Float64  Float64  Bool
─────┼─────────────────────────────────────────────
   1 │    12.0      6.4      9.5      6.3     true
   2 │     0.4      2.3      0.2      5.4    false
   3 │    42.7     40.9     13.6     39.3     true
```
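
A rough equivalent of this comparison in plain DataFrames.jl, shown only as a
sketch for orientation, uses `AsTable` with `ByRow`. The helper column
`others_max` is a hypothetical name introduced here and is dropped again at the end.

```julia
using DataFrames

df = DataFrame(q1 = [12.0, 0.4, 42.7],
               q2 = [6.4, 2.3, 40.9],
               q3 = [9.5, 0.2, 13.6],
               q4 = [6.3, 5.4, 39.3])

# AsTable passes each row of the selected columns as a NamedTuple,
# so `maximum` runs across q2, q3 and q4 within a single row.
with_max = transform(df, AsTable(Not(:q1)) => ByRow(maximum) => :others_max)

# Compare q1 with the helper column row by row, then drop the helper.
result = select(with_max, Not(:others_max),
                [:q1, :others_max] => ByRow(>) => :q1_best)
```

The `{{ }}` syntax folds both steps into one expression, so no intermediate
column is needed.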

## Query.jl

The [Query.jl](https://github.com/queryverse/Query.jl) package provides advanced
3 changes: 3 additions & 0 deletions docs/src/man/working_with_dataframes.md
@@ -738,6 +738,9 @@ operations:
- the [DataFramesMeta.jl](https://github.com/JuliaStats/DataFramesMeta.jl)
package provides interfaces similar to LINQ and
[dplyr](https://dplyr.tidyverse.org)
- the [DataFrameMacros.jl](https://github.com/jkrumbiegel/DataFrameMacros.jl)
package provides macros for most standard functions from DataFrames.jl,
with convenient syntax for the manipulation of multiple columns at once.

See the [Data manipulation frameworks](@ref) section for more information.
