[ENH] Support dplyr 1.1.0~1.1.2 #189

pwwang · 2023-09-04T21:04:04Z

Feature Type

Adding new functionality to datar
Changing existing functionality in datar
Removing existing functionality in datar

Problem Description

Feature Description

*_join()
- A join specification can now be created through join_by(). This allows
  you to specify both the left and right hand side of a join using unquoted
  column names, such as join_by(sale_date == commercial_date). Join
  specifications can be supplied to any *_join() function as the by
  argument.
  
  Join specifications allow for new types of joins:
  - Equality joins: The most common join, specified by ==. For example,
    join_by(sale_date == commercial_date).
  - Inequality joins: For joining on inequalities, i.e. >=, >, <, and
    <=. For example, use join_by(sale_date >= commercial_date) to find
    every commercial that aired before a particular sale.
  - Rolling joins: For "rolling" the closest match forward or backwards when
    there isn't an exact match, specified by using the rolling helper,
    closest(). For example,
    join_by(closest(sale_date >= commercial_date)) to find only the most
    recent commercial that aired before a particular sale.
  - Overlap joins: For detecting overlaps between sets of columns, specified
    by using one of the overlap helpers: between(), within(), or
    overlaps(). For example, use
    join_by(between(commercial_date, sale_date_lower, sale_date)) to
    find commercials that aired before a particular sale, as long as they
    occurred after some lower bound, such as 40 days before the sale was made.
- multiple is a new argument for controlling what happens when a row
  in x matches multiple rows in y. For equality joins and rolling joins,
  where this is usually surprising, this defaults to signalling a "warning",
  but still returns all of the matches. For inequality joins, where multiple
  matches are usually expected, this defaults to returning "all" of the
  matches. You can also return only the "first" or "last" match, "any"
  of the matches, or you can "error".
- keep now defaults to NULL rather than FALSE. NULL implies
  keep = FALSE for equality conditions, but keep = TRUE for inequality
  conditions, since you generally want to preserve both sides of an
  inequality join.
- unmatched is a new argument for controlling what happens when a row
  would be dropped because it doesn't have a match. For backwards
  compatibility, the default is "drop", but you can also choose to
  "error" if dropped rows would be surprising.
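
To make the difference between an inequality join and a rolling join concrete, here is a minimal plain-Python sketch. It is not how dplyr or datar implement joins; the helper names `inequality_join` and `rolling_join` and the sample rows are made up for illustration, and both helpers behave like an inner join (sales without a match are dropped).

```python
from datetime import date

# Hypothetical sample data; column names follow the example in the text.
sales = [
    {"sale_id": 1, "sale_date": date(2023, 3, 10)},
    {"sale_id": 2, "sale_date": date(2023, 1, 1)},
]
commercials = [
    {"commercial_id": "a", "commercial_date": date(2023, 2, 1)},
    {"commercial_id": "b", "commercial_date": date(2023, 3, 1)},
    {"commercial_id": "c", "commercial_date": date(2023, 4, 1)},
]


def inequality_join(x, y):
    """Keep every (sale, commercial) pair with sale_date >= commercial_date."""
    return [
        {**sale, **com}
        for sale in x
        for com in y
        if sale["sale_date"] >= com["commercial_date"]
    ]


def rolling_join(x, y):
    """Same condition, but keep only the closest (latest) commercial per sale."""
    out = []
    for sale in x:
        candidates = [c for c in y if sale["sale_date"] >= c["commercial_date"]]
        if candidates:
            closest = max(candidates, key=lambda c: c["commercial_date"])
            out.append({**sale, **closest})
    return out


all_matches = inequality_join(sales, commercials)
closest_matches = rolling_join(sales, commercials)
```

Both date columns survive in the output rows, which mirrors the keep = TRUE behaviour described above for inequality conditions.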
consecutive_id() for creating groups based on contiguous runs of the
same values
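
A rough sketch of the idea in plain Python, assuming a single input sequence and 1-based ids. The function below is an illustrative stand-in, not datar's implementation.

```python
from itertools import groupby


def consecutive_id(values):
    """Return a 1-based run id that increments whenever the value changes."""
    ids = []
    # groupby() yields one group per contiguous run of equal values.
    for run_number, (_, run) in enumerate(groupby(values), start=1):
        ids.extend(run_number for _ in run)
    return ids


ids = consecutive_id(["a", "a", "b", "a", "a", "c"])
```

Note that the second run of "a" gets a new id (3) rather than reusing 1, which is what distinguishes this from ordinary grouping by value.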
case_match() is a "vectorised switch" variant of case_when() that matches
on values rather than logical expressions. It is like a SQL "simple"
CASE WHEN statement, whereas case_when() is like a SQL "searched"
CASE WHEN statement
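
The "match on values" behaviour could be sketched in plain Python like this. The `(candidates, result)` tuple signature and the `default` keyword are assumptions made for the example; they are not dplyr's formula interface and not necessarily what datar exposes.

```python
def case_match(values, *cases, default=None):
    """Map each value using the first (candidates, result) pair that contains it."""
    out = []
    for value in values:
        for candidates, result in cases:
            if value in candidates:
                out.append(result)
                break
        else:
            # No case matched this value.
            out.append(default)
    return out


labels = case_match(
    ["a", "b", "c", "z"],
    (("a", "b"), "low"),
    (("c",), "high"),
    default="other",
)
```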
cross_join() is a more explicit and slightly more correct replacement for
using by = character() during a join
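
Conceptually a cross join is the Cartesian product of the two tables. A minimal plain-Python sketch, assuming the two inputs share no column names (this example does not add suffixes for clashing names), with made-up sample rows:

```python
from itertools import product


def cross_join(x, y):
    """Pair every row of x with every row of y."""
    return [{**left, **right} for left, right in product(x, y)]


sizes = [{"size": "S"}, {"size": "M"}]
colors = [{"color": "red"}, {"color": "blue"}, {"color": "green"}]
combos = cross_join(sizes, colors)
```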
pick() makes it easy to access a subset of columns from the current group.
pick() is intended as a replacement for across(.fns = NULL), cur_data(),
and cur_data_all(). We feel that pick() is a much more evocative name when
you are just trying to select a subset of columns from your data.
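
As a loose illustration of "a subset of columns from the current group", here is a plain-Python sketch over a list of dict rows. The grouping loop and the `pick` helper are hypothetical stand-ins and say nothing about how datar wires this into its verbs.

```python
from collections import defaultdict

# Hypothetical data grouped by column "g".
rows = [
    {"g": "a", "x": 1, "y": 10, "z": "p"},
    {"g": "a", "x": 2, "y": 20, "z": "q"},
    {"g": "b", "x": 3, "y": 30, "z": "r"},
]


def pick(group_rows, *columns):
    """Return only the named columns from the rows of the current group."""
    return [{col: row[col] for col in columns} for row in group_rows]


groups = defaultdict(list)
for row in rows:
    groups[row["g"]].append(row)

picked = {key: pick(group_rows, "x", "y") for key, group_rows in groups.items()}
```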
symdiff() computes the symmetric difference.
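
In other words, the result holds the rows found in exactly one of the two inputs. A small plain-Python sketch with rows as tuples; dropping duplicates and keeping first-seen order are assumptions of this example rather than a statement about datar's behaviour.

```python
def symdiff(x, y):
    """Rows that appear in exactly one of x or y, deduplicated, in first-seen order."""
    x_set, y_set = set(x), set(y)
    seen = set()
    out = []
    for row in list(x) + list(y):
        in_one_only = (row in x_set) != (row in y_set)
        if in_one_only and row not in seen:
            seen.add(row)
            out.append(row)
    return out


result = symdiff([(1, "a"), (2, "b")], [(2, "b"), (3, "c")])
```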
cur_data() and cur_data_all() are soft-deprecated in favour of
pick()
across(), c_across(), if_any(), and if_all() now require the
_cols and _fns arguments. In general, we now recommend that you use
pick() instead of an empty across() call or across() with no _fns
(e.g. across(c(x, y))). See also tidyverse/dplyr#6523 (Quietly deprecate optional .cols and .fns cases).
Passing **kwargs to across() is deprecated because it's ambiguous when
those arguments are evaluated. See also tidyverse/dplyr#6073 (Deprecate across(, ...)).

Additional Context

No response


pwwang added the enhancement New feature or request label Sep 4, 2023

pwwang changed the title ~~[ENH] Support dplyr 1.1.0~~ [ENH] Support dplyr 1.1.0~1.1.2 Sep 4, 2023

pwwang added a commit that referenced this issue Oct 5, 2023

🍱 Support dplyr up to 1.1.3 (#189)

4cb4039

pwwang added a commit to pwwang/datar-pandas that referenced this issue Oct 8, 2023

✨ Add symdiff (pwwang/datar#189)

87698b3

pwwang added a commit to pwwang/datar-pandas that referenced this issue Oct 8, 2023

✨ Add consecutive_id (pwwang/datar#189)

5e53517

pwwang added a commit to pwwang/datar-pandas that referenced this issue Oct 8, 2023

✨ Add pick and case_match (pwwang/datar#189)

1423bd2

pwwang added a commit to pwwang/datar-pandas that referenced this issue Oct 8, 2023

✨ Add cross_join (pwwang/datar#189)

07e73a5

pwwang closed this as completed Dec 7, 2023

pwwang mentioned this issue Feb 28, 2024

[QST] is asof join supported? #204

Open
