Mutating joins relationship documentation issues #7623

Closed
bounlu opened this issue Jan 6, 2025 · 0 comments

bounlu commented Jan 6, 2025

The dplyr mutate-joins documentation says:

relationship

Handling of the expected relationship between the keys of x and y. If the expectations chosen from the list below are invalidated, an error is thrown.

NULL, the default, doesn't expect there to be any relationship between x and y. However, for equality joins it will check for a many-to-many relationship (which is typically unexpected) and will warn if one occurs, encouraging you to either take a closer look at your inputs or make this relationship explicit by specifying "many-to-many".

See the Many-to-many relationships section for more details.

"one-to-one" expects:
Each row in x matches at most 1 row in y.
Each row in y matches at most 1 row in x.

"one-to-many" expects:
Each row in y matches at most 1 row in x.

"many-to-one" expects:
Each row in x matches at most 1 row in y.

"many-to-many" doesn't perform any relationship checks, but is provided to allow you to be explicit about this relationship if you know it exists.

relationship doesn't handle cases where there are zero matches. For that, see unmatched.

I see two issues:

  1. The descriptions of one-to-many and many-to-one read as awkward and reversed to me. Logically, the relationship should be read from left to right, x -> y. So one-to-many should mean "Each row in x may match multiple rows in y". Similarly, many-to-one should mean "Multiple rows in x may match the same row in y".

  2. Specifying relationship explicitly as one-to-many or many-to-one does not generate any warning or error if the data contain no such matches, i.e. if only one-to-one matches exist. Since the documentation says an error is thrown when the expectations are invalidated, I would expect an error when the specified relationship does not exist in the matches. Otherwise, I don't see the point of specifying the relationship explicitly.
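
The behaviour in the second point can be sketched as follows. This is a minimal reproduction, assuming dplyr 1.1.0 or later (where the relationship argument is available); the tibbles x, y, and y_dup and their columns are made up for illustration and are not taken from real data.

```r
library(dplyr)

# Hypothetical data: every key appears exactly once in both tables,
# so the actual matches are strictly one-to-one.
x <- tibble(key = c(1, 2, 3), x_val = c("a", "b", "c"))
y <- tibble(key = c(1, 2, 3), y_val = c("d", "e", "f"))

# Declaring "one-to-many" or "many-to-one" on one-to-one data runs
# without any warning or error.
res_otm <- left_join(x, y, by = "key", relationship = "one-to-many")
res_mto <- left_join(x, y, by = "key", relationship = "many-to-one")

# Hypothetical y with a duplicated key: the x row with key 1 now
# matches two rows in y.
y_dup <- tibble(key = c(1, 1, 2), y_val = c("d", "e", "f"))

# "one-to-many" only constrains the rows of y, so this still succeeds.
res_dup <- left_join(x, y_dup, by = "key", relationship = "one-to-many")

# "many-to-one" constrains the rows of x, so this call errors.
# tryCatch() captures the error so the script keeps running.
err <- tryCatch(
  left_join(x, y_dup, by = "key", relationship = "many-to-one"),
  error = function(e) e
)
inherits(err, "error")
```

In this sketch an error only appears once a row matches more than one row on the constrained side, which is consistent with the "at most 1" wording quoted above; one-to-one data never triggers it.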

I have read this, but I believe the issues above remain unresolved.

bounlu closed this as completed Jan 6, 2025