Don't resolve columns with tidyselect when tbl cannot be materialized
#529
Summary
`{tidyselect}` requires a data frame to resolve column selections against. This PR therefore ensures that when `create_agent(tbl = )` receives a table that cannot (yet) materialize, the validation steps don't (and can't) use tidyselect.

This preserves `v0.11` behavior, where use of tidyselect triggers materialization of the `tbl` and thus always errors if the `tbl` cannot (yet) materialize. In the PR, this case still errors, but with a slightly more informative message.
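To make the distinction concrete, here is a minimal sketch of the kind of check involved. It is not pointblank's actual code: `resolve_columns_sketch()` is a hypothetical helper, and it only assumes the exported `rlang` and `tidyselect` functions it calls. Literal character vectors are returned as-is, and anything else is handed to tidyselect, which needs a data frame.

```r
library(rlang)
library(tidyselect)

# Hypothetical helper (not part of pointblank): resolve `columns` to names,
# touching tidyselect only when a data frame is available.
resolve_columns_sketch <- function(columns_expr, tbl = NULL) {
  # A bare string such as "a" is already a column name
  if (is_string(columns_expr)) {
    return(columns_expr)
  }
  # c("a", "b"): every argument is a literal string, so no data is needed
  if (is_call(columns_expr, "c")) {
    args <- call_args(columns_expr)
    if (length(args) > 0 && all(vapply(args, is_string, logical(1)))) {
      return(unname(unlist(args)))
    }
  }
  # Anything else is treated as a tidyselect expression, which needs data
  if (is.null(tbl)) {
    abort("Cannot resolve a tidyselect expression without a materialized `tbl`.")
  }
  names(eval_select(columns_expr, data = tbl))
}

df <- data.frame(a = 1, b = 2, c = 3)
resolve_columns_sketch(quote(c("a", "b")))           # no table required
resolve_columns_sketch(quote(starts_with("a")), df)  # resolved against `df`
```

Calling the helper with a tidyselect expression and no `tbl` raises the error, which mirrors the tradeoff described here.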
Relatedly, the PR also allows bypassing tidyselect when it is not strictly necessary (e.g., when `columns` is a character vector). This resolves #528 by not forwarding the `c(...)` character vector to tidyselect at all.

In sum, `{tidyselect}` support in `columns` is a lot more limited when the `tbl` cannot be materialized, but this is inevitable. This behavior already existed in `v0.11`, and it's a genuine tradeoff users will have to make if they want an extreme version of laziness for their `tbl`.

Remaining consideration with `rows_distinct()`
Unfortunately, there is one regression with `rows_distinct()`: in `v0.12` we have the `columns = everything()` default, whereas in `v0.11` we had the `columns = NULL` default. When the `tbl` cannot be materialized, `rows_distinct()` errors on `v0.12` but passes through in `v0.11`. More concretely, a `rows_distinct()` step with the default `columns` on such a `tbl` used to work in `v0.11` but no longer does in `v0.12`.

This behavior actually doesn't have anything to do with `{tidyselect}`, though: `rows_distinct()` is exceptional in that `columns = NULL` becomes `distinct({{ NULL }})` in `interrogate()`, and `distinct(NULL)` happens to have the special behavior of checking for row-level distinctness across all columns. In other words, `NULL` is never resolved to a column selection, which is also why `rows_distinct()` used to not show anything in the `columns` column of the agent report.

I have yet to add this behavior of `rows_distinct()` back in. Adding back support for this should be simple (we'd just intercept `everything()` and pass down `NULL` instead); it'd just require being explicit about this exceptionalism of `rows_distinct()` in the code.

Also, just to reiterate, this extreme laziness will always work if `columns` is supplied as a character vector or `vars()`; it's just the dynamic tidyselect expression that requires `tbl` to be materializable.

Let me know what you think!
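
For reference, the `everything()` interception mentioned for `rows_distinct()` could look roughly like the sketch below. `intercept_everything_sketch()` is a hypothetical name rather than anything in pointblank, and the sketch assumes the `columns` expression is available as a captured (defused) expression. It only recognizes a bare, argument-free `everything()` call.

```r
library(rlang)

# Hypothetical helper (not part of pointblank): map the `everything()`
# default back to NULL so that no column resolution is attempted.
intercept_everything_sketch <- function(columns_expr) {
  # Match a call to `everything` with zero arguments
  if (is_call(columns_expr, "everything", n = 0)) {
    return(NULL)
  }
  # Any other selection is passed through unchanged
  columns_expr
}

intercept_everything_sketch(quote(everything()))  # default: nothing to resolve
intercept_everything_sketch(quote(c(a, b)))       # explicit selection kept
```

Keeping this in a dedicated helper would make the special-casing of `rows_distinct()` explicit instead of relying on how `NULL` happens to flow through to `distinct()`.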