Extend unnecessary_lambda_linter to look for "inner comparisons" #2300

MichaelChirico · 2023-11-17T19:40:13Z

Part of #884

No hits on {lintr}.

Here's another one that's a bit in limbo since we ultimately decided against using it internally, the argument being that sometimes having the comparison as close as possible to its usage is preferable for readability.

If we think this is still on balance useful enough to include, there's a few clean-up steps that need to be added, e.g. customizing the lint message for sapply() vs. vapply().

codecov-commenter · 2023-11-17T21:10:20Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (ef82966) 99.39% compared to head (624c3f2) 99.40%.

❗ Current head 624c3f2 differs from pull request most recent head 9620a8c. Consider uploading reports for the commit 9620a8c to get more accurate results

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2300   +/-   ##
=======================================
  Coverage   99.39%   99.40%           
=======================================
  Files         122      122           
  Lines        5495     5523   +28     
=======================================
+ Hits         5462     5490   +28     
  Misses         33       33

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

AshesITR · 2023-11-18T19:22:04Z

Is it really more efficient to not produce a logical vector directly via vapply?

MichaelChirico · 2023-11-18T20:01:02Z

Is it really more efficient to not produce a logical vector directly via vapply?

well, there's the cases where it's really obviously worse:

v = sample(letters, 1e7, TRUE)

microbenchmark(times = 10L, vapply(v, function(vi) vi == 'a', logical(1L)), vapply(v, `==`, 'a', FUN.VALUE=logical(1L)), v == 'a')
# Unit: milliseconds
#                                            expr        min         lq
#  vapply(v, function(vi) vi == "a", logical(1L)) 6218.61526 6272.04090
#   vapply(v, `==`, "a", FUN.VALUE = logical(1L)) 3660.20617 3745.33906
#                                        v == "a"   46.01302   46.97116
#        mean     median         uq        max neval cld
#  6486.86968 6457.15351 6526.12283 7031.29944    10 a  
#  3950.88407 3879.94950 4201.16618 4397.80781    10  b 
#    48.88616   47.88342   49.99104   56.04004    10   c

But I guess you have in mind more general cases:

DF = data.frame(replicate(times = 20L, rnorm(1000L), simplify = FALSE))

microbenchmark(times = 10000L, vapply(DF, sum, numeric(1L)) > 0, vapply(DF, function(x) sum(x) > 0, logical(1L)))
# Unit: microseconds
#                                             expr    min     lq     mean median      uq      max neval cld
#                 vapply(DF, sum, numeric(1L)) > 0 29.647 31.276 32.87099 31.749 32.2575 2662.433 10000  a 
#  vapply(DF, function(x) sum(x) > 0, logical(1L)) 36.162 38.470 40.01192 39.138 39.9675 2746.665 10000   b

In cases where the lambda can't be eliminated, though, it's basically a toss-up:

microbenchmark(times = 10000L, vapply(DF, function(x) sum(abs(x)), numeric(1L)) > 10, vapply(DF, function(x) sum(abs(x)) > 10, logical(1L)))
# Unit: microseconds
#                                                   expr    min     lq     mean   median       uq       max neval cld
#  vapply(DF, function(x) sum(abs(x)), numeric(1L)) > 10 56.019 60.498 98.04062 101.6445 103.8465 10194.537 10000   a
#  vapply(DF, function(x) sum(abs(x)) > 10, logical(1L)) 55.706 60.430 96.81288 101.4660 103.7045  8070.142 10000   a

AshesITR · 2023-11-19T13:20:58Z

Funny enough colSums is actually even worse in the example:

microbenchmark(times = 10000L, vapply(DF, sum, numeric(1L)) > 0, vapply(DF, function(x) sum(x) > 0, logical(1L)), colSums(DF) > 0)
# Unit: microseconds
#                                             expr    min       lq      mean   median       uq      max neval
#                 vapply(DF, sum, numeric(1L)) > 0 20.243  21.6950  23.40762  22.4530  23.4465   68.499 10000
#  vapply(DF, function(x) sum(x) > 0, logical(1L)) 24.900  26.6695  28.85965  27.5250  28.7680 1169.621 10000
#                                  colSums(DF) > 0 96.006 101.3280 123.05545 105.1525 109.8010 5607.288 10000

AshesITR · 2023-11-19T13:25:12Z

From the experiments - should we make the linter more conservative in the sense that it only lints lambdas of the form function(x) foo(x) > y with arbitrary comparison and a one or zero calls foo()?

I am getting a small but consistently better performance for vapply(DF, function(x) sum(abs(x)) > 10 too

microbenchmark(times = 10000L, vapply(DF, function(x) sum(abs(x)), numeric(1L)) > 10, vapply(DF, function(x) sum(abs(x)) > 10, logical(1L)))
# Unit: microseconds
#                                                  expr    min     lq     mean  median     uq      max neval
#  vapply(DF, function(x) sum(abs(x)), numeric(1L)) > 10 36.435 38.459 48.00805 39.7680 40.877 3371.193 10000
#  vapply(DF, function(x) sum(abs(x)) > 10, logical(1L)) 36.440 38.406 47.02529 39.6865 40.861 3849.850 10000

MichaelChirico · 2023-11-19T17:34:25Z

should we make the linter more conservative in the sense that it only lints lambdas of the form function(x) foo(x) > y with arbitrary comparison and a one or zero calls foo()?

Just for vapply() though, right? Or we can just drop vapply() anyway (since the FUN.VALUE= argument gets in the way of making the "inner" vs. "outer" versions look close enough for readability.

Anyway, looks like my comment was lost or I never hit enter -- I am wondering if we should just merge this logic into unnecessary_lambda_linter() so that foo(x) {op} {const} and x {op} {const} get lints.

AshesITR · 2023-11-19T17:47:48Z

Just for vapply() though, right? Or we can just drop vapply() anyway (since the FUN.VALUE= argument gets in the way of making the "inner" vs. "outer" versions look close enough for readability.

Why? It doesn't seem too bad for readability?

Anyway, looks like my comment was lost or I never hit enter -- I am wondering if we should just merge this logic into unnecessary_lambda_linter() so that foo(x) {op} {const} and x {op} {const} get lints.

SGTM

MichaelChirico · 2023-11-19T19:24:08Z

SGTM

Interpreting as "merge with unnecessary_lambda_linter()."

Marking as draft until I can look at the recent unnecessary_lambda_linter() issue, if there's something to fix there I'd prefer to do that first before attempting the merge.

MichaelChirico · 2023-11-20T06:37:49Z

OK, now ~~based against #2320 and~~ incorporated into unnecessary_lambda_linter(). It's still in an immature state, but pretty close to ready for initial merge.

R/unnecessary_lambda_linter.R

…parison

NEWS.md

New inner_comparison_linter

b48eba6

MichaelChirico mentioned this pull request Nov 17, 2023

Upstreaming linters from Google's internal linting suite #884

Closed

76 tasks

roxy for examples

09ace01

MichaelChirico added the google-linters label Nov 18, 2023

Merge branch 'main' into inner_comparison

78d0ab6

MichaelChirico marked this pull request as draft November 19, 2023 19:24

MichaelChirico added 4 commits November 19, 2023 21:41

ensure no previous expression as well

8519951

Merge branch 'main' into inner_comparison

f6018eb

Merge branch 'ul-rhs-in' into inner_comparison

e979da4

initial merge into unnecessary_lambda_linter (not working yet)

869959e

MichaelChirico changed the base branch from main to ul-rhs-in November 20, 2023 06:00

MichaelChirico added 5 commits November 19, 2023 22:12

Working now

6ea4242

customized lint

3c4ffe1

shared infix metadata

ce46fcf

fix vestigial name, stage new example

35ba030

Restrict to non-nested calls

12a52a0

MichaelChirico marked this pull request as ready for review November 20, 2023 06:36

Base automatically changed from ul-rhs-in to main November 20, 2023 06:46

Merge branch 'main' into inner_comparison

690720e

AshesITR reviewed Nov 20, 2023

View reviewed changes

R/unnecessary_lambda_linter.R Show resolved Hide resolved

Merge remote-tracking branch 'origin/main' into inner_comparison

8f7998d

Merge remote-tracking branch 'origin/inner_comparison' into inner_com…

68ad492

…parison

AshesITR reviewed Nov 20, 2023

View reviewed changes

NEWS.md Outdated Show resolved Hide resolved

MichaelChirico mentioned this pull request Nov 20, 2023

Improve lint message for suggesting "outer comparison" in unnecessary_lambda_linter() #2323

Open

MichaelChirico changed the title ~~New inner_comparison_linter~~ Extend unnecessary_lambda_linter to look for "inner comparisons" Nov 20, 2023

MichaelChirico added 2 commits November 20, 2023 21:32

Merge branch 'main' into inner_comparison

480a5e2

remove dup NEWS entry

86005d5

AshesITR previously approved these changes Nov 21, 2023

View reviewed changes

Merge branch 'main' into inner_comparison

9620a8c

MichaelChirico dismissed AshesITR’s stale review via 9620a8c November 21, 2023 06:49

AshesITR approved these changes Nov 21, 2023

View reviewed changes

MichaelChirico merged commit 209e250 into main Nov 21, 2023
20 checks passed

MichaelChirico deleted the inner_comparison branch November 21, 2023 07:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extend unnecessary_lambda_linter to look for "inner comparisons" #2300

Extend unnecessary_lambda_linter to look for "inner comparisons" #2300

MichaelChirico commented Nov 17, 2023

codecov-commenter commented Nov 17, 2023 •

edited

Loading

AshesITR commented Nov 18, 2023

MichaelChirico commented Nov 18, 2023

AshesITR commented Nov 19, 2023

AshesITR commented Nov 19, 2023 •

edited

Loading

MichaelChirico commented Nov 19, 2023

AshesITR commented Nov 19, 2023 •

edited

Loading

MichaelChirico commented Nov 19, 2023 •

edited

Loading

MichaelChirico commented Nov 20, 2023 •

edited

Loading

Extend unnecessary_lambda_linter to look for "inner comparisons" #2300

Extend unnecessary_lambda_linter to look for "inner comparisons" #2300

Conversation

MichaelChirico commented Nov 17, 2023

codecov-commenter commented Nov 17, 2023 • edited Loading

Codecov Report

AshesITR commented Nov 18, 2023

MichaelChirico commented Nov 18, 2023

AshesITR commented Nov 19, 2023

AshesITR commented Nov 19, 2023 • edited Loading

MichaelChirico commented Nov 19, 2023

AshesITR commented Nov 19, 2023 • edited Loading

MichaelChirico commented Nov 19, 2023 • edited Loading

MichaelChirico commented Nov 20, 2023 • edited Loading

codecov-commenter commented Nov 17, 2023 •

edited

Loading

AshesITR commented Nov 19, 2023 •

edited

Loading

AshesITR commented Nov 19, 2023 •

edited

Loading

MichaelChirico commented Nov 19, 2023 •

edited

Loading

MichaelChirico commented Nov 20, 2023 •

edited

Loading