`cur_group()` and size zero grouped data frame edge case bug #6304

DavisVaughan · 2022-06-19T22:15:47Z

This has to do with the number of rows returned by group_data(), and therefore by group_keys(), when a grouped data frame has zero groups.

library(dplyr)

df <- tibble(x = integer())
gdf <- group_by(df, x)

mutate(df, y = cur_group())
#> # A tibble: 0 × 2
#> # … with 2 variables: x <int>, y <tibble[,0]>

mutate(gdf, y = cur_group())
#> Error in `mutate()`:
#> ! Problem while computing `y = cur_group()`.
#> Caused by error in `vec_slice()`:
#> ! Can't subset elements past the end.
#> ℹ Location 1 doesn't exist.
#> ℹ There are only 0 elements.

# Has 1 row
group_keys(df)
#> # A tibble: 1 × 0

# Has 0 rows
group_keys(gdf)
#> # A tibble: 0 × 1
#> # … with 1 variable: x <int>

We already apply a workaround when there are zero groups, but it only covers the group rows:

dplyr/R/data-mask.R

Lines 4 to 8 in 55dfc1c

    
    rows <- group_rows(data)
    # workaround for when there are 0 groups
    if (length(rows) == 0) {
      rows <- list(integer())
    }

It seems like we need to make a similar kind of patch to group_keys() as well

dplyr/R/data-mask.R

Line 26 in 55dfc1c

private$keys <- group_keys(data)

Maybe it should be set to vec_init(group_keys(data), n = 1) if there are no groups? That would allow cur_group() to return a size 1 result, which would then be recycled back to size 0.

That would give this result, where you can see the initialized 1 row keys if you really want to

library(dplyr)

df <- tibble(x = integer())
gdf <- group_by(df, x)

mutate(gdf, y = print(cur_group()))
#> # A tibble: 1 × 1
#>       x
#>   <int>
#> 1    NA

#> # A tibble: 0 × 2
#> # Groups:   x [0]
#> # … with 2 variables: x <int>, y <tibble[,1]>
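
The vec_init() idea can be tried outside of dplyr with vctrs directly. This is only a minimal sketch: the `keys` tibble is a stand-in for what group_keys(gdf) returns, and none of this is dplyr's internal code.

```r
library(vctrs)
library(tibble)

# Stand-in (assumption) for group_keys(gdf): zero rows, one key column
keys <- tibble(x = integer())

# vec_init() builds a size-1 value of the same type, filled with missing values
init <- vec_init(keys, n = 1L)

# A size-1 value can be recycled to any size, including 0
recycled <- vec_recycle(init, size = 0L)

vec_size(init)
vec_size(recycled)
```

The initialized row only exists so that slicing at location 1 succeeds; it is dropped again once the result is recycled to the zero-row output.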


DavisVaughan · 2022-06-19T22:25:23Z

Or maybe current_key(), used by cur_group(), needs to be aware of the case where the keys are size 0, and be implemented like this:

    current_key = function() {
      keys <- private$keys

      if (vec_size(keys) == 0L) {
        # zero groups: return the size 0 keys as-is rather than slicing
        keys
      } else {
        vec_slice(keys, self$get_current_group())
      }
    },

This would give the following result, which feels more intuitive in this case:

library(dplyr)

df <- tibble(x = integer())
gdf <- group_by(df, x)

mutate(gdf, y = print(cur_group()))
#> # A tibble: 0 × 1
#> # … with 1 variable: x <int>

#> # A tibble: 0 × 2
#> # Groups:   x [0]
#> # … with 2 variables: x <int>, y <tibble[,1]>
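
The same guard can be sketched as a standalone function to check both branches without the data mask. `current_key_sketch()` and its arguments are hypothetical names for illustration. The empty-keys call passes a current group index of 1 on the assumption that this is what the mask uses when there are zero groups, as the "Location 1 doesn't exist" error above suggests.

```r
library(vctrs)
library(tibble)

# Hypothetical stand-in for the data mask's current_key() method
current_key_sketch <- function(keys, current_group) {
  if (vec_size(keys) == 0L) {
    # No groups: return the empty keys instead of slicing past the end
    keys
  } else {
    vec_slice(keys, current_group)
  }
}

empty_keys <- tibble(x = integer())
some_keys <- tibble(x = c(10L, 20L))

res_empty <- current_key_sketch(empty_keys, 1L)
res_some <- current_key_sketch(some_keys, 2L)
```

Returning the size 0 keys means cur_group() produces a size 0 result directly, so nothing has to be recycled down from size 1.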

DavisVaughan added a commit to DavisVaughan/dplyr that referenced this issue Jun 28, 2022

Add test for tidyverse#6304

13d3993

DavisVaughan mentioned this issue Jun 28, 2022

Implement mutate(.when =) #6313

Closed

1 task

DavisVaughan added a commit to DavisVaughan/dplyr that referenced this issue Jun 28, 2022

Add test for tidyverse#6304

70b2225

hadley added the "bug" (an unexpected problem or unintended behavior) and "grouping 👨‍👩‍👧‍👦" labels Jul 21, 2022

DavisVaughan mentioned this issue Aug 19, 2022

Patch current_key() to work with zero row grouped data frames #6423

Merged

DavisVaughan closed this as completed in #6423 Aug 19, 2022
