Error while using tune_grid ($ operator is invalid for atomic vectors) #150

dilsherdhillon opened this issue Jan 10, 2020 · 8 comments · Fixed by #180


Hi Folks!
I'm running into trouble with the tuning parameters using the tune_grid function while using tune() to set hyperparameters in {parsnip}.


library(tidymodels)

### Create random data  

dta <- tibble::tibble(binary = rbinom(1000, 1, 0.7),
                      feature_1 = rgamma(1000, 2, 1),
                      feature_2 = rnorm(1000, 10, 20),
                      feature_3 = rpois(1000, 5))

### create a recipe  
recipe_df <- dta %>%
  recipe(binary ~ .) %>%
  step_center(all_predictors())

### set the engine  
engine <- boost_tree(
  mode = "classification",
  trees = tune(),
  min_n = tune(),
  tree_depth = tune(),
  learn_rate = tune()
  ) %>%
  set_mode("classification") %>%
  set_engine("xgboost")

### create a resamples tibble  
df_resamples <- vfold_cv(dta, v = 5)  

### grid of hyperparameters to tune  
grid <-  expand.grid(
  trees = c(50),
  min_n = c(3, 5, 7),
  tree_depth = 100,
  learn_rate = c(0.1, 0.5, 0.9)
)

### fit the model  
fits <- tune_grid(
  recipe_df,
  model = engine,
  resamples  = df_resamples,
  grid = grid,
  metrics  = metric_set(roc_auc)
)
#> x Fold1: internal: Error: $ operator is invalid for atomic vectors
#> x Fold2: internal: Error: $ operator is invalid for atomic vectors
#> x Fold3: internal: Error: $ operator is invalid for atomic vectors
#> x Fold4: internal: Error: $ operator is invalid for atomic vectors
#> x Fold5: internal: Error: $ operator is invalid for atomic vectors
#> Warning: All models failed in tune_grid(). See the `.notes` column.

fits$.notes
#> [[1]]
#> # A tibble: 1 x 1
#>   .notes                                                   
#>   <chr>                                                    
#> 1 internal: Error: $ operator is invalid for atomic vectors
#> [[2]]
#> # A tibble: 1 x 1
#>   .notes                                                   
#>   <chr>                                                    
#> 1 internal: Error: $ operator is invalid for atomic vectors
#> [[3]]
#> # A tibble: 1 x 1
#>   .notes                                                   
#>   <chr>                                                    
#> 1 internal: Error: $ operator is invalid for atomic vectors
#> [[4]]
#> # A tibble: 1 x 1
#>   .notes                                                   
#>   <chr>                                                    
#> 1 internal: Error: $ operator is invalid for atomic vectors
#> [[5]]
#> # A tibble: 1 x 1
#>   .notes                                                   
#>   <chr>                                                    
#> 1 internal: Error: $ operator is invalid for atomic vectors

Created on 2020-01-10 by the reprex package (v0.3.0)

dwhdai commented Feb 13, 2020

Also running into this issue

topepo commented Feb 16, 2020

The problem is that your outcome is numeric and you are trying to fit a classification model. We expect a factor outcome.
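A minimal sketch of that fix, assuming the 0/1 column is meant as a two-class outcome. It uses a base R data frame instead of a tibble so it runs without extra packages; the seed and the level order are illustrative choices, not part of the original report.

```r
set.seed(150)  # arbitrary seed, only so the sketch is reproducible

dta <- data.frame(
  binary    = rbinom(1000, 1, 0.7),
  feature_1 = rgamma(1000, 2, 1),
  feature_2 = rnorm(1000, 10, 20),
  feature_3 = rpois(1000, 5)
)

# rbinom() returns integers; convert the outcome to a factor before
# building resamples or fitting a classification model.
dta$binary <- factor(dta$binary, levels = c(0, 1))

is.factor(dta$binary)
levels(dta$binary)
```

With the outcome stored as a factor, the rest of the reprex can stay as it is. Since yardstick treats the first factor level as the event by default, the level order may matter when reading roc_auc.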

There is a better error message for this case but that should happen in parsnip:


dta <- tibble::tibble(binary = rbinom(1000, 1, 0.7),
                      feature_1 = rgamma(1000, 2, 1),
                      feature_2 = rnorm(1000, 10, 20),
                      feature_3 = rpois(1000, 5))

### set the engine  
engine <- 
  boost_tree() %>%
  set_mode("classification") %>%
  set_engine("xgboost")

engine %>% fit(binary ~ ., data = dta)
#> Error: For classification models, the outcome should be a factor.

Created on 2020-02-16 by the reprex package (v0.3.0)

We need to find out where it failed prior to this and write a better message.

I believe this has to do with how catch_and_log_fit() finds the error to report. It is not logging the thing we want it to.

dta <- tibble::tibble(binary = rbinom(1000, 1, 0.7),
                      feature_1 = rgamma(1000, 2, 1),
                      feature_2 = rnorm(1000, 10, 20),
                      feature_3 = rpois(1000, 5))

xbg_spec <- boost_tree(mode = "classification", learn_rate = tune()) %>%
  set_mode("classification") %>%
  set_engine("xgboost")

df_resamples <- vfold_cv(dta, v = 3)

wf <- workflow() %>%
  add_model(xbg_spec) %>%
  add_formula(binary ~ .)

## error we want
tune:::train_formula(df_resamples$splits[[1]], wf) %>%
  tune:::train_model(grid = tibble(learn_rate = 0.00000862), 
                     control = control_workflow())
#> Error: For classification models, the outcome should be a factor.

## NOT error we want
tune:::catch_and_log_fit(
  tune:::train_formula(df_resamples$splits[[1]], wf) %>%
    tune:::train_model(grid = tibble(learn_rate = 0.00000862), 
                       control = control_workflow()),
  control = control_grid()
)
#> Error: $ operator is invalid for atomic vectors

Created on 2020-02-25 by the reprex package (v0.3.0)

It's the line right after the call to catcher(). The error message we want is caught by catcher() and stored in result, but the next line then tries to get FIT INCEPTION that isn't there.

I can't find tests for catch_and_log_fit(). How likely is it that it's behaving in unexpected ways in more than just this situation? What do we think the best next step is?
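The failure mode described above can be reproduced in base R. This is only a sketch, not the tune internals: `fit_model()` is a hypothetical stand-in for the failing fit, and plain `try()` stands in for catcher(). The point is that a caught error is an atomic character vector, so reaching into it with `$` raises the unrelated message instead of the one we want.

```r
# Hypothetical stand-in for a model fit that fails with the useful message.
fit_model <- function() {
  stop("For classification models, the outcome should be a factor.")
}

# try() returns a character vector of class "try-error" on failure.
result <- try(fit_model(), silent = TRUE)

inherits(result, "try-error")
is.atomic(result)

# Accessing a field on the caught error hides the original message.
masked <- tryCatch(result$fit, error = function(e) conditionMessage(e))
masked

# Checking for failure first lets the original message be reported.
if (inherits(result, "try-error")) {
  wanted <- conditionMessage(attr(result, "condition"))
} else {
  wanted <- NA_character_
}
wanted
```

A test for catch_and_log_fit() could follow the same shape: force a known failure and assert that the logged note contains the original message.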

@topepo Thank you for catching this!

Experiencing this in a different context. I tried to run a model where I specify one argument of the model with tune() but a hard-coded value for the other argument. This raises the same error. See the reprex:

library(tidymodels)
library(mlbench)
data(Ionosphere)

iono_rec <-
  recipe(Class ~ ., data = Ionosphere)  %>%
  step_zv(all_predictors()) %>%
  step_mutate(V1 = factor(V1), Class = factor(Class)) %>% 
  step_dummy(V1)

resample <- bootstraps(Ionosphere, times = 5)

# Define a tune for one argument but specify the other
svm_mod <-
  svm_rbf(cost = tune(), rbf_sigma = -0.25) %>%
  set_mode("classification") %>%
  set_engine("kernlab")

# Errors
tune_grid(svm_mod, iono_rec, resample, grid = 3)
#> x Bootstrap1: model 1/3 (predictions): Error: $ operator is invalid for atomic vectors
#> x Bootstrap1: model 2/3 (predictions): Error: $ operator is invalid for atomic vectors
#> x Bootstrap1: model 3/3 (predictions): Error: $ operator is invalid for atomic vectors
#> x Bootstrap2: model 1/3 (predictions): Error: $ operator is invalid for atomic vectors
#> x Bootstrap2: model 2/3 (predictions): Error: $ operator is invalid for atomic vectors
#> x Bootstrap2: model 3/3 (predictions): Error: $ operator is invalid for atomic vectors
#> x Bootstrap3: model 1/3 (predictions): Error: $ operator is invalid for atomic vectors
#> x Bootstrap3: model 2/3 (predictions): Error: $ operator is invalid for atomic vectors
#> x Bootstrap3: model 3/3 (predictions): Error: $ operator is invalid for atomic vectors
#> x Bootstrap4: model 1/3 (predictions): Error: $ operator is invalid for atomic vectors
#> x Bootstrap4: model 2/3 (predictions): Error: $ operator is invalid for atomic vectors
#> x Bootstrap4: model 3/3 (predictions): Error: $ operator is invalid for atomic vectors
#> x Bootstrap5: model 1/3 (predictions): Error: $ operator is invalid for atomic vectors
#> x Bootstrap5: model 2/3 (predictions): Error: $ operator is invalid for atomic vectors
#> x Bootstrap5: model 3/3 (predictions): Error: $ operator is invalid for atomic vectors
#> Warning: All models failed in tune_grid(). See the `.notes` column.
#> # Bootstrap sampling 
#> # A tibble: 5 x 4
#>   splits            id         .metrics .notes          
#>   <list>            <chr>      <list>   <list>          
#> 1 <split [351/120]> Bootstrap1 <NULL>   <tibble [3 × 1]>
#> 2 <split [351/126]> Bootstrap2 <NULL>   <tibble [3 × 1]>
#> 3 <split [351/132]> Bootstrap3 <NULL>   <tibble [3 × 1]>
#> 4 <split [351/136]> Bootstrap4 <NULL>   <tibble [3 × 1]>
#> 5 <split [351/124]> Bootstrap5 <NULL>   <tibble [3 × 1]>

# Define tune for both arguments
svm_mod <-
  svm_rbf(cost = tune(), rbf_sigma = tune()) %>%
  set_mode("classification") %>%
  set_engine("kernlab")

# Runs fine
tune_grid(svm_mod, iono_rec, resample, grid = 3)
#> # Bootstrap sampling 
#> # A tibble: 5 x 4
#>   splits            id         .metrics         .notes          
#>   <list>            <chr>      <list>           <list>          
#> 1 <split [351/120]> Bootstrap1 <tibble [6 × 5]> <tibble [0 × 1]>
#> 2 <split [351/126]> Bootstrap2 <tibble [6 × 5]> <tibble [0 × 1]>
#> 3 <split [351/132]> Bootstrap3 <tibble [6 × 5]> <tibble [0 × 1]>
#> 4 <split [351/136]> Bootstrap4 <tibble [6 × 5]> <tibble [0 × 1]>
#> 5 <split [351/124]> Bootstrap5 <tibble [6 × 5]> <tibble [0 × 1]>

topepo commented May 26, 2020

It is always better to start a new issue when you have an error (and reference this issue). We are unlikely to look at closed issues.

In this particular case, you are using a negative value for the scale parameter in the kernel function (rbf_sigma = -0.25). If you set it to a value > 0, it will run fine.
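One way to see why a negative value is a problem, assuming the kernel form exp(-sigma * ||x - y||^2) that kernlab documents for its RBF kernel. The `rbf_kernel()` helper below is written just for this illustration and is not part of any package.

```r
# Illustrative helper: RBF kernel between two numeric vectors.
rbf_kernel <- function(x, y, sigma) {
  exp(-sigma * sum((x - y)^2))
}

x <- c(0, 0)
y <- c(3, 4)  # squared distance from x is 25

# With a positive sigma the similarity decays towards 0 with distance.
k_pos <- rbf_kernel(x, y, sigma = 0.25)

# With a negative sigma the "similarity" grows with distance and exceeds 1,
# so it no longer behaves like a kernel.
k_neg <- rbf_kernel(x, y, sigma = -0.25)

c(positive = k_pos, negative = k_neg)
```

In the reprex above, that would mean replacing rbf_sigma = -0.25 with a positive value such as 0.25, or tuning it with tune() as in the second model.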
