More details of formula usage in mgcv engine docs when using workflow #770

qiushiyan · 2022-07-15T19:35:08Z

We need to include more details about using a GAM formula in the engine doc for gen_additive_mod(engine = "mgcv"). The engine doc only shows model-fitting examples that pass the GAM formula to fit() directly. When using a workflow with a recipe, the GAM formula needs to be declared in add_model() alongside the model spec:

# no inline function in recipe
rec <- recipe(formula = mpg ~ ., data = mtcars)
spec <- gen_additive_mod() %>% 
    set_engine("mgcv")

wf <- workflow() %>% 
    add_recipe(rec) %>%  
    add_model(spec, formula = mpg ~ wt + gear + cyl + s(disp, k = 10))  # use gam formula here
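To follow the workflow above through to a fitted model, here is a minimal sketch. It assumes the parsnip, recipes, workflows and mgcv packages are installed. The call to set_mode("regression") is an addition not in the snippet above (gen_additive_mod() has no default mode), and the names wf_fit and preds are illustrative only.

```r
library(parsnip)
library(recipes)
library(workflows)
library(mgcv)  # attached for safety; may not be strictly required

# The recipe formula only assigns roles; it carries no smooth terms
rec <- recipe(mpg ~ ., data = mtcars)

spec <- gen_additive_mod() %>%
  set_engine("mgcv") %>%
  set_mode("regression")  # assumption: mode set explicitly

# The GAM formula, including s(), goes in add_model()
wf <- workflow() %>%
  add_recipe(rec) %>%
  add_model(spec, formula = mpg ~ wt + gear + cyl + s(disp, k = 10))

# Fit on the training data, then predict on a few rows
wf_fit <- fit(wf, data = mtcars)
preds <- predict(wf_fit, new_data = head(mtcars))
```

The recipe decides which columns reach the model, while the formula given to add_model() decides how mgcv uses them.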

simonpcouch · 2022-07-26T17:55:53Z

A relevant Community post with reprex: https://community.rstudio.com/t/error-in-fit-xy-with-gam-model/143065

Steviey · 2022-08-12T09:23:29Z

+1

siavash-babaei · 2023-03-07T23:23:56Z

Assume we have a response variable, outcome, one numerical predictor, pred_num, and one categorical variable, prec_fac.

Assume the GAM formula is:

gam_formula <- "outcome ~ ." |> as.formula()

Then, you preprocess it through recipes with:

data_recipe <- recipes::recipe(
  formula = gam_formula,
  data    = data_train
) |>
  recipes::step_dummy(prec_fac)
  # Other Steps ...

# Train the recipe
data_recipe_prep <- data_recipe |>
  recipes::prep(training = data_train)

# Apply to training data
data_train_prep <- data_recipe_prep |>
  recipes::bake(new_data = NULL)

# Apply to test data
data_test_prep <- data_recipe_prep |>
  recipes::bake(new_data = data_test)

For things to work elsewhere, say in tune::tune_grid(), you need to add the following to workflows::add_model():

formula_alt = gam_formula |> terms.formula(data = data_train_prep)

So, whenever there are categorical variables in the model formula, you need to preprocess the data manually and build the model formula from the terms of the preprocessed data.
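To illustrate what the terms.formula() call does here, a small base-R sketch: with a data argument, the `.` on the right-hand side is expanded to every column other than the response, so the formula ends up naming the dummy columns created by the recipe. The data frame below is made up, and the column name prec_fac_b is only a stand-in for whatever step_dummy() would produce.

```r
# Hypothetical stand-in for the baked training data
data_train_prep <- data.frame(
  outcome    = c(1.2, 3.4, 2.2),
  pred_num   = c(0.1, 0.2, 0.3),
  prec_fac_b = c(0, 1, 0)  # illustrative dummy column
)

gam_formula <- as.formula("outcome ~ .")

# Expand "." against the columns of the preprocessed data
expanded <- terms(gam_formula, data = data_train_prep)

term_labels <- attr(expanded, "term.labels")
expanded_formula <- formula(expanded)

print(term_labels)
print(expanded_formula)
```

The expanded formula lists the post-recipe columns explicitly, which is why it differs from the formula given to the recipe.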

This change of formulae in particular is very confusing and could cause serious inconsistencies. Where do you use gam_formula versus formula_alt, and how would it affect a complex workflow? I hope this gets addressed soon.

simonpcouch · 2023-06-28T14:27:13Z

This may be a workflows or hardhat change rather than a parsnip one, but it might be worth looking for indicative input in add_formula() or add_recipe() and warning if the formula looks like it needs to be passed as a model formula but add_model(formula) is missing. This is a bit tough since the add_*() functions should be callable in either order, so maybe the check waits until fit.workflow() is triggered.
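As a rough illustration of what "indicative input" could mean, here is a base-R sketch that walks a formula and reports whether it contains a call to one of mgcv's smooth constructors (s(), te(), ti(), t2()). The helper name has_smooth_call() is hypothetical and is not part of workflows, hardhat, or parsnip. It is not how any of those packages actually performs such a check.

```r
# Hypothetical helper: TRUE if the expression contains a call to a smooth
# constructor. Checks the function position only, so a plain variable
# named `s` does not trigger it.
has_smooth_call <- function(expr, smooths = c("s", "te", "ti", "t2")) {
  if (!is.call(expr)) {
    return(FALSE)
  }
  fn <- expr[[1]]
  if (is.name(fn) && as.character(fn) %in% smooths) {
    return(TRUE)
  }
  # Recurse into the arguments of the call
  args <- as.list(expr)[-1]
  any(vapply(args, has_smooth_call, logical(1), smooths = smooths))
}

with_smooth    <- has_smooth_call(mpg ~ wt + gear + cyl + s(disp, k = 10))
without_smooth <- has_smooth_call(mpg ~ .)
variable_s     <- has_smooth_call(y ~ s + x)
```

A check like this could run at fit time, once both the preprocessor and the model are known, which sidesteps the ordering problem mentioned above.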

github-actions · 2023-11-21T00:53:07Z

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

EmilHvitfeldt added the documentation label Jul 15, 2022

simonpcouch mentioned this issue Dec 5, 2022

Recipes not available with gen_additive_mod() #849

Closed

mikemahoney218 mentioned this issue May 3, 2023

Using spatialsample in mgcv::gam tidymodels/spatialsample#139

Closed

This was referenced Nov 2, 2023

misleading error re: fit_xy() with GAMs #1014

Closed

improve docs and errors re: model formulas #1015

Merged

simonpcouch closed this as completed in #1015 Nov 6, 2023

github-actions bot locked and limited conversation to collaborators Nov 21, 2023