add chapter on validation and internal tuning #829

Merged
merged 52 commits into from
Nov 7, 2024

sebffischer (Member) commented Aug 16, 2024

TODOs:


sumny (Member) commented Aug 18, 2024

minor wording suggestions (but can also be left out):

  • "where we would fit again and again with different iterations numbers." -> where we would fit the model again and again with different iterations numbers.
  • lightgbm -> LightGBM
  • catboost -> CatBoost
  • "test" to use the test set as validation data, which only works in combination with resampling and tuning." this sounds as if we would leak test data but it is just the test split from the resampling (so the validation split during HPO, correct?)
  • "we can no access this through the $model slit." no -> now; slit -> slot
  • "for training to end" -> for training to stop early
  • "By using early stopping, we were able to already terminate training 38 rounds. " -> terminate training after 38 rounds; (also training might have been performed actually longer, due to the patience?)
  • "We see that after a logloss plateaus." -> We can see that the logloss plateaus after 38 rounds.
  • "as it allows to the internal tuning of a Learner with (non-internal) hyperparameter" -> allows to perform internal tuning ...
  • "In such scenarios, what one" -> In such scenarios, one
  • "We also have to say" -> We also have to specify
  • "You can find out which ones support this feature by checking the corresponding documentation." Maybe also give one example
  • "as we specified validate =“test”. By visualizing the results we can see an inverse relationship between the two tuning parameters: a larger step size (eta) requires more boosting iterations (nrounds`)." Formatting is weird
  • "We can also prediction objects" Verb missing
  • "we its predict_sets field. " Verb missing
  • "Here can only select from those predict sets that we configured the Learner to predict on." Verb missing
  • "Because the penguins task" -> As
  • "select an evaluation metric to classification error" -> set the
  • "Then, show" -> Then, visualize (or print?)
  • "lightgbm" -> LightGBM
  • "xgboost" -> XGBoost
  • "why the code above errs" -> why the code above errors
  • "Don’t tune any other parameters than the learning rate, which is possible by using tnr("internal")" somewhat unclear. Tune nrounds internally and then only tune the learning rate?

otherwise, great job!

jemus42 (Member) commented Sep 5, 2024

I'm trying to add early stopping to the XGBoost learner in my benchmark based on this chapter, and I'm not sure whether I just misunderstand a few things or maybe the chapter could be extended in that regard.

My problem is that I'm using an AutoTuner with a given search space for tuning, but I would like to also internally use early stopping and thereby tune nrounds.

One of my naive attempts below:

library(mlr3)
library(mlr3tuning)
library(mlr3pipelines)
library(mlr3proba)
library(mlr3extralearners)

task = tsk("lung")
xgb_base = lrn("surv.xgboost.cox", 
               early_stopping_rounds = 10,
               nrounds = to_tune(upper = 1000, internal = TRUE),
               tree_method = "hist", booster = "gbtree")

xgb_glearn = po("fixfactors") %>>%
  po("imputesample", affect_columns = selector_type("factor")) %>>%
  po("encode", method = "treatment") %>>%
  po("removeconstants") %>>%
  xgb_base |>
  as_learner()

set_validate(xgb_glearn, "test")

xgb_autotuner = auto_tuner(
  learner = xgb_glearn,
  search_space = ps(
    surv.xgboost.cox.eta = p_dbl(0.001, 1, logscale = TRUE),
    surv.xgboost.cox.max_depth = p_int(1, 20),
    surv.xgboost.cox.subsample = p_dbl(0, 1),
    surv.xgboost.cox.colsample_bytree = p_dbl(0, 1),
    surv.xgboost.cox.grow_policy = p_fct(c("depthwise", "lossguide"))
  ),
  resampling = rsmp("cv", folds = 3),
  measure = msr("surv.cindex"),
  terminator = trm("evals", n_evals = 20, k = 0),
  tuner = tnr("random_search")
)

Resulting in the not unexpected error

Error in .__AutoTuner__initialize(self = self, private = private, super = super,  : 
  If the values of the ParamSet of the Learner contain TuneTokens you cannot supply a search_space.

I'm not sure how to indicate to my AutoTuner that I would like to both

  1. tune using the supplied search space using a given metric (not XGBoost's internal one)
  2. have XGBoost use early stopping for nrounds under the hood
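
The error message suggests one possible arrangement: drop `search_space` and express every tuned parameter as a `to_tune()` token on the learner, with `nrounds` marked as internal. The following is a minimal, untested sketch under that assumption. It reuses the ranges from the attempt above and assumes the installed mlr3, mlr3tuning and mlr3extralearners versions support internal tuning for `surv.xgboost.cox` inside a GraphLearner. It is not necessarily the approach the maintainers recommend.

```r
library(mlr3)
library(mlr3tuning)
library(mlr3pipelines)
library(mlr3proba)
library(mlr3extralearners)

# Assumption: all tuned parameters are declared as tokens on the learner,
# so auto_tuner() is called without a separate search_space.
xgb_base = lrn("surv.xgboost.cox",
  early_stopping_rounds = 10,
  # nrounds is tuned internally via early stopping
  nrounds = to_tune(upper = 1000, internal = TRUE),
  # the remaining parameters are tuned by the (non-internal) tuner
  eta = to_tune(0.001, 1, logscale = TRUE),
  max_depth = to_tune(1, 20),
  subsample = to_tune(0, 1),
  colsample_bytree = to_tune(0, 1),
  grow_policy = to_tune(c("depthwise", "lossguide")),
  tree_method = "hist", booster = "gbtree")

xgb_glearn = po("fixfactors") %>>%
  po("imputesample", affect_columns = selector_type("factor")) %>>%
  po("encode", method = "treatment") %>>%
  po("removeconstants") %>>%
  xgb_base |>
  as_learner()

# Use the test split of the inner resampling as validation data
set_validate(xgb_glearn, "test")

xgb_autotuner = auto_tuner(
  learner = xgb_glearn,
  resampling = rsmp("cv", folds = 3),
  measure = msr("surv.cindex"),   # outer tuning metric, not XGBoost's internal one
  terminator = trm("evals", n_evals = 20, k = 0),
  tuner = tnr("random_search")
)
```

With this layout the tuner scores configurations with `surv.cindex`, while early stopping on the validation split decides `nrounds` within each fit.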

sebffischer (Member, Author) commented Sep 6, 2024

Also, are you aware that xgboost will use the optimal model during prediction and NOT the final model?

--> You should be less worried about a too high patience parameter (except for increased runtime I guess).

jemus42 (Member) commented Sep 6, 2024

> you are accessing the final model fit but in the final model fit there is no early stopping.

Ah right, of course, makes sense 😅
I don't think I strictly need to access those, just for now I'm trying to get a feeling for how the early stopping works and behaves.
I also found xgb_autotuner$tuning_result$internal_tuned_values[[1]]$surv.xgboost.cox.nrounds by now, so that's helpful 👍🏻

> Also, are you aware that xgboost will use the optimal model during prediction and NOT the final model?

I was banking on that -- my main concern is to avoid overfitting in the benchmark, and saving some compute would be a bonus but not a must.

Thanks for the clarifications!

sebffischer merged commit 36df925 into main on Nov 7, 2024 (1 check passed)
sebffischer deleted the validation branch on November 7, 2024 11:48