run_ml does not support "training_frac = 1.0" to get best hyperparameter value #345

Open
joannacolovas opened this issue Aug 28, 2024 · 0 comments

run_ml does not support the argument "training_frac = 1.0" and gives the following error message:
Error in check_training_frac(training_frac) :
training_frac must be a numeric between 0 and 1.
You provided: 1

I am setting training_frac = 1.0 with a random forest model so that there is no test set and all of the data is used to find the best mtry hyperparameter value.

The function "check_training_frac" only accepts values in the open interval (0, 1), so exactly 1 is rejected (mikropml/R/checks.R, lines 131-140):

check_training_frac <- function(frac) {
  if (!is.numeric(frac) | (frac <= 0 | frac >= 1)) {
    stop(paste0(
      "`training_frac` must be a numeric between 0 and 1.\n",
      "    You provided: ", frac
    ))
  } else if (frac < 0.5) {
    warning("`training_frac` is less than 0.5. The training set will be smaller than the testing set.")
  }
}
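
One possible change, as a minimal sketch only: make the upper bound inclusive so that a value of exactly 1 passes validation. The function name check_training_frac_inclusive and the wording of its messages are hypothetical and not part of mikropml. The extra warning when the fraction is 1 is my own suggestion. I have not checked whether the rest of run_ml (the data split, or any step that expects a non-empty test set) works with training_frac = 1, so this covers the validation step alone.

```r
# Hypothetical variant of check_training_frac(); NOT part of mikropml.
# Accepts values in (0, 1], so training_frac = 1 passes validation.
check_training_frac_inclusive <- function(frac) {
  # `||` short-circuits, so the numeric comparisons are never evaluated
  # for non-numeric, non-scalar, or NA input.
  if (!is.numeric(frac) || length(frac) != 1 || is.na(frac) ||
      frac <= 0 || frac > 1) {
    stop(paste0(
      "`training_frac` must be a numeric greater than 0 and at most 1.\n",
      "    You provided: ", frac
    ))
  } else if (frac == 1) {
    # Assumption: an empty test set is acceptable but worth flagging.
    warning("`training_frac` is 1. There will be no testing set.")
  } else if (frac < 0.5) {
    warning("`training_frac` is less than 0.5. The training set will be smaller than the testing set.")
  }
  invisible(frac)
}
```

With this sketch, check_training_frac_inclusive(1) raises a warning rather than an error, while values such as 0 or 1.5 still stop with an error.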

Reproducible example

  # random forest model: set training_frac to 1.0 so there is no test set, to find the best mtry value
  run_ml(otu_small, 
         "rf",
         outcome_colname = "dx",
         training_frac = 1.0,
         seed = 5, 
         kfold = 5, 
         cv_times = 100, 
         hyperparameters = list(mtry =  c(84, 100, 150, 200, 300)), 
         calculate_performance = FALSE)