FeatureImp: Problem with binary output #194

snvv · 2022-07-24T15:53:45Z

The output of FeatureImp is very sensitive to the number of input variables.
For example, with 40 or 50 explanatory variables, the output of FeatureImp$new(predictor, loss = "ce") is zero. However, with 10 or 20 variables it produces sensible output.
A reproducible example:

library(randomForest)
library(mlbench)  # provides the Sonar dataset
library(caret)
library(e1071)
library(iml)      # provides Predictor and FeatureImp
# Load Dataset
data(Sonar)
dataset <- Sonar

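# 10-fold cross-validation, repeated 3 times
control <- trainControl(method='repeatedcv', 
                        number=10, 
                        repeats=3)
# Metric used to compare models is Accuracy
metric <- "Accuracy"
set.seed(123)

# Keep the first 40 explanatory variables plus the Class column (61)
dataset1=dataset[, c(1:40, 61)]

# mtry: number of variables randomly sampled at each split
# (computed from the predictor count; the original referenced an undefined x)
mtry <- floor(sqrt(ncol(dataset1) - 1))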

tunegrid <- expand.grid(.mtry=mtry)
rf_default <- caret::train(Class~., 
                    data=dataset1, 
                    method='rf', 
                    metric='Accuracy', 
                    tuneGrid=tunegrid, 
                    trControl=control)
print(rf_default)

predictor <- Predictor$new(rf_default, data = dataset1[,1:40], y = dataset1$Class)
imp <- FeatureImp$new(predictor, loss = "ce")
imp
plot(imp)

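To compare, change dataset1=dataset[, c(1:40, 61)] to dataset1=dataset[, c(1:10, 61)] (and data = dataset1[,1:40] to data = dataset1[,1:10] in the Predictor$new call), then rerun the example.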
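The comparison can also be run in one go instead of editing the script by hand. The following is only a rough sketch: run_featureimp is a hypothetical helper name (not part of iml or caret), and it assumes that imp$results holds the importance table with one row per feature.

```r
library(randomForest)
library(mlbench)
library(caret)
library(iml)

data(Sonar)

# Hypothetical helper (not part of iml or caret): fit the random forest on the
# first n_vars columns of Sonar and return the FeatureImp results table.
run_featureimp <- function(n_vars, seed = 123) {
  set.seed(seed)
  d <- Sonar[, c(seq_len(n_vars), 61)]  # column 61 is Class
  fit <- caret::train(Class ~ .,
                      data = d,
                      method = "rf",
                      metric = "Accuracy",
                      tuneGrid = expand.grid(mtry = floor(sqrt(n_vars))),
                      trControl = trainControl(method = "repeatedcv",
                                               number = 10,
                                               repeats = 3))
  predictor <- Predictor$new(fit, data = d[, seq_len(n_vars)], y = d$Class)
  FeatureImp$new(predictor, loss = "ce")$results
}

# Run the same pipeline with 10 and with 40 explanatory variables
results <- lapply(c(`10` = 10, `40` = 40), run_featureimp)
lapply(results, head)
```

Putting the two tables side by side should show whether the importance values collapse only in the 40-variable case.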
Regards
