
FeatureImp + mlr3 Learner that predicts probabilities does not work #134

Open
giuseppec opened this issue Jul 15, 2020 · 9 comments

giuseppec commented Jul 15, 2020

If I want to compute the importance with a loss measure based on probabilities (e.g., the Brier score), FeatureImp is never calculated on the probabilities, even if I manually supply a predict.function:

library("mlr3")
library("iml")
credit.task = tsk("german_credit")
lrn = lrn("classif.rpart", predict_type = "prob")
model = lrn$train(credit.task)
data = credit.task$data()

# write a measure that just prints the `predicted` values that will be used to calculate the measure
measure_print_predicted = function(actual, predicted) {
  cat(head(predicted)) # have a look at what predicted looks like
}

pred = Predictor$new(model, data = data, y = "credit_risk")
imp = FeatureImp$new(pred, loss = measure_print_predicted, n.repetitions = 1)
# 1 2 1 2 2 1

It seems that internally the class is converted to numeric values (1 and 2), which makes it impossible to compute measures based on probabilities. I then tried to directly use a manually written predict.function, which also did not work:

# use a manually written predict function that returns probabilities
predict_good_prob = function(model, newdata) predict(model, newdata, predict_type = "prob")[, "good"]
head(predict_good_prob(model, data))

# use this predict function for IML method
pred = Predictor$new(model, data = data, y = "credit_risk", predict.function = predict_good_prob)
imp = FeatureImp$new(pred, loss = measure_print_predicted, n.repetitions = 1)
# 1 2 1 2 2 1
giuseppec added the bug label Jul 15, 2020
giuseppec commented Jul 28, 2020

Ah, I need to define the positive class. I could solve the issue using

pred = Predictor$new(model, data = data, y = "credit_risk", class = "good")

Maybe printing an error message could help here?
EDIT: there seem to be further issues, see #134 (comment) below.

pat-s commented Jul 29, 2020

Thanks for reporting!

The underlying issue is that we do not have information about the task type in Predictor$new() and in all subsequent steps, since only the model is supplied by the user.

I've tried to partially address this in #137 by checking whether the supplied learner has the attributes of a classification learner (a sketch of that kind of check follows below). For now I only did so for mlr3 learners.

This should probably be tackled in a more robust fashion, but it does the job for now.
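
A minimal sketch of that kind of check (illustrative only, not the actual code from #137): mlr3 Learner objects carry the task type and the predict type as fields, so these can be inspected whenever the supplied model is an mlr3 learner.

# illustrative only -- not the actual implementation in #137
if (requireNamespace("mlr3", quietly = TRUE) && inherits(model, "Learner")) {
  model$task_type    # "classif" or "regr"
  model$predict_type # "response" or "prob"
}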

pat-s commented Jul 29, 2020

I was wrong: class should not be used at all with binary (0/1) classification tasks, regardless of whether predict_type is prob or response. It is meant for multiclass tasks.

What is correct is that {iml} by default does not know about the predict_type.
The user can pass it via the type argument when creating the Predictor.

Also, your loss function does not seem suited here? With the default "ce", your example runs fine for me:

    library("mlr3")
    library("iml")
    credit.task = tsk("german_credit")
    lrn = lrn("classif.rpart", predict_type = "prob")
    model = lrn$train(credit.task)
    data = credit.task$data()

    pred = Predictor$new(model, data = data, y = "credit_risk", type = "prob")
    FeatureImp$new(pred, loss = "ce", n.repetitions = 1)
    #> Interpretation method:  FeatureImp 
    #> error function: ce
    #> 
    #> Analysed predictor: 
    #> Prediction task: classification 
    #> Classes:  
    #> 
    #> Analysed data:
    #> Sampling from data.frame with 1000 rows and 20 columns.
    #> 
    #> Head of results:
    #>          feature importance.05 importance importance.95 permutation.error
    #> 1         status      1.517241   1.517241      1.517241             0.308
    #> 2       duration      1.295567   1.295567      1.295567             0.263
    #> 3         amount      1.157635   1.157635      1.157635             0.235
    #> 4 credit_history      1.137931   1.137931      1.137931             0.231
    #> 5        purpose      1.098522   1.098522      1.098522             0.223
    #> 6        savings      1.098522   1.098522      1.098522             0.223

Created on 2020-07-29 by the [reprex package](https://reprex.tidyverse.org) (v0.3.0.9001)

giuseppec commented Jul 30, 2020

I see, type = "prob" makes sure that probabilities are used, but actual will still be a factor? To compute the ce, probabilities are of course not necessary. Your code won't work if you use loss = "mse" (which should be equivalent to the Brier score when applied to probabilities), and it also fails with loss = "logLoss", because actual is a factor?

Example:

library("mlr3")
library("iml")
credit.task = tsk("german_credit")
lrn = lrn("classif.rpart", predict_type = "prob")
model = lrn$train(credit.task)
data = credit.task$data()

brier = function(actual, predicted) {
  sum((actual - predicted)^2)
}

pred = Predictor$new(model, data = data, y = "credit_risk", type = "prob", class = "good")
FeatureImp$new(pred, loss = brier, n.repetitions = 1)

## Error in if (self$original.error == 0 & self$compare == "ratio") { : 
##  missing value where TRUE/FALSE needed
## In addition: Warning message:
## In Ops.factor(actual, predicted) : ‘-’ not meaningful for factors

To understand why this happens, let's look at the values of actual and predicted that a measure has access to:

library("mlr3")
library("iml")
credit.task = tsk("german_credit")
lrn = lrn("classif.rpart", predict_type = "prob")
model = lrn$train(credit.task)
data = credit.task$data()

measure_print = function(actual, predicted) {
  cat(head(actual), fill = T)
  cat(head(predicted), fill = T)
}

pred = Predictor$new(model, data = data, y = "credit_risk", type = "prob", class = "good")
FeatureImp$new(pred, loss = measure_print, n.repetitions = 1)

## 1 2 1 1 2 1
## 0.8767123 0.1388889 0.868709 0.379562 0.379562 0.868709
## Error in if (self$original.error == 0 & self$compare == "ratio") { : 
##  argument is of length zero

# PS: if I don't use class = "good", the value of "actual" passed to the measure is still a factor:
pred = Predictor$new(model, data = data, y = "credit_risk", type = "prob")
FeatureImp$new(pred, loss = measure_print, n.repetitions = 1)

## 1 2 1 1 2 1
## 1 2 1 2 2 1
## Error in if (self$original.error == 0 & self$compare == "ratio") { : 
##  argument is of length zero

giuseppec added bug and removed enhancement labels Jul 30, 2020
pat-s commented Jul 30, 2020

Thanks. Yes, something bad is happening internally.

But in any case, class should not be needed here (for non-multiclass tasks) and might just happen to (partly) help here by chance.

giuseppec commented

Maybe if iml allowed the loss argument to also be an mlr3 measure (only when model is an mlr3 model), we might get rid of these bugs, because these things are already implemented in mlr3 anyway? See the sketch below.
Now that you mention multiclass, I can already foresee further issues (e.g., how do I compute the importance with multiclass AUC? At first glance, it does not seem possible to define a loss function that has access to the probabilities of all classes at the same time; anyway, this would be a separate issue).
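
For illustration, such a bridge could look roughly like this (the helper as_iml_loss is made up, not existing iml or mlr3 functionality; it also assumes that actual reaches the loss as the original factor and predicted as the positive-class probability, which is exactly what is currently not guaranteed):

library(mlr3measures)

# hypothetical helper: wrap mlr3measures::bbrier() into iml's loss(actual, predicted) signature
as_iml_loss = function(positive) {
  function(actual, predicted) {
    # assumes `actual` is the original factor and `predicted` the probability of `positive`
    bbrier(truth = actual, prob = predicted, positive = positive)
  }
}

# imp = FeatureImp$new(pred, loss = as_iml_loss("good"), n.repetitions = 1)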

christophM commented

Thanks for addressing this.

It is quite difficult to capture the many possible losses and also the different types of outcomes (regression, binary classification, multiclass, probabilities).
For the actual value of the prediction, the raw y is taken as provided. Currently, if users want something other than just the factor (probabilities instead), they have to provide y in a different form in Predictor$new().
So for the Brier score example this would be:

library("mlr3")
library("iml")
credit.task = tsk("german_credit")
lrn = lrn("classif.rpart", predict_type = "prob")
model = lrn$train(credit.task)
data = credit.task$data()

brier = function(actual, predicted) {
  sum((actual - predicted)^2)
}

y = 1 * (credit.task$data()$credit_risk == "good")

pred = Predictor$new(model, data = data, y = y, type = "prob", class = "good")
FeatureImp$new(pred, loss = brier, n.repetitions = 1)
                                                            

christophM commented

I am not sure how to improve the situation while still allowing very general settings for the loss function.
Especially for user-provided loss functions, we have no way of knowing what the actual target has to look like (factor, 0/1 coding, multi-class 0/1, ...).

Maybe having some more examples in the help file would have helped?

giuseppec commented Aug 10, 2020

My problem was that I passed my own predict.function, which was ignored completely:

library("mlr3")
library("iml")
credit.task = tsk("german_credit")
lrn = lrn("classif.rpart", predict_type = "prob")
model = lrn$train(credit.task)
data = credit.task$data()

# print actual and predicted
measure_print = function(actual, predicted) {
  cat(head(actual), fill = T)
  cat(head(predicted), fill = T)
}

# use a manually written predict function that returns probabilities
predict_good_prob = function(model, newdata) predict(model, newdata, predict_type = "prob")[, "good"]
head(predict_good_prob(model, data))
# [1] 0.8767123 0.1388889 0.8687090 0.3795620 0.3795620 0.8687090

pred = Predictor$new(model, data = data, y = "credit_risk", predict.function = predict_good_prob)
imp = FeatureImp$new(pred, loss = measure_print, n.repetitions = 1)
# 1 2 1 1 2 1
# 1 2 1 2 2 1
# Error in if (self$original.error == 0 & self$compare == "ratio") { : 
#  argument is of length zero

Usually, the user knows what actual looks like (it's just the target column of the data), right?
The user should also know what the output of the predict function looks like (yes, different models sometimes output different things here: sometimes matrices, sometimes vectors, etc.).
In the code above, I was surprised that my self-written predict function was ignored, since the measure function seems to have access only to the factor (see output above). That is, I had no way to access the probabilities of the class "good" or of "bad" in the measure function.

In the multiclass case, I could have written a predict function that returns the matrix of probabilities for all classes, e.g.:

predict_good_prob = function(model, newdata) predict(model, newdata, predict_type = "prob")
head(predict_good_prob(model, data))
#           good       bad
# [1,] 0.8767123 0.1232877
# [2,] 0.1388889 0.8611111
# [3,] 0.8687090 0.1312910
pred = Predictor$new(model, data = data, y = "credit_risk", predict.function = predict_good_prob)

Then, I'd expect to be able to reuse this matrix of probabilities in the measure:

# access the class probabilities inside the measure
my_cool_measure = function(actual, predicted) {
  class1 = predicted[, "good"]
  class2 = predicted[, "bad"]
  # do some cool computations with the probabilities of each class
}
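
For example (an illustrative sketch, assuming actual arrives as the original factor whose levels match the column names of predicted), a multiclass log loss could be computed from that matrix:

multiclass_logloss = function(actual, predicted) {
  eps = 1e-15
  # pick the predicted probability of the observed class in each row
  idx = cbind(seq_along(actual), match(as.character(actual), colnames(predicted)))
  -mean(log(pmax(predicted[idx], eps)))
}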
