Question and documentation #13
Comments
Hi, thanks for your questions and comment! Here are my answers in order:

Best regards,
Can you clarify predict_p and predict_set? If predict_p outputs [0.46552707, 0.04407598], as in your example, shouldn't the predict_set output be [0, 1] (in your example it is [1, 0])? I assume predict_set is predicting the class labels, and thus the second class, with a p-value of 0.04407598, falls below the threshold at the 95% confidence level? Thank you.
predict_set provides the labels that cannot be rejected at the chosen confidence level; 1 indicates the presence of the corresponding label in the prediction set, i.e. it has not been rejected.

Best regards,
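As an illustration of that rule (not code from the package), here is a minimal NumPy sketch, assuming the usual significance threshold of 1 − confidence:

```python
import numpy as np

# p-values for the two class labels, taken from the example in the thread
p_values = np.array([[0.46552707, 0.04407598]])

confidence = 0.95
threshold = 1 - confidence  # 0.05 significance level (assumed convention)

# A label stays in the prediction set when its p-value exceeds the threshold,
# i.e. the label cannot be rejected at the chosen confidence level.
prediction_set = (p_values > threshold).astype(int)

print(prediction_set)  # [[1 0]] -> only the first label is in the set
```

With the p-values above and a 95% confidence level, only the first label exceeds the 0.05 threshold, which reproduces the [1, 0] output discussed here.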
@henrikbostrom thanks for your answer. Where I come from, 'calibrating a model in probabilities' means changing the output of the model so that it matches historical probabilities (think of a probability calibration curve, usually handled with methods like isotonic regression). Maybe this is a cultural thing... as I understand it, it can be translated into p-value calibration. It seemed to me that conformal prediction would allow something like this. At least the Venn-Abers approach seems to offer something similar, depending on the metric used (see discussion). I was wondering 1) whether there is a similar approach here to build an optimal prediction that accounts for the calculated p-values, and 2) whether this would depend on the metric used, as in the Venn-Abers approach.

Regarding my second question, I was trying to evaluate the performance gain of the calibration process, and tried to compare random forest performance with and without the wrapper. As the predict_proba method is not modified in the sense I expected, the experiment is a bit void (results depend on the seeds, and the vanilla random forest has access to the whole training data).
Thanks for the clarification! Venn-Abers predictors would indeed be a natural choice for obtaining calibrated class probabilities; these are not (currently) implemented in the package. When evaluating the output of predict_proba (which is not affected by the calibrate method), one would indeed expect better performance from fitting on the full training set rather than only the proper training set.

Best regards,
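For the probability-calibration sense raised in the question (which, per the reply above, is not what the calibrate method does, and Venn-Abers is not currently in the package), a hedged sketch using scikit-learn's isotonic calibration illustrates the distinction; the dataset and variable names here are placeholders, not part of the package:

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Placeholder data and split; any binary classification task would do here.
X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Probability calibration in the sense described in the question: the model's
# scores are remapped (here with isotonic regression) so that predict_proba
# better matches observed frequencies.
calibrated_rf = CalibratedClassifierCV(
    RandomForestClassifier(random_state=0), method="isotonic", cv=5
)
calibrated_rf.fit(X_train, y_train)

print(calibrated_rf.predict_proba(X_test[:1]))
```

Here predict_proba itself is remapped, whereas in the thread above the calibrate method leaves predict_proba untouched and instead affects the p-values and prediction sets.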
I will move this thread to "Discussions" (the proposed documentation change has been fixed).
I have some relatively beginner questions after trying the intro code:
for the code to work out of the box.