Commit
Merge pull request #23 from pat-alt/develop
Develop
pat-alt authored Oct 24, 2022
2 parents 7834ffb + b8cd444 commit 98c54a6
Showing 4 changed files with 66 additions and 51 deletions.
10 changes: 5 additions & 5 deletions src/ConformalModels/inductive_classification.jl
@@ -18,10 +18,10 @@ end
For the [`SimpleInductiveClassifier`](@ref) nonconformity scores are computed as follows:
``
-S_i = s(X_i, Y_i) = h(X_i, Y_i), \ i \in \mathcal{D}_{\text{calibration}}
+S_i^{\text{CAL}} = s(X_i, Y_i) = h(\hat\mu(X_i), Y_i), \ i \in \mathcal{D}_{\text{calibration}}
``
-A typical choice for the heuristic function is ``h(X_i,Y_i)=1-\hat\mu(X_i)_{Y_i}`` where ``\hat\mu(X_i)_{Y_i}`` denotes the softmax output of the true class and ``\hat\mu`` denotes the model fitted on training data ``\mathcal{D}_{\text{train}}``. The simple approach only takes the softmax probability of the true label into account.
+A typical choice for the heuristic function is ``h(\hat\mu(X_i), Y_i)=1-\hat\mu(X_i)_{Y_i}`` where ``\hat\mu(X_i)_{Y_i}`` denotes the softmax output of the true class and ``\hat\mu`` denotes the model fitted on training data ``\mathcal{D}_{\text{train}}``. The simple approach only takes the softmax probability of the true label into account.
"""
function MMI.fit(conf_model::SimpleInductiveClassifier, verbosity, X, y)
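As a minimal sketch of the score computation above (toy softmax outputs and labels; not the package's actual implementation), the calibration scores can be formed as:

```julia
# Toy softmax outputs: one row per calibration instance, one column per class
probs = [0.7 0.2 0.1;
         0.1 0.8 0.1;
         0.3 0.3 0.4]
y_cal = [1, 2, 3]   # true class indices

# S_i = 1 - μ̂(X_i)_{Y_i}: one minus the softmax probability of the true class
scores = [1 - probs[i, y_cal[i]] for i in eachindex(y_cal)]   # [0.3, 0.2, 0.6]
```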

@@ -48,7 +48,7 @@ end
For the [`SimpleInductiveClassifier`](@ref) prediction sets are computed as follows,
``
-\hat{C}_{n,\alpha}(X_{n+1}) = \left\{y: s(X_{n+1},y) \le \hat{q}_{n, \alpha}^{+} \{S_i\} \right\}, \ i \in \mathcal{D}_{\text{calibration}}
+\hat{C}_{n,\alpha}(X_{n+1}) = \left\{y: s(X_{n+1},y) \le \hat{q}_{n, \alpha}^{+} \{S_i^{\text{CAL}}\} \right\}, \ i \in \mathcal{D}_{\text{calibration}}
``
where ``\mathcal{D}_{\text{calibration}}`` denotes the designated calibration data.
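A hedged sketch of how such a set could be assembled, assuming the standard split-conformal quantile correction ``\lceil (n+1)(1-\alpha) \rceil / n`` (the toy scores and `Statistics.quantile` stand in for the package's internals):

```julia
using Statistics

scores = [0.3, 0.2, 0.6, 0.1, 0.4]   # calibration scores S_i
α, n = 0.2, length(scores)
q̂ = quantile(scores, min(1.0, ceil((n + 1) * (1 - α)) / n))   # q̂⁺_{n,α}

p_new = [0.5, 0.3, 0.2]   # softmax output for X_{n+1}
Ĉ = [y for y in eachindex(p_new) if 1 - p_new[y] <= q̂]   # prediction set
```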
@@ -83,7 +83,7 @@ end
For the [`AdaptiveInductiveClassifier`](@ref) nonconformity scores are computed by cumulatively summing the ranked scores of each label in descending order until reaching the true label ``Y_i``:
``
-S_i = s(X_i,Y_i) = \sum_{j=1}^k \hat\mu(X_i)_{\pi_j} \ \text{where } \ Y_i=\pi_k, i \in \mathcal{D}_{\text{calibration}}
+S_i^{\text{CAL}} = s(X_i,Y_i) = \sum_{j=1}^k \hat\mu(X_i)_{\pi_j} \ \text{where } \ Y_i=\pi_k, i \in \mathcal{D}_{\text{calibration}}
``
"""
function MMI.fit(conf_model::AdaptiveInductiveClassifier, verbosity, X, y)
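A minimal sketch of this cumulative-sum score (toy inputs; the helper `adaptive_score` is illustrative, not part of the package):

```julia
# Sort the softmax outputs in descending order and cumulate until the true
# label Y_i is reached: S_i = Σ_{j=1}^k μ̂(X_i)_{π_j} with Y_i = π_k
function adaptive_score(p::AbstractVector, y::Int)
    π = sortperm(p, rev=true)     # label ranking π_1, ..., π_K
    k = findfirst(==(y), π)       # position of the true label
    return sum(p[π[1:k]])
end

adaptive_score([0.5, 0.3, 0.2], 2)   # 0.5 + 0.3 = 0.8
```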
@@ -119,7 +119,7 @@ end
For the [`AdaptiveInductiveClassifier`](@ref) prediction sets are computed as follows,
``
-\hat{C}_{n,\alpha}(X_{n+1}) = \left\{y: s(X_{n+1},y) \le \hat{q}_{n, \alpha}^{+} \{S_i\} \right\}, i \in \mathcal{D}_{\text{calibration}}
+\hat{C}_{n,\alpha}(X_{n+1}) = \left\{y: s(X_{n+1},y) \le \hat{q}_{n, \alpha}^{+} \{S_i^{\text{CAL}}\} \right\}, i \in \mathcal{D}_{\text{calibration}}
``
where ``\mathcal{D}_{\text{calibration}}`` denotes the designated calibration data.
6 changes: 3 additions & 3 deletions src/ConformalModels/inductive_regression.jl
@@ -17,10 +17,10 @@ end
For the [`SimpleInductiveRegressor`](@ref) nonconformity scores are computed as follows:
``
-S_i = s(X_i, Y_i) = h(X_i, Y_i), \ i \in \mathcal{D}_{\text{calibration}}
+S_i^{\text{CAL}} = s(X_i, Y_i) = h(\hat\mu(X_i), Y_i), \ i \in \mathcal{D}_{\text{calibration}}
``
-A typical choice for the heuristic function is ``h(X_i,Y_i)=|Y_i-\hat\mu(X_i)|`` where ``\hat\mu`` denotes the model fitted on training data ``\mathcal{D}_{\text{train}}``.
+A typical choice for the heuristic function is ``h(\hat\mu(X_i),Y_i)=|Y_i-\hat\mu(X_i)|`` where ``\hat\mu`` denotes the model fitted on training data ``\mathcal{D}_{\text{train}}``.
"""
function MMI.fit(conf_model::SimpleInductiveRegressor, verbosity, X, y)
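A minimal sketch of the absolute-residual score (a toy fitted model stands in for the wrapped MLJ model):

```julia
μ̂(x) = 2x + 1                        # toy fitted model
X_cal, y_cal = [1.0, 2.0, 3.0], [3.5, 4.8, 7.4]

scores = abs.(y_cal .- μ̂.(X_cal))    # S_i = |Y_i - μ̂(X_i)|, here [0.5, 0.2, 0.4]
```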

@@ -48,7 +48,7 @@ end
For the [`SimpleInductiveRegressor`](@ref) prediction intervals are computed as follows,
``
-\hat{C}_{n,\alpha}(X_{n+1}) = \hat\mu(X_{n+1}) \pm \hat{q}_{n, \alpha}^{+} \{S_i \}, \ i \in \mathcal{D}_{\text{calibration}}
+\hat{C}_{n,\alpha}(X_{n+1}) = \hat\mu(X_{n+1}) \pm \hat{q}_{n, \alpha}^{+} \{S_i^{\text{CAL}} \}, \ i \in \mathcal{D}_{\text{calibration}}
``
where ``\mathcal{D}_{\text{calibration}}`` denotes the designated calibration data.
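Continuing the toy example, a hedged sketch of the resulting interval, again assuming the ``\lceil (n+1)(1-\alpha) \rceil / n`` quantile correction:

```julia
using Statistics

μ̂(x) = 2x + 1                         # toy fitted model, as above
scores = [0.5, 0.2, 0.4]              # calibration scores S_i
α, n = 0.1, length(scores)
q̂ = quantile(scores, min(1.0, ceil((n + 1) * (1 - α)) / n))

x_new = 4.0
lo, hi = μ̂(x_new) - q̂, μ̂(x_new) + q̂   # Ĉ = μ̂(X_{n+1}) ± q̂
```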
10 changes: 8 additions & 2 deletions src/ConformalModels/transductive_classification.jl
@@ -14,7 +14,13 @@ end
@doc raw"""
MMI.fit(conf_model::NaiveClassifier, verbosity, X, y)
-Wrapper function to fit the underlying MLJ model.
+For the [`NaiveClassifier`](@ref) nonconformity scores are computed in-sample as follows:
+
+``
+S_i^{\text{IS}} = s(X_i, Y_i) = h(\hat\mu(X_i), Y_i), \ i \in \mathcal{D}_{\text{train}}
+``
+
+A typical choice for the heuristic function is ``h(\hat\mu(X_i), Y_i)=1-\hat\mu(X_i)_{Y_i}`` where ``\hat\mu(X_i)_{Y_i}`` denotes the softmax output of the true class and ``\hat\mu`` denotes the model fitted on training data ``\mathcal{D}_{\text{train}}``.
"""
function MMI.fit(conf_model::NaiveClassifier, verbosity, X, y)

@@ -35,7 +41,7 @@ end
For the [`NaiveClassifier`](@ref) prediction sets are computed as follows:
``
-\hat{C}_{n,\alpha}(X_{n+1}) = \left\{y: s(X_{n+1},y) \le \hat{q}_{n, \alpha}^{+} \{1 - \hat\mu(X_i) \} \right\}, \ i \in \mathcal{D}_{\text{train}}
+\hat{C}_{n,\alpha}(X_{n+1}) = \left\{y: s(X_{n+1},y) \le \hat{q}_{n, \alpha}^{+} \{S_i^{\text{IS}} \} \right\}, \ i \in \mathcal{D}_{\text{train}}
``
The naive approach typically produces prediction regions that undercover due to overfitting.
91 changes: 50 additions & 41 deletions src/ConformalModels/transductive_regression.jl
@@ -19,13 +19,13 @@ end
@doc raw"""
MMI.fit(conf_model::NaiveRegressor, verbosity, X, y)
-For the [`NaiveRegressor`](@ref) nonconformity scores are computed as follows:
+For the [`NaiveRegressor`](@ref) nonconformity scores are computed in-sample as follows:
``
-S_i = s(X_i, Y_i) = h(X_i, Y_i), \ i \in \mathcal{D}_{\text{train}}
+S_i^{\text{IS}} = s(X_i, Y_i) = h(\hat\mu(X_i), Y_i), \ i \in \mathcal{D}_{\text{train}}
``
-A typical choice for the heuristic function is ``h(X_i,Y_i)=|Y_i-\hat\mu(X_i)|`` where ``\hat\mu`` denotes the model fitted on training data ``\mathcal{D}_{\text{train}}``.
+A typical choice for the heuristic function is ``h(\hat\mu(X_i),Y_i)=|Y_i-\hat\mu(X_i)|`` where ``\hat\mu`` denotes the model fitted on training data ``\mathcal{D}_{\text{train}}``.
"""
function MMI.fit(conf_model::NaiveRegressor, verbosity, X, y)

@@ -47,7 +47,7 @@ end
For the [`NaiveRegressor`](@ref) prediction intervals are computed as follows:
``
-\hat{C}_{n,\alpha}(X_{n+1}) = \hat\mu(X_{n+1}) \pm \hat{q}_{n, \alpha}^{+} \{S_i \}, \ i \in \mathcal{D}_{\text{train}}
+\hat{C}_{n,\alpha}(X_{n+1}) = \hat\mu(X_{n+1}) \pm \hat{q}_{n, \alpha}^{+} \{S_i^{\text{IS}} \}, \ i \in \mathcal{D}_{\text{train}}
``
The naive approach typically produces prediction regions that undercover due to overfitting.
@@ -76,7 +76,14 @@ end
@doc raw"""
MMI.fit(conf_model::JackknifeRegressor, verbosity, X, y)
-Wrapper function to fit the underlying MLJ model.
+For the [`JackknifeRegressor`](@ref) nonconformity scores are computed through a leave-one-out (LOO) procedure as follows,
+
+``
+S_i^{\text{LOO}} = s(X_i, Y_i) = h(\hat\mu_{-i}(X_i), Y_i), \ i \in \mathcal{D}_{\text{train}}
+``
+
+where ``\hat\mu_{-i}(X_i)`` denotes the leave-one-out prediction for ``X_i``. In other words, for each training instance ``i=1,...,n`` the model is trained on all training data excluding ``i``. The fitted model is then used to generate an out-of-sample prediction for ``X_i``. The corresponding nonconformity score is then computed by applying a heuristic uncertainty measure ``h(\cdot)`` to the fitted value ``\hat\mu_{-i}(X_i)`` and the true value ``Y_i``.
"""
function MMI.fit(conf_model::JackknifeRegressor, verbosity, X, y)
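A minimal sketch of the LOO loop (ordinary least squares stands in for the wrapped MLJ model; all names are illustrative):

```julia
X = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 2.3, 2.8, 4.2, 5.1]
n = length(y)

fit_ols(X, y) = [X ones(length(X))] \ y        # least-squares slope and intercept
predict_ols(β, x) = β[1] * x + β[2]

S_loo = map(1:n) do i
    keep = setdiff(1:n, i)                     # drop the i-th point
    β = fit_ols(X[keep], y[keep])              # refit without i
    abs(y[i] - predict_ols(β, X[i]))           # h(μ̂₋ᵢ(Xᵢ), Yᵢ) = |Yᵢ - μ̂₋ᵢ(Xᵢ)|
end
```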

@@ -108,10 +115,10 @@ end
For the [`JackknifeRegressor`](@ref) prediction intervals are computed as follows,
``
-\hat{C}_{n,\alpha}(X_{n+1}) = \hat\mu(X_{n+1}) \pm \hat{q}_{n, \alpha}^{+} \{|Y_i - \hat\mu_{-i}(X_i)|\}, \ i \in \mathcal{D}_{\text{train}}
+\hat{C}_{n,\alpha}(X_{n+1}) = \hat\mu(X_{n+1}) \pm \hat{q}_{n, \alpha}^{+} \{S_i^{\text{LOO}}\}, \ i \in \mathcal{D}_{\text{train}}
``
-where ``\hat\mu_{-i}`` denotes the model fitted on training data with the ``i``th point removed. The jackknife procedure addresses the overfitting issue associated with the [`NaiveRegressor`](@ref).
+where ``S_i^{\text{LOO}}`` denotes the nonconformity score generated as explained in [`fit(conf_model::JackknifeRegressor, verbosity, X, y)`](@ref). The jackknife procedure addresses the overfitting issue associated with the [`NaiveRegressor`](@ref).
"""
function MMI.predict(conf_model::JackknifeRegressor, fitresult, Xnew)
ŷ = MMI.predict(conf_model.model, fitresult, MMI.reformat(conf_model.model, Xnew)...)
@@ -135,9 +142,15 @@ function JackknifePlusRegressor(model::Supervised; coverage::AbstractFloat=0.95,
end

@doc raw"""
-MMI.fit(conf_model::JackknifeRegressor, verbosity, X, y)
+MMI.fit(conf_model::JackknifePlusRegressor, verbosity, X, y)

-Wrapper function to fit the underlying MLJ model.
+For the [`JackknifePlusRegressor`](@ref) nonconformity scores are computed in the same way as for the [`JackknifeRegressor`](@ref). Specifically, we have,
+
+``
+S_i^{\text{LOO}} = s(X_i, Y_i) = h(\hat\mu_{-i}(X_i), Y_i), \ i \in \mathcal{D}_{\text{train}}
+``
+
+where ``\hat\mu_{-i}(X_i)`` denotes the leave-one-out prediction for ``X_i``. In other words, for each training instance ``i=1,...,n`` the model is trained on all training data excluding ``i``. The fitted model is then used to generate an out-of-sample prediction for ``X_i``. The corresponding nonconformity score is then computed by applying a heuristic uncertainty measure ``h(\cdot)`` to the fitted value ``\hat\mu_{-i}(X_i)`` and the true value ``Y_i``.
"""
function MMI.fit(conf_model::JackknifePlusRegressor, verbosity, X, y)

@@ -174,13 +187,7 @@ end
For the [`JackknifePlusRegressor`](@ref) prediction intervals are computed as follows,
``
-\hat{C}_{n,\alpha}(X_{n+1}) = \left[ \hat{q}_{n, \alpha}^{-} \{\hat\mu_{-i}(X_{n+1}) - R_i^{\text{LOO}} \}, \hat{q}_{n, \alpha}^{+} \{\hat\mu_{-i}(X_{n+1}) + R_i^{\text{LOO}}\} \right] , i \in \mathcal{D}_{\text{train}}
+\hat{C}_{n,\alpha}(X_{n+1}) = \left[ \hat{q}_{n, \alpha}^{-} \{\hat\mu_{-i}(X_{n+1}) - S_i^{\text{LOO}} \}, \hat{q}_{n, \alpha}^{+} \{\hat\mu_{-i}(X_{n+1}) + S_i^{\text{LOO}}\} \right] , i \in \mathcal{D}_{\text{train}}
-``
-
-with
-
-``
-R_i^{\text{LOO}}=|Y_i - \hat\mu_{-i}(X_i)|
``
where ``\hat\mu_{-i}`` denotes the model fitted on training data with the ``i``th point removed. The jackknife``+`` procedure is more stable than the [`JackknifeRegressor`](@ref).
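A hedged sketch of this interval, assuming LOO predictions `μ_loo` at ``X_{n+1}`` and scores `S_loo` from a loop like the one sketched for the fit step, and the usual finite-sample quantile corrections ``\lfloor (n+1)\alpha \rfloor / n`` and ``\lceil (n+1)(1-\alpha) \rceil / n``:

```julia
using Statistics

μ_loo = [4.9, 5.1, 5.0, 5.2, 4.8]   # μ̂₋ᵢ(X_{n+1}), one per left-out i
S_loo = [0.3, 0.2, 0.4, 0.1, 0.3]   # LOO nonconformity scores
α, n = 0.2, length(S_loo)

lo = quantile(μ_loo .- S_loo, max(0.0, floor((n + 1) * α) / n))        # q̂⁻_{n,α}
hi = quantile(μ_loo .+ S_loo, min(1.0, ceil((n + 1) * (1 - α)) / n))   # q̂⁺_{n,α}
```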
@@ -215,7 +222,13 @@ end
@doc raw"""
MMI.fit(conf_model::JackknifeMinMaxRegressor, verbosity, X, y)
-Wrapper function to fit the underlying MLJ model.
+For the [`JackknifeMinMaxRegressor`](@ref) nonconformity scores are computed in the same way as for the [`JackknifeRegressor`](@ref). Specifically, we have,
+
+``
+S_i^{\text{LOO}} = s(X_i, Y_i) = h(\hat\mu_{-i}(X_i), Y_i), \ i \in \mathcal{D}_{\text{train}}
+``
+
+where ``\hat\mu_{-i}(X_i)`` denotes the leave-one-out prediction for ``X_i``. In other words, for each training instance ``i=1,...,n`` the model is trained on all training data excluding ``i``. The fitted model is then used to generate an out-of-sample prediction for ``X_i``. The corresponding nonconformity score is then computed by applying a heuristic uncertainty measure ``h(\cdot)`` to the fitted value ``\hat\mu_{-i}(X_i)`` and the true value ``Y_i``.
"""
function MMI.fit(conf_model::JackknifeMinMaxRegressor, verbosity, X, y)

@@ -252,13 +265,7 @@ end
For the [`JackknifeMinMaxRegressor`](@ref) prediction intervals are computed as follows,
``
-\hat{C}_{n,\alpha}(X_{n+1}) = \left[ \min_{i=1,...,n} \hat\mu_{-i}(X_{n+1}) - \hat{q}_{n, \alpha}^{+} \{R_i^{\text{LOO}} \}, \max_{i=1,...,n} \hat\mu_{-i}(X_{n+1}) + \hat{q}_{n, \alpha}^{+} \{ R_i^{\text{LOO}}\} \right] , i \in \mathcal{D}_{\text{train}}
+\hat{C}_{n,\alpha}(X_{n+1}) = \left[ \min_{i=1,...,n} \hat\mu_{-i}(X_{n+1}) - \hat{q}_{n, \alpha}^{+} \{S_i^{\text{LOO}} \}, \max_{i=1,...,n} \hat\mu_{-i}(X_{n+1}) + \hat{q}_{n, \alpha}^{+} \{S_i^{\text{LOO}}\} \right] , i \in \mathcal{D}_{\text{train}}
-``
-
-with
-
-``
-R_i^{\text{LOO}}=|Y_i - \hat\mu_{-i}(X_i)|
``
where ``\hat\mu_{-i}`` denotes the model fitted on training data with the ``i``th point removed. The jackknife-minmax procedure is more conservative than the [`JackknifePlusRegressor`](@ref).
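A hedged sketch with the same toy inputs as the jackknife+ example above; the min/max over the LOO predictions is what makes the interval more conservative:

```julia
using Statistics

μ_loo = [4.9, 5.1, 5.0, 5.2, 4.8]   # μ̂₋ᵢ(X_{n+1})
S_loo = [0.3, 0.2, 0.4, 0.1, 0.3]   # LOO nonconformity scores
α, n = 0.2, length(S_loo)
q̂ = quantile(S_loo, min(1.0, ceil((n + 1) * (1 - α)) / n))

lo, hi = minimum(μ_loo) - q̂, maximum(μ_loo) + q̂
```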
@@ -296,7 +303,13 @@ end
@doc raw"""
MMI.fit(conf_model::CVPlusRegressor, verbosity, X, y)
-Wrapper function to fit the underlying MLJ model.
+For the [`CVPlusRegressor`](@ref) nonconformity scores are computed through cross-validation (CV) as follows,
+
+``
+S_i^{\text{CV}} = s(X_i, Y_i) = h(\hat\mu_{-\mathcal{D}_{k(i)}}(X_i), Y_i), \ i \in \mathcal{D}_{\text{train}}
+``
+
+where ``\hat\mu_{-\mathcal{D}_{k(i)}}(X_i)`` denotes the CV prediction for ``X_i``. In other words, for each CV fold ``k=1,...,K`` and each training instance ``i=1,...,n`` the model is trained on all training data excluding the fold containing ``i``. The fitted model is then used to generate an out-of-sample prediction for ``X_i``. The corresponding nonconformity score is then computed by applying a heuristic uncertainty measure ``h(\cdot)`` to the fitted value ``\hat\mu_{-\mathcal{D}_{k(i)}}(X_i)`` and the true value ``Y_i``.
"""
function MMI.fit(conf_model::CVPlusRegressor, verbosity, X, y)
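A minimal sketch of the K-fold loop (OLS as a stand-in model; the round-robin fold assignment is an arbitrary illustrative choice):

```julia
X = collect(1.0:8.0)
y = [1.2, 1.9, 3.1, 4.2, 4.8, 6.1, 7.2, 7.9]
n, K = length(y), 4
fold = [mod1(i, K) for i in 1:n]               # k(i): the fold containing i

fit_ols(X, y) = [X ones(length(X))] \ y
predict_ols(β, x) = β[1] * x + β[2]

β_fold = [fit_ols(X[fold .!= k], y[fold .!= k]) for k in 1:K]   # one model per fold
S_cv = [abs(y[i] - predict_ols(β_fold[fold[i]], X[i])) for i in 1:n]
```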

@@ -338,19 +351,15 @@ end
@doc raw"""
MMI.predict(conf_model::CVPlusRegressor, fitresult, Xnew)
-For the [`CVPlusRegressor`](@ref) prediction intervals are computed as follows,
+For the [`CVPlusRegressor`](@ref) prediction intervals are computed in much the same way as for the [`JackknifePlusRegressor`](@ref). Specifically, we have,

``
-\hat{C}_{n,\alpha}(X_{n+1}) = \left[ \hat{q}_{n, \alpha}^{-} \{\hat\mu_{-\mathcal{D}_{k(i)}}(X_{n+1}) - R_i^{\text{CV}} \}, \hat{q}_{n, \alpha}^{+} \{\hat\mu_{-\mathcal{D}_{k(i)}}(X_{n+1}) + R_i^{\text{CV}}\} \right] , \ i \in \mathcal{D}_{\text{train}}
+\hat{C}_{n,\alpha}(X_{n+1}) = \left[ \hat{q}_{n, \alpha}^{-} \{\hat\mu_{-\mathcal{D}_{k(i)}}(X_{n+1}) - S_i^{\text{CV}} \}, \hat{q}_{n, \alpha}^{+} \{\hat\mu_{-\mathcal{D}_{k(i)}}(X_{n+1}) + S_i^{\text{CV}}\} \right] , \ i \in \mathcal{D}_{\text{train}}
``

-with
-
-``
-R_i^{\text{CV}}=|Y_i - \hat\mu_{-\mathcal{D}_{k(i)}}(X_i)|
-``
-
-where ``\hat\mu_{-\mathcal{D}_{k(i)}}`` denotes the model fitted on training data with fold ``\mathcal{D}_{k(i)}`` that contains the ``i``th point removed.
+where ``\hat\mu_{-\mathcal{D}_{k(i)}}`` denotes the model fitted on training data with the subset ``\mathcal{D}_{k(i)}`` containing the ``i``th point removed.
The [`JackknifePlusRegressor`](@ref) is a special case of the [`CVPlusRegressor`](@ref) for which ``K=n``.
"""
function MMI.predict(conf_model::CVPlusRegressor, fitresult, Xnew)
# Get all LOO predictions for each Xnew:
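Continuing the CV sketch from the fit step (same hypothetical `β_fold`, `fold`, `S_cv`, and `predict_ols`), the interval could be assembled as:

```julia
using Statistics

x_new = 9.0
μ_cv = [predict_ols(β_fold[fold[i]], x_new) for i in 1:n]   # μ̂ from each instance's fold
α = 0.2
lo = quantile(μ_cv .- S_cv, max(0.0, floor((n + 1) * α) / n))
hi = quantile(μ_cv .+ S_cv, min(1.0, ceil((n + 1) * (1 - α)) / n))
```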
@@ -387,7 +396,13 @@ end
@doc raw"""
MMI.fit(conf_model::CVMinMaxRegressor, verbosity, X, y)
-Wrapper function to fit the underlying MLJ model.
+For the [`CVMinMaxRegressor`](@ref) nonconformity scores are computed in the same way as for the [`CVPlusRegressor`](@ref). Specifically, we have,
+
+``
+S_i^{\text{CV}} = s(X_i, Y_i) = h(\hat\mu_{-\mathcal{D}_{k(i)}}(X_i), Y_i), \ i \in \mathcal{D}_{\text{train}}
+``
+
+where ``\hat\mu_{-\mathcal{D}_{k(i)}}(X_i)`` denotes the CV prediction for ``X_i``. In other words, for each CV fold ``k=1,...,K`` and each training instance ``i=1,...,n`` the model is trained on all training data excluding the fold containing ``i``. The fitted model is then used to generate an out-of-sample prediction for ``X_i``. The corresponding nonconformity score is then computed by applying a heuristic uncertainty measure ``h(\cdot)`` to the fitted value ``\hat\mu_{-\mathcal{D}_{k(i)}}(X_i)`` and the true value ``Y_i``.
"""
function MMI.fit(conf_model::CVMinMaxRegressor, verbosity, X, y)

@@ -433,13 +448,7 @@ end
For the [`CVMinMaxRegressor`](@ref) prediction intervals are computed as follows,
``
-\hat{C}_{n,\alpha}(X_{n+1}) = \left[ \min_{i=1,...,n} \hat\mu_{-\mathcal{D}_{k(i)}}(X_{n+1}) - \hat{q}_{n, \alpha}^{+} \{R_i^{\text{CV}} \}, \max_{i=1,...,n} \hat\mu_{-\mathcal{D}_{k(i)}}(X_{n+1}) + \hat{q}_{n, \alpha}^{+} \{ R_i^{\text{CV}}\} \right] , i \in \mathcal{D}_{\text{train}}
+\hat{C}_{n,\alpha}(X_{n+1}) = \left[ \min_{i=1,...,n} \hat\mu_{-\mathcal{D}_{k(i)}}(X_{n+1}) - \hat{q}_{n, \alpha}^{+} \{S_i^{\text{CV}} \}, \max_{i=1,...,n} \hat\mu_{-\mathcal{D}_{k(i)}}(X_{n+1}) + \hat{q}_{n, \alpha}^{+} \{ S_i^{\text{CV}}\} \right] , i \in \mathcal{D}_{\text{train}}
-``
-
-with
-
-``
-R_i^{\text{CV}}=|Y_i - \hat\mu_{-\mathcal{D}_{k(i)}}(X_i)|
``
where ``\hat\mu_{-\mathcal{D}_{k(i)}}`` denotes the model fitted on training data with the subset ``\mathcal{D}_{k(i)}`` containing the ``i``th point removed.
