Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX bug #462: Decoupling infinite interval production and quantile calculation #463

Merged
merged 9 commits into from
Jun 14, 2024

Conversation

thibaultcordier
Copy link
Collaborator

@thibaultcordier thibaultcordier commented Jun 11, 2024

Description

Decoupling infinite interval production and quantile calculation

Fixes #462

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

  • Test when alpha and the number of effective calibration samples are compatible
  • Test that the production of infinite intervals and the calculation of quantiles are properly decoupled

Checklist

  • I have read the contributing guidelines
  • I have updated the HISTORY.rst and AUTHORS.rst files
  • Linting passes successfully : make lint
  • Typing passes successfully : make type-check
  • Unit tests pass successfully : make tests
  • Coverage is 100% : make coverage
  • Documentation builds successfully : make doc

@thibaultcordier thibaultcordier self-assigned this Jun 11, 2024
@thibaultcordier thibaultcordier added Bug Type: bug Source: developers Proposed by developers. labels Jun 11, 2024
@thibaultcordier thibaultcordier marked this pull request as ready for review June 13, 2024 15:13
Copy link
Collaborator

@LacombeLouis LacombeLouis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @thibaultcordier,
I think the PR is great! Thank you for including the examples! This PR could be approved as is, but there are some small changes that I think could help make it clearer. Especially in the example, I don't think it's needed to add the prefit_asym part.

Thank you!



##############################################################################
# Experiment 2: Again but without fixing random_state
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would only keep the experiment 2 where you fix the random_state.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to keep this part, because the reader will no longer understand why in the next plot, the different methods no longer overlap when they did before.

@@ -412,6 +412,29 @@ def check_gamma(
)


def get_effective_calibration_samples(scores: NDArray, sym: bool):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the sym not by default True?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sym is by default True in ConformityScore. In any case, we need to compute the effective calibration sample size to compute the quantile.

@@ -369,6 +384,11 @@ def get_bounds(

By default ``False``.

allow_infinite_bounds: bool
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we call it allow_infinite_bounds and the unbounded?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In MapieRegressor, it is helpful to keep allow_infinite_bounds to understand what it is. In get_bounds, it is not helpful to understand the code because too long.



##############################################################################
# Experiment 4: Extensive experimentation on different delta and n_calib
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would only show the reference, prefit.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is important to show the asymmetric case to avoid any misunderstanding/publication of an issue that is not an issue.



##############################################################################
# Section 2: Comparison with different MAPIE CP methods
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure that this example adds a lot of insights. I would make this example shorter by getting rid of the prefit_asym part.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice, this part could be improved by adding more MAPIE methods (like split, CV+, CQR etc). In any case, we let the user test it themselves.

@thibaultcordier thibaultcordier merged commit 9829a37 into master Jun 14, 2024
8 checks passed
@Valentin-Laurent Valentin-Laurent deleted the 462-coverage-validity-2 branch November 12, 2024 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Type: bug Source: developers Proposed by developers.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Coverage validity not verified in MapieRegressor when producing infinite intervals
2 participants