First draft of multi-objective optimization #1455
Conversation
Co-authored-by: Katharina Eggensperger <[email protected]>
Don't need to change, just if you see it and get a chance :) I think it looks a little cleaner with as few square brackets [] from Optional[] and Union[] as possible, but I'd merge without it. It also translates better with types in sphinx docs if you use the same notation in the docstrings.
autosklearn/automl.py
@@ -210,7 +210,7 @@ def __init__(
         get_smac_object_callback: Optional[Callable] = None,
         smac_scenario_args: Optional[Mapping] = None,
         logging_config: Optional[Mapping] = None,
-        metric: Optional[Scorer] = None,
+        metric: Optional[Scorer | Sequence[Scorer]] = None,
Not necessary, just something to know: Optional[X] == Union[X, None] == X | None, i.e. you could write Scorer | Sequence[Scorer] | None = None.
That looks nice, will do.
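For illustration, a tiny sketch of the equivalent annotations (the variable names are placeholders only):

```python
from typing import Optional, Sequence, Union

from autosklearn.metrics import Scorer

# All three annotations mean the same thing; the bare ``|`` form needs
# Python >= 3.10 (or ``from __future__ import annotations``), which is why
# the last one is quoted here.
metric_a: Optional[Union[Scorer, Sequence[Scorer]]] = None
metric_b: Union[Scorer, Sequence[Scorer], None] = None
metric_c: "Scorer | Sequence[Scorer] | None" = None
```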
autosklearn/evaluation/__init__.py
@@ -86,10 +99,10 @@ def fit_predict_try_except_decorator(
 def get_cost_of_crash(
-    metric: Union[Scorer, List[Scorer], Tuple[Scorer]]
+    metric: Union[Scorer | Sequence[Scorer]],
Likewise here, Union[X | Y] == X | Y; the | is essentially just the infix operator for Union, in the same way you have + instead of add(x, y), i.e. metric: Scorer | Sequence[Scorer].
Thanks for catching that.
Sorry, more files I didn't see. I'm starting to wonder if it makes sense to have something like a MetricGroup class? A lot of the code changes seem to just be handling the case of 1, many, or None metrics.
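Purely as a strawman for that idea, a hypothetical MetricGroup could look roughly like this (not part of this PR; the default metric below is an assumption):

```python
from typing import List, Optional, Sequence, Union

from autosklearn.metrics import Scorer, accuracy


class MetricGroup:
    """Hypothetical helper that normalizes the 'None, one, or many metrics' case."""

    def __init__(self, metric: Optional[Union[Scorer, Sequence[Scorer]]] = None):
        if metric is None:
            # Assumed default; the real default would depend on the task type.
            self.metrics: List[Scorer] = [accuracy]
        elif isinstance(metric, Scorer):
            self.metrics = [metric]
        else:
            self.metrics = list(metric)

    @property
    def is_multi_objective(self) -> bool:
        return len(self.metrics) > 1

    def __iter__(self):
        return iter(self.metrics)
```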
val_score = metric._optimum - (metric._sign * run_value.cost)
cost = run_value.cost
if not isinstance(self._metric, Scorer):
    cost = cost[0]
I assume this is a point of API conflict? It would be good to know about all the metrics for a model but at the end of the day, we currently only support one and so we choose the first?
Yes, it would be good to know about all the metrics. I will look into returning multiple metrics here (should be possible).
Please see my comment wrt this in the PR comment at the top.
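As a side note, a sketch (not the PR's exact code) of how a SMAC cost can be mapped back to metric values for one or several metrics, using the same metric._optimum - metric._sign * cost inversion as in the diff above:

```python
from typing import List, Sequence, Union

from autosklearn.metrics import Scorer


def costs_to_scores(
    metrics: Sequence[Scorer], cost: Union[float, Sequence[float]]
) -> List[float]:
    """Invert the cost transformation for each optimization metric.

    ``cost`` is a float for a single objective and a sequence of floats
    for a multi-objective run.
    """
    if isinstance(cost, (int, float)):
        cost = [cost]
    return [m._optimum - (m._sign * c) for m, c in zip(metrics, cost)]
```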
Codecov Report
@@              Coverage Diff              @@
##           development    #1455    +/-   ##
=============================================
  Coverage        84.31%   84.32%
=============================================
  Files              147      147
  Lines            11284    11397   +113
  Branches          1934     1986    +52
=============================================
+ Hits              9514     9610    +96
- Misses            1256     1263     +7
- Partials           514      524    +10
Alright, I think the functionality of this PR is complete for now. I will add unit tests after another round of feedback.

Replies to the list of items in the PR first:

* Sounds good, we need to know what to do with metrics in the ensemble first.
* An extended version of the first solution,
* Same answer; without a method to present choices we have the requirement to select for the user. I would go with the above solution.
* Seems good, we could even extract the metrics name through
* Sure
* In theory there's nothing that prevents us from making a selection at

      [
          ((cost_0, cost_1, ...), ens0),
          ((cost_0, cost_1, ...), ens1),
          ((cost_0, cost_1, ...), ens2),
      ]

I'll do a review now.
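To make that data structure concrete, a small sketch of selecting the non-dominated entries from such a list (assuming all objectives are costs to be minimized):

```python
from typing import List, Sequence, Tuple


def pareto_front(
    candidates: List[Tuple[Sequence[float], object]],
) -> List[Tuple[Sequence[float], object]]:
    """Keep only the (costs, ensemble) pairs not dominated by another pair."""
    front = []
    for costs, ens in candidates:
        # A candidate is dominated if some other candidate is no worse in
        # every objective and strictly better in at least one.
        dominated = any(
            all(o <= c for o, c in zip(other_costs, costs))
            and any(o < c for o, c in zip(other_costs, costs))
            for other_costs, _ in candidates
        )
        if not dominated:
            front.append((costs, ens))
    return front
```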
solution=y_true,
prediction=y_pred,
task_type=BINARY_CLASSIFICATION,
metrics=[autosklearn.metrics.accuracy, autosklearn.metrics.accuracy],
[Non-Blocking, Q]
Using the same metric twice seems like it should raise an error to me. Given that no error is raised, the output is good though. Should we keep it as is?
Do you mean we should allow the same metric to be present in both metrics and scoring_functions, but not the same metric twice in one of them?
I guess so, I didn't really think about the same one twice, once in metrics and once in scoring_functions.

What you stated seems reasonable to me, but not a cause to block the PR if it's difficult. I can't imagine a scenario where you would want the same metric, i.e. acc, twice in metrics, but I could imagine a scenario where you have acc in metrics and acc in scoring_functions.

The scenario where I see this being used is purely to make getting scores out of autosklearn happen in one place, i.e. you specify acc and metric_z for the optimization with metrics, and you specify acc, balanced_acc and f1_score for scoring_functions when you later want to evaluate the autosklearn run.
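A minimal sketch of that scenario, assuming the list-valued metric argument from this PR (the concrete metrics are only placeholders for acc, metric_z, etc.):

```python
import autosklearn.classification
import autosklearn.metrics

automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=120,
    # Metrics that are actually optimized (multi-objective).
    metric=[autosklearn.metrics.accuracy, autosklearn.metrics.roc_auc],
    # Metrics that are only computed and reported, not optimized.
    scoring_functions=[
        autosklearn.metrics.accuracy,
        autosklearn.metrics.balanced_accuracy,
        autosklearn.metrics.f1,
    ],
)
```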
* First draft of multi-objective optimization

  Co-authored-by: Katharina Eggensperger <[email protected]>
* Feedback from Eddie
* Make metric internally always a list
* Fix most examples
* Take further feedback into account
* Fix unit tests
* Fix one more example
* Add multi-objective example
* Simplify internal interface
* Act on further feedback
* Fix bug
* Update cv_results_ for multi-objective sklearn compliance
* Update leaderboard for multi-objective optimization
* Include Feedback from Katharina
* Take offline feedback into account
* Take offline feedback into account
* Eddie's feedback
* Fix metadata generation unit test
* Test for metrics with the same name
* Fix?
* Test CV results
* Test leaderboard for multi-objective optimization
* Last batch of unit tests added
* Include Eddie's feedback

Co-authored-by: Katharina Eggensperger <[email protected]>
This is the very first implementation of multi-objective optimization in Auto-sklearn, solving issue #1317. It allows users to pass in a list of metrics instead of a single metric. Auto-sklearn then solves a multi-objective optimization problem by using SMAC's new multi-objective capabilities.
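A rough end-to-end sketch of the new interface (the dataset, time budget, and the two metrics are chosen arbitrarily here; only the list-valued metric argument is what this PR adds):

```python
import sklearn.datasets
import sklearn.model_selection

import autosklearn.classification
import autosklearn.metrics

X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
    X, y, random_state=1
)

automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=120,
    # Passing a list of metrics turns the SMAC run into a
    # multi-objective optimization problem.
    metric=[autosklearn.metrics.precision, autosklearn.metrics.recall],
)
automl.fit(X_train, y_train)
print(automl.leaderboard())
```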
This PR has several known limitations, some of which can/will be fixed in a future PR, and some for which it is unclear how to approach them in the first place:
* This will be addressed in a future PR.
* _load_best_individual_model() returns the best model according to the first metric: It is unclear how to make this multi-objective as this is a clear fallback solution. My only two suggestions would be 1) to always return the first metric and document that the first metric is the tie-breaking metric (currently implemented), or 2) to use the Copula mixing approach demonstrated in Figure 6 of http://proceedings.mlr.press/v119/salinas20a/salinas20a.pdf to return a scalarized solution. However, this can lead to unpredictable selections.
* score(): This function suffers from the same issue as _load_best_individual_model() as there is no clear definition of which model is the best.
* _get_runhistory_models_performance(): This function suffers from the same issue as _load_best_individual_model() as there is no clear definition of which model is the best.
* sprint_statistics(): This function suffers from the same issue as _load_best_individual_model() as there is no clear definition of which model is the best.
* refit(): This function suffers from the same issue as _load_best_individual_model() as there is no clear definition of which model is the best.
* cv_results_: Updated to follow the output of scikit-learn, see Attributes in here.
* It is so far completely unclear how to achieve this. Picking a model from the Pareto front would make Auto-sklearn non-interactive. Therefore, it might be a good idea to add a function to return all ensembles on the Pareto front as "raw" ensemble objects that can be further used by the user, or to load different models one after the other as the main model.
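For that last point, a purely hypothetical sketch of what returning the Pareto-front ensembles could look like for a user (the method name does not exist in this PR; automl and X_test are reused from the sketch further above):

```python
# Hypothetical API; ``get_pareto_front_ensembles`` is only a possible name.
pareto_entries = automl.get_pareto_front_ensembles()

for costs, ensemble in pareto_entries:
    # Each entry would pair the per-metric costs with a usable "raw" ensemble.
    print(costs, ensemble.predict(X_test))
```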
TODOs: