[WIP] Add chosen metric argument to clarify early stopping behaviour #6424

sami-ka · 2024-04-20T21:25:16Z

Related to issue #6423 (and also #6223).

When mixing callable functions and string aliases in evaluation metrics and/or objective function, the first metric flag only selects the first built in metric if any, and after looks to callable functions as metrics.
It comes from the inner_eval function inner_eval function that starts with any built-in metrics if there is any.

Selecting a custom metric for early stopping made me do adjustments to make it work.

The builtin metric can come from the LGBMRegressor instantiation, the built in objective function, or a str in the eval_metric list used in the fit method.

In order to have something that is easier to control, people should have either :

objective and metrics coming from string
objective and metrics coming from callable

The Booster API will handle some of the cases I mentionned above but not all, and the bug will also arise.

Since the built in metrics can come from anywhere, and it's hard to know which one is the first metric when calling the inner_eval function, I suggest to have an additional argument indicating the name of the chosen metric the early stopping callback should use.

Feel free to make any comment on this PR. I did not manage to check the C part of the code, but I want some feedback on the idea first before spending some time on making my local setup work for this !

…ping callback through parameters

sami-ka · 2024-04-20T21:58:28Z

@sami-ka please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.
@microsoft-github-policy-service agree [company="{your company}"]
Options:

(default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
(when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"
Contributor License Agreement

@microsoft-github-policy-service agree

borchero

I think that this is a good idea (possibly, modulo the naming for the new parameter). I personally prefer the explicitness of specifying a metric name over the implicitness of the "first metric" anyways.

IMO, one could even make this a breaking change/deprecate the usage of first_metric_only altogether.

Wdyt @jameslamb?

-- I did not review the code in detail, only left a small comment I noticed.

borchero · 2024-04-22T21:31:19Z

python-package/lightgbm/callback.py

@@ -363,11 +377,14 @@ def _init(self, env: CallbackEnv) -> None:
                    raise ValueError("Must provide a single value for min_delta or as many as metrics.")
                if self.first_metric_only and self.verbose:
                    _log_info(f"Using only {self.min_delta[0]} as early stopping min_delta.")
+                if (self.chosen_metric is not None) and self.verbose:
+                    index_chosen_metric = list_metrics.index(self.chosen_metric)


I don't think this works, list_metrics is not actually a list 👀

Will look into it.
Following @jameslamb comment, this part of the code will be impacted so I will need to rewrite it anyway

jameslamb · 2024-04-23T01:06:39Z

Wdyt @jameslamb?

I agree that first_metric_only is too restrictive, and that "first" is confusing and error-prone.

I'm open to modifying this interface, and agree with @borchero that if we're going to make this change it should eventually replace first_metric_only instead of permanently adding another layer of complexity to this interface.

I'd support the following set of changes:

add a metric_name: Optional[str] parameter to the end of the function signature for lightgbm.callback.early_stopping()
raise a deprecation warning if first_metric_only=True is passed, warning that it'll be removed in a future release of lightgbm
add an early_stopping_metric_name to LightGBM-wide parameters (see [python-package] Allow to pass early stopping min delta in params #6274 for reference implementation)

That'd match xgboost's behavior: https://github.com/dmlc/xgboost/blob/59d7b8dc72df7ed942885676964ea0a681d09590/python-package/xgboost/callback.py#L352.

I like having it be Optional[str] and not accepting e.g. a list or set, because it keeps the implementation simple. And if you want to perform early stopping on multiple metrics, you could pass multiple early_stopping callbacks, like this:

bst = lgb.train(
    ...,
    callbacks=[
        lgb.early_stopping(5, metric_name="rmse"),
        lgb.early_stopping(5, metric_name="mape")
    ]
)

sami-ka · 2024-04-23T21:39:20Z

Wdyt @jameslamb?

I agree that first_metric_only is too restrictive, and that "first" is confusing and error-prone.

I'm open to modifying this interface, and agree with @borchero that if we're going to make this change it should eventually replace first_metric_only instead of permanently adding another layer of complexity to this interface.

I'd support the following set of changes:

add a metric_name: Optional[str] parameter to the end of the function signature for lightgbm.callback.early_stopping()

raise a deprecation warning if first_metric_only=True is passed, warning that it'll be removed in a future release of lightgbm

add an early_stopping_metric_name to LightGBM-wide parameters (see [python-package] Allow to pass early stopping min delta in params #6274 for reference implementation)

That'd match xgboost's behavior: https://github.com/dmlc/xgboost/blob/59d7b8dc72df7ed942885676964ea0a681d09590/python-package/xgboost/callback.py#L352.

I like having it be Optional[str] and not accepting e.g. a list or set, because it keeps the implementation simple. And if you want to perform early stopping on multiple metrics, you could pass multiple early_stopping callbacks, like this:
bst = lgb.train(
    ...,
    callbacks=[
        lgb.early_stopping(5, metric_name="rmse"),
        lgb.early_stopping(5, metric_name="mape")
    ]
)

Clear to me. I especially like the design of having a list of callbacks with their own stopping rounds and deltas.

I will look at the PR you linked to match the implementation to modify the LightGBM-wide parameters.

jameslamb

I think this is going to require significantly more design work, sorry. Please see my comment.

Until you get more direction from us, we'd love your help with other parts of LightGBM if you are interested! For example, #6361.

jameslamb · 2024-04-26T21:54:02Z

include/LightGBM/config.h

@@ -397,6 +397,9 @@ struct Config {
  // desc = LightGBM allows you to provide multiple evaluation metrics. Set this to ``true``, if you want to use only the first metric for early stopping
  bool first_metric_only = false;

+  // desc = LightGBM allows you to provide multiple evaluation metrics. Set this to a specific metric name, if you want to use only this metric for early stopping
+  std::string chosen_metric_early_stopping;


Thinking about this some more... I don't think we should add this as a parameter in LightGBM's C API.

Right now, LightGBM (across all its interfaces), has this mix of behaviors:

you can provide multiple metrics via the metric parameter

if you set early_stopping_rounds > 0 and provide any validation sets, LightGBM will try to perform early stopping based on all metrics and all validation sets

... unless you set first_metric_only = true, in which case LightGBM will perform early stopping on only 1 metric (but still for all validation sets)

related: #6360

Two types of behavior rely on that metric parameter:

which metrics should be computed and logged/recorded during training?

which metrics should be used for early stopping?

We still want to provide the ability to record multiple metrics during training.

In addition, the CLI and C API don't have a concept of "callbacks", so a parameter metric_name that only accepts a single metric wouldn't be sufficient for them if they want to perform early stopping on the basis of multiple metrics.

We also have to think carefully about what breaking changes (if any) to make to LightGBM's existing behavior of automatically performing early stopping on all metrics if you enable early stopping at all.

I'm not sure what direction to set you on... need to think about this for a few days.

This is has been a quite complicated part of LightGBM's interface, I'd like to simplify it and give people finer control, but also do that in a way that minimizes the number of breaking changes made.

For example, maybe we could turn off the "automatically add the early stopping callback based on params" behavior if any lgb.early_stopping callbacks are passed through callbacks.

Thanks for your explanations, I now also realize all the implications of this change, adjusting the python part with others!

I also understand that being able to specify the metric_name in the parameters dict would be preferable, as other early stopping parameters can be specified here as well. However feel free to tell me to undo the changes outside of the Callback class, if it helps to split this in different PRs.
My tests with the callback API changes alone have the expected behaviour.

I will try to take a bit more look at the C API and give you my 2cents during the week-end about how the change could be implemented. I don't expect to come up with the solution, but I guess it could help you to take a decision.

Sure, thanks! I just want to be sure we're respectful of your time and limit how often we ask you to do something and then to undo it.

This is a part of LightGBM (and the Python package) that has to be handled with extreme care. Early stopping is a very important part of training GBDTs, and lots of existing code out in the world relies on the existing behavior.

If you want some more background on that,you might find this discussion useful: #5808

sami-ka added 3 commits April 20, 2024 22:52

feat: add chosen_metric attribute in early stopping callback class

4ce23bd

feat: add chosen_metric_early_stopping parameter to create early stop…

dbdc17c

…ping callback through parameters

feat: add chosen_metric_early_stopping to the list of parameters

53698f7

sami-ka requested review from guolinke, jameslamb, shiyu1994, jmoralez and borchero as code owners April 20, 2024 21:25

sami-ka mentioned this pull request Apr 20, 2024

[python-package] First metric is not used for early stopping #6423

Open

borchero reviewed Apr 22, 2024

View reviewed changes

jameslamb added in progress feature labels Apr 23, 2024

refactor: replace argument name and add chekcs +warning

44fcae2

sami-ka changed the title ~~Add chosen metric argument to clarify early stopping behaviour~~ [WIP] Add chosen metric argument to clarify early stopping behaviour Apr 26, 2024

jameslamb requested changes Apr 26, 2024

View reviewed changes

jameslamb mentioned this pull request Jun 10, 2024

Clarification on Early Stopping Behavior with Multiple eval_set in LightGBM #6475

Closed

jameslamb mentioned this pull request Dec 12, 2024

[python-package] factor out positional indexing for eval result tuples #6748

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Add chosen metric argument to clarify early stopping behaviour #6424

[WIP] Add chosen metric argument to clarify early stopping behaviour #6424

sami-ka commented Apr 20, 2024 •

edited

Loading

sami-ka commented Apr 20, 2024

borchero left a comment

borchero Apr 22, 2024

sami-ka Apr 23, 2024

jameslamb commented Apr 23, 2024 •

edited

Loading

sami-ka commented Apr 23, 2024

jameslamb left a comment

jameslamb Apr 26, 2024

sami-ka Apr 26, 2024

jameslamb Apr 26, 2024

[WIP] Add chosen metric argument to clarify early stopping behaviour #6424

Are you sure you want to change the base?

[WIP] Add chosen metric argument to clarify early stopping behaviour #6424

Conversation

sami-ka commented Apr 20, 2024 • edited Loading

sami-ka commented Apr 20, 2024

borchero left a comment

Choose a reason for hiding this comment

borchero Apr 22, 2024

Choose a reason for hiding this comment

sami-ka Apr 23, 2024

Choose a reason for hiding this comment

jameslamb commented Apr 23, 2024 • edited Loading

sami-ka commented Apr 23, 2024

jameslamb left a comment

Choose a reason for hiding this comment

jameslamb Apr 26, 2024

Choose a reason for hiding this comment

sami-ka Apr 26, 2024

Choose a reason for hiding this comment

jameslamb Apr 26, 2024

Choose a reason for hiding this comment

sami-ka commented Apr 20, 2024 •

edited

Loading

jameslamb commented Apr 23, 2024 •

edited

Loading