-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[python-package] reorganize early stopping callback #6114
Conversation
python-package/lightgbm/callback.py
Outdated
if self.stopping_rounds <= 0: | ||
raise ValueError(f"stopping_rounds should be greater than zero. got: {self.stopping_rounds}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this doesn't need the env we could move it to __init__
, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree! Having it be a loud error right when the callback is created, instead of deferred all the way til the first iteration of training, seems useful. And I'd be surprised to learn that there are other libraries or user code depending on initializing lgb.early_stopping()
with a negative value of this and then somehow updating the value before the first time it's called.
Moved into __init__()
in bd3366a.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing this broke this test:
LightGBM/tests/python_package_test/test_engine.py
Lines 4502 to 4507 in f175ceb
def test_train_raises_informative_error_for_params_of_wrong_type(): | |
X, y = make_synthetic_regression() | |
params = {"early_stopping_round": "too-many"} | |
dtrain = lgb.Dataset(X, label=y) | |
with pytest.raises(lgb.basic.LightGBMError, match="Parameter early_stopping_round should be of type int, got \"too-many\""): | |
lgb.train(params, dtrain) |
Now the error from the early stopping callback gets thrown before this one from the C++ side:
Line 37 in f175ceb
Log::Fatal("Parameter %s should be of type int, got \"%s\"", key.c_str(), candidate); |
So I pushed 7a98d82, which:
- switches that test to use a different parameter, to keep covering that C++-side validation
- adds a test in
test_callback.py
on this specific error fromlgb.early_stopping()
- adds an
isinstance()
check in the condition guarding that error inlgb.early_stopping()
, so you can an informative error instead of something likeTypeError: '<=' not supported between instances of 'str' and 'int'
Given all those changes, @jmoralez could you re-review? I don't want to sneak those in on your previous approval.
Co-authored-by: José Morales <[email protected]>
…htGBM into python/mypy-early-stopping
lgb.early_stopping(stopping_rounds=-1) | ||
|
||
with pytest.raises(ValueError, match="stopping_rounds should be an integer and greater than 0. got: neverrrr"): | ||
lgb.early_stopping(stopping_rounds="neverrrr") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haha thank you, thank you 😂
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
Contributes to #3756.
Contributes to #3867.
Fixes the following errors from mypy:
Those all come from the fact that when
env.model
inCallbackEnv
is aCVBooster
, any attributes not explicitly defined on theCVBooster
class return a method that is called on each of theBooster
objects in.boosters
LightGBM/python-package/lightgbm/engine.py
Lines 357 to 365 in 60a4a13
For example:
mypy
is rightly complaining that given that, it isn't safe to treatenv.model._train_data_name
as if it was a string... since for aCVBooster
, it won't be.This PR proposes some changes to avoid those cases in the early stopping callback.
It also proposes some other changes to make the code in that callback a bit easier to understand (for example, calling
_EarlyStoppingCallback._is_train_set()
with keyword arguments).