-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix behavior for default objective and metric #4660
Conversation
7d405c6
to
6f32a78
Compare
6f32a78
to
4ee70d1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm supportive of the core change here, thanks for the quick response to #4655 !
I just left a few small questions.
Also...would you consider adding (fixes #4655)
in the PR title? I think it's useful for that information to show up in the entry on release notes, so it's easy for someone reading the release notes to hover over the issue and see a preview of the problem a PR was trying to solve. I think this is useful because PR titles / release note entries are often written to answer the question "what changed?", but as a user reviewing the changelog you also care about "why did this change?".
For example, from https://github.com/microsoft/LightGBM/releases/tag/v3.3.0
dumped_model <- jsonlite::fromJSON(bst$dump_model()) | ||
expect_identical(bst_from_ds$eval_train(), list()) | ||
expect_equal(bst_from_ds$current_iter(), nrounds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand why the eval_train()
part of this test changed, but how is this current_iter()
change related to the rest of the PR?
I think the line
expect_equal(bst_from_ds$current_iter(), nrounds)
is testing that a Booster created from a Dataset with a Predictor
is already considered "fitted", and that the current_iter()
attribute reflects the specific training that was done to produce that Predictor
.
I think just testing that bst_from_ds
is a Booster
isn't enough to test that the merging here worked:
LightGBM/R-package/R/lgb.Booster.R
Lines 55 to 70 in f3987f3
private$init_predictor <- train_set$.__enclos_env__$private$predictor | |
# Check if predictor is existing | |
if (!is.null(private$init_predictor)) { | |
# Merge booster | |
.Call( | |
LGBM_BoosterMerge_R | |
, handle | |
, private$init_predictor$.__enclos_env__$private$handle | |
) | |
} | |
# Check current iteration | |
private$is_predicted_cur_iter <- c(private$is_predicted_cur_iter, FALSE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is line 421 duplicate? (compared with line 419).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That line expect_equal(bst_from_ds$current_iter(), nrounds)
is still in the test:
expect_equal(bst_from_ds$current_iter(), nrounds) |
As @shiyu1994 correctly noticed that line was duplicated and I removed one duplicate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ohhhh I see! My mistake, wasn't obvious to me. Thank you.
@@ -85,18 +85,15 @@ void GetObjectiveType(const std::unordered_map<std::string, std::string>& params | |||
} | |||
} | |||
|
|||
void GetMetricType(const std::unordered_map<std::string, std::string>& params, std::vector<std::string>* metric) { | |||
void GetMetricType(const std::unordered_map<std::string, std::string>& params, const std::string& objective, std::vector<std::string>* metric) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as a general, rule, I like to avoid inserting new arguments in the middle of a function signature. In my experience, I've found that it leads to fewer mistakes when updating the places that call that function.
I think this is especially important in C/C++ code, where all arguments are provided positionally.
Would you consider adding objective
at the end of this signature? Or is there a specific reason that objective
needs to come before metric
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that was a semantic reason. In all such kind functions (GetXXXType()
) in the config "returned" value is the last argument. It is a common practice in C++ to specify out-parameters at the very end of parameters. Also, semantically it can be read as "based on params and objective set metric". Appending objective
to the end of parameters will be more confusing than benefiting, I believe. Given that this API is internal one, I think there is no problem with inserting objective
in the middle of parameters and keep semantic order of parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok makes sense to me, thanks very much for the thorough explanation
Honestly, I'm against of inserting From the functionality perspective, GitHub auto-closing feature should work fine when triggering word is used in PR description or commit message. As for PR titles themselves, I believe they should be as brief and descriptive as possible. Any additional info like mentions of issue PR is resolving can disperse reader's attention. One can click to PR and read the whole description any time it is really needed. Quite often one PR can resolve multiple issues (for instance, dmlc/xgboost#7297). Write all issues in a title will be ugly, given that GitHub forces to prepend each issue number with And who knows, maybe they will do the same in titles not only in comments... |
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
Fixed #4655.
Configure metric after objective and use configured objective as a default metric if user doesn't provide any.
By default,
objective
isregression
and can be omitted in paramsLightGBM/include/LightGBM/config.h
Line 139 in f3987f3