[dask] add support for eval sets and custom eval functions (#4101)
* es WiP, need to add eval_sample_weight and eval_group

* add weight, group to dask es. WiP.

* dask es reorg

* Update python-package/lightgbm/dask.py

_train_part model.fit args to lines

Co-authored-by: James Lamb <[email protected]>

* Update tests/python_package_test/test_dask.py

_train_part model.fit args to lines, pt2

Co-authored-by: James Lamb <[email protected]>

* Update python-package/lightgbm/dask.py

_train_part model.fit args to lines pt3

Co-authored-by: James Lamb <[email protected]>

* Update tests/python_package_test/test_dask.py

dask_model.fit args to lines

Co-authored-by: James Lamb <[email protected]>

* Update tests/python_package_test/test_dask.py

Co-authored-by: James Lamb <[email protected]>

* Update python-package/lightgbm/dask.py

use is instead of id()

Co-authored-by: James Lamb <[email protected]>

* Update python-package/lightgbm/dask.py

Co-authored-by: James Lamb <[email protected]>

* Update python-package/lightgbm/dask.py

Co-authored-by: James Lamb <[email protected]>

* Update python-package/lightgbm/dask.py

Co-authored-by: James Lamb <[email protected]>

* Update tests/python_package_test/test_dask.py

Co-authored-by: James Lamb <[email protected]>

* Update tests/python_package_test/test_dask.py

Co-authored-by: James Lamb <[email protected]>

* Update python-package/lightgbm/dask.py

Co-authored-by: James Lamb <[email protected]>

* Update python-package/lightgbm/dask.py

Co-authored-by: James Lamb <[email protected]>

* Update python-package/lightgbm/dask.py

Co-authored-by: James Lamb <[email protected]>

* applying changes to eval_set PR WiP

* dask support for eval_names, eval_metric, eval_stopping_rounds

* add evals_result checks and other eval_set attribute-related test checks. need to merge master - WiP

* fix lint errors in test_dask.py

* drop group_shape from _lgbmmodel_doc_fit.format for non-rankers, add support for eval_at for dask ranker

* add eval_at to test_dask eval_set ranker tests

* add back group_shape to lgbmmodel docs, tighten tests

* drop random eval weights from early stopping, probably causing training to terminate too early

* add eval data templates to sklearn fit docs, add eval data docs to dask

* add n_features to _create_data, eval_set tests stop w/ desirable tree counts

* import alphabetically

* add back get_worker for eval_set error handling

* test_dask argmin typo

* push forgotten eval_names bugfix

* eval_stopping_rounds -> early_stopping_rounds, fix failing non-es test

* change default eval_at to tuple 1-5

* re-drop get_worker

* drop early stopping support from eval_set commits, move eval_set worker check prior to client.submit

* add eval_class_weight and eval_init_score to lightgbm/dask, WiP

* clean up eval_set tests, allow user to specify fewer eval_names and eval_class_weight entries than eval_sets

* remove redundant backslash

* lint fixes

* fix eval_at, eval_metric duplication, let eval_at be Iterable not just Tuple

* use all data_outputs for test_eval_set tests

* undo newlines from first pr

* add custom_eval_metric test, correct issue with eval_at and metric names

* move _constant_metric outside of test

* dataset reference names instead of __strings__

* pad eval_set parts so that each part has the same len(eval_set)

* eval set code clean up

* revert n_evals to be max len eval_set across all parts on worker

* pylint errors in _DatasetNames

* more pylint fixes

* pylinting...

* add back pytest.mark, mistakenly deleted during merge conflict resolution

* address code review comments

* add _pad_eval_names to handle nondeterministic evals_result_ valid set names

* change not evaluated evals_result_ test criteria

* address fit eval docs issues, switch _DatasetNames to Enum

* Update python-package/lightgbm/dask.py

Co-authored-by: Nikita Titov <[email protected]>

* update eval_metrics, eval_at dask fit docstr to match sklearn, make tests reflect that l2 (rmse), logloss in evals_result_ by default

* address eval_set dict keys naming in docstr and training eval_set naming issue

* in test_dask check for obj-default metric names in eval_results, remove check for training key

* lint fixes for _pad_eval_names

* remove unnecessary line break in _pad_eval_names docstr

* use Enum.member syntax not Enum.member.name

* remove str from supported eval_at types

* add whitespace and remove DaskDataframes mention from eval_ param docstrs in _train

* remove "of shape = [n_samples]" from group_shape docs

* add eval_at base_doc in DaskLGBMRanker.fit

* remove excess paren from eval_names docs in _train

* make requested changes to test_dask.py

* remove Optional() wrapper on eval_at

* add _lgbmmodel_doc_custom_eval_note to dask.py fit.__doc__

* fix ordering of .sklearn imports to attempt lint fix

* dask custom eval note to f-string pt1

Co-authored-by: Nikita Titov <[email protected]>

* dask custom eval note to f-string pt 2

Co-authored-by: Nikita Titov <[email protected]>

* dask custom eval note to f-string pt 3

Co-authored-by: Nikita Titov <[email protected]>

Co-authored-by: James Lamb <[email protected]>
Co-authored-by: Nikita Titov <[email protected]>
3 people authored Jun 28, 2021
1 parent bb39bc9 commit b5502d1
Showing 3 changed files with 678 additions and 16 deletions.