
Development #1192

Merged: 6 commits into master on Jul 28, 2021
Conversation

Contributor

@mfeurer mfeurer commented Jul 27, 2021

No description provided.

mfeurer and others added 5 commits July 27, 2021 14:26
Synchronize dev and master again
* Implemented `def leaderboard`

Still requires testing; only works for classification

* Fixed some bugs

* Updated function with new params

* Cleaned info gathering a little

* Identifies if classifier or regressor models

* Implemented sort_by param

* Added ranking column

* Implemented ensemble_only param for leaderboard

* Implemented param top_k

* flake8'd

* Created fixtures for use with test_leaderboard

* Moved fixtures to conftest, added session scope tmp_dir

For the autoML models to be usable for the entire session without
retraining, they require a session-scoped tmp_dir. I tried to figure out
how to make the tmp_dir more dynamic, but the documentation seems to imply
that the scope is set at *function definition*, not at function call.
This means either calling the _tmp_dir and manually cleaning up, or
duplicating the tmp_dir function under a name indicating session scope.
It's a bit ugly, but I couldn't find an alternative.

* Can't make tmp_dir for session scope fixtures

pytest doesn't populate the request.module object when the fixture is
requested from a session scope. For now, module scope will have to do.

* Reverted back, models trained in test

* Moved `leaderboard` AutoML -> AutoSklearnEstimator

* Added fuzzing test for test_leaderboard

* Added tests for leaderboard, added sort_order

* Removed Type Final to support python 3.7

* Removed old solution to is_classification for leaderboard

* I should really force pre-commit to run before commit (flake8 fixes)

* More occurrences of Literal

* Readded Literal but imported from typing_extensions

* Fixed docstring for sphinx

* Added make command to build html without running examples

* Added doc/examples to gitignore

Generating the sphinx examples causes output to be generated in
doc/examples. Not sure if this should be pushed considering docs/build
is not.

* Added leaderboard to basic examples

Found a bug:

/home/skantify/code/auto-sklearn/examples/20_basic/example_multilabel_classification.py failed to execute correctly: Traceback (most recent call last):
  File "/home/skantify/code/auto-sklearn/examples/20_basic/example_multilabel_classification.py", line 61, in <module>
    print(automl.leaderboard())
  File "/home/skantify/code/auto-sklearn/autosklearn/estimators.py", line 738, in leaderboard
    model_runs[model_id]['ensemble_weight'] = self.automl_.ensemble_.weights_[i]
KeyError: 2
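One way to read the KeyError above: the ensemble's weight vector only covers models the ensemble actually selected, so not every identifier can be looked up in the run table directly. A hedged sketch of a defensive guard, with toy data and illustrative names (not the actual fix in the PR):

```python
# model_runs only knows some ids; the ensemble's identifier list may
# include others, which is what triggers the KeyError in the traceback.
model_runs = {5: {"cost": 0.1}, 7: {"cost": 0.3}}
ensemble_ids = [2, 5, 7]    # example identifiers reported by the ensemble
weights = [0.2, 0.5, 0.3]   # parallel list of ensemble weights

for i, model_id in enumerate(ensemble_ids):
    if model_id not in model_runs:
        continue  # skip ids the run table does not track, avoiding the KeyError
    model_runs[model_id]["ensemble_weight"] = weights[i]
```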

* Cleaned up `__str__` of EnsembleSelection

* Fixed discrepancy between config_id and model_id

There is a discrepancy between the identifiers used by SMAC and the identifiers used by an Ensemble class.
SMAC uses `config_id`, which is available for every run of SMAC, while Ensemble uses `model_id == num_run`, which is only available in runinfo.additional_info.
However, this is not always included in additional_info, nor is additional_info guaranteed to exist.
Therefore, the only guaranteed unique identifiers for models are `config_id`s, which can confuse users if they wish to interact with the ensembler.

* Readded desired code for design choice on model indexing

There are two indexes that can be used: SMAC uses `config_id` and asklearn
uses `num_run`. These are not guaranteed to be equal, and `num_run` is not
always present.

As the user should not have to care that there are possibly two indexes for
models, I made the choice to show `config_id`, as this allows displaying
info on failed runs.

An alternative, showing asklearn's `num_run` index, is simply to exclude
any failed runs from the leaderboard.

* Removed Literal again as typing_extensions is external module

* Switched to model_id as primary id

Any runs which do not provide a model_id == num_run are essentially
discarded. This should change in the future, but the fix is outside the
scope of this PR.
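The indexing decision above can be sketched as a small filter over run-info dictionaries, assuming they are shaped roughly like SMAC's (the dictionary keys and function name here are illustrative):

```python
def runs_by_model_id(run_infos):
    """Index runs by num_run, dropping runs that do not report one.

    Matches the behaviour described above: runs without a
    model_id == num_run in additional_info are discarded.
    """
    table = {}
    for run in run_infos:
        additional = run.get("additional_info") or {}
        model_id = additional.get("num_run")
        if model_id is None:
            continue  # no num_run reported: discard this run
        table[model_id] = run
    return table
```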

* pre-commit flake8 fix

* Logger gives warning if sort_by is not in columns asked for

* Moved column types to static method

* Fixed rank to be based on cost

* Fixed so model_id can be requested, even though it always exists

* Fixed so rank can be calculated even if cost not requested

* Readded Literal and included the typing_extensions dependency

Once Python 3.7 is dropped, we can drop typing_extensions

* Changed default sort_order to 'auto'

* Changed leaderboard columns to be static attributes

* Update budget doc

Co-authored-by: Matthias Feurer <[email protected]>

* flake8'd

Co-authored-by: Matthias Feurer <[email protected]>
* Fixes for valid parameters not being tested

* flake8'd
* Changes required to test if will work with smac@development

* Changes required to test if will work with smac@development

* Fixed failing tests with new scipy 1.7 on sparse data

* flake8'd

* Use SMAC from pypi again

* undo changes

Co-authored-by: Matthias Feurer <[email protected]>
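Putting the commit messages together, the feature is a leaderboard with `sort_by`, `sort_order`, `top_k`, and `ensemble_only` parameters and a cost-based rank column. A rough, illustrative sketch of that behaviour in pandas; this is not the actual implementation (which lives in autosklearn/estimators.py), and the column names are assumptions:

```python
import pandas as pd


def leaderboard(runs, sort_by="cost", sort_order="auto", top_k=None,
                ensemble_only=True):
    df = pd.DataFrame(runs)
    if ensemble_only:
        # Keep only models the ensemble actually uses
        df = df[df["ensemble_weight"] > 0].copy()
    # Rank is based on cost, per the commits, regardless of sort_by
    df["rank"] = df["cost"].rank(method="min").astype(int)
    # 'auto' is treated as ascending here (lower cost is better)
    ascending = sort_order != "descending"
    df = df.sort_values(sort_by, ascending=ascending)
    if top_k is not None:
        df = df.head(top_k)
    return df.set_index("model_id")
```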
@codecov
Copy link

codecov bot commented Jul 27, 2021

Codecov Report

Merging #1192 (96b9ad0) into master (904a692) will increase coverage by 2.24%.
The diff coverage is 93.67%.


@@            Coverage Diff             @@
##           master    #1192      +/-   ##
==========================================
+ Coverage   85.91%   88.15%   +2.24%     
==========================================
  Files         138      138              
  Lines       10790    10866      +76     
==========================================
+ Hits         9270     9579     +309     
+ Misses       1520     1287     -233     
Impacted Files Coverage Δ
autosklearn/automl.py 85.00% <ø> (ø)
autosklearn/estimators.py 93.36% <93.33%> (-0.07%) ⬇️
autosklearn/__version__.py 100.00% <100.00%> (ø)
autosklearn/ensembles/ensemble_selection.py 69.17% <100.00%> (+1.81%) ⬆️
...ipeline/components/regression/gradient_boosting.py 93.26% <0.00%> (+0.96%) ⬆️
...osklearn/pipeline/components/classification/sgd.py 96.87% <0.00%> (+1.04%) ⬆️
autosklearn/pipeline/components/regression/sgd.py 96.84% <0.00%> (+1.05%) ⬆️
...earn/pipeline/components/regression/extra_trees.py 93.75% <0.00%> (+1.25%) ⬆️
...rn/pipeline/components/regression/random_forest.py 94.44% <0.00%> (+1.38%) ⬆️
... and 14 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@mfeurer mfeurer merged commit 3d53cd9 into master Jul 28, 2021