Update ens builder #1434

eddiebergman · 2022-03-27T23:36:32Z

Cleans up ensemble builder to make it easier to change and test.

TODO

codecov · 2022-03-28T00:28:44Z

Codecov Report

Merging #1434 (92f59c2) into development (35d4d22) will decrease coverage by 0.27%.
The diff coverage is 78.33%.

@@               Coverage Diff               @@
##           development    #1434      +/-   ##
===============================================
- Coverage        84.32%   84.05%   -0.28%     
===============================================
  Files              147      151       +4     
  Lines            11397    11448      +51     
  Branches          1986     1988       +2     
===============================================
+ Hits              9611     9623      +12     
- Misses            1261     1298      +37     
- Partials           525      527       +2

mfeurer

Partial review.

autosklearn/util/functional.py

autosklearn/ensemble_building/run.py

autosklearn/ensemble_building/manager.py

autosklearn/ensemble_building/builder.py

mfeurer

Some more feedback. I'll check the tests later.

autosklearn/ensemble_building/builder.py

autosklearn/ensemble_building/manager.py

autosklearn/ensemble_building/builder.py

mfeurer

Final part of the review.

test/test_ensemble_builder/test_ensemble_builder.py

test/test_estimators/test_estimators.py

mfeurer · 2022-05-11T15:28:05Z

test/test_ensemble_builder/test_manager.py

+        logger_port=automl._logger_port,
+        random_state=DEFAULT_SEED,
+    )
+    return manager


Just out of curiosity, where is this manager used later? I can't see where the test for it is located.

eddiebergman · 2022-05-11

Nowhere yet; I should have some tests that use it. Sorry, the TODOs slipped past me.

mfeurer · 2022-05-12

Just checking: did you add such a test?

eddiebergman · 2022-05-12

I'll do so today. There's not too much functionality to check, as most of the heavy lifting is done inside EnsembleBuilder. The real runs didn't fail, so I assume it's okay, meaning I could start the benchmarking and write the tests afterwards; almost nothing has changed. Hopefully no bugs appear in the testing.

autosklearn/ensemble_building/builder.py

* Move ensemble_builder test data to named folder
* Update backend to take a template to copy from
* Update tests to use new cases system
* Update tests to be documented and cleaned up
* Switch to using cached automl backends
* Readd missing file which failed test for `case_3_models`
* Separate out tests that rely on old toy data and those that don't
* Setup test framework for ensemble builder on real situations
* Formatting
* Remove `unit_test` arg
* Remove SAVE2DISC
* Split builder and manager into separate files
* Tidy up init of EnsembleBuilder
* Moved to cached properties
* Change List to list
* Move to solely using cached properties
* Add disk util file with `sizeof`
* Update tests to use cached mechanism
* Switch `sizeof` for disk consumption
* Remove disk consumption
* Remove unneeded function
* Add type hints and documentation
* Simplify _read_np_fn
* Update get_valid_test_preds to use Pathlib
* Add intersection to functional
* Make functional take `*args`
* Further simplifications
* Add a dataclass to represent run information for builder
* Rename to Run
* Change to Run objects
* Formatting
* Reduce side effects of `compute_loss_per_model`

  To make testing and later changes easier, the targets are now passed to the method. This also reduces its complexity by removing the checking from the method, as we can assume the parameters coming in are correct.

* Change Tuple to tuple
* Forcibly add data files for tests
* Fix: Can now load pickled numpy arrays w/ test
* Add test for checking ensemble builder output
* Fix bug with using list instead of set
* Making debugging message a little clearer
* Fix typing and case name
* Rename test file to reflect what it tests
* Make pynisher context optional
* Fix loaded models test
* Updates to Run dataclass
* Add method to `Run` to allow recording of last modified
* Change Run mtimes to dictionary
* Change `compute_loss_per_model` to use new Run dataclass
* Factor out run loss into main loop
* Simplify get_nbest and compute_losses
* Major rewrite of ensemble builder main loop
* Change to simpler hashing
* Start value split
* Add `value_split`
* Reworked Builder
* Add some docstring
* Formatting
* Fix type signature
* Fix typing for `loss`
* Removed Literal
* Mypy fixes for ensemble builder
* Mypy fixes
* Tests for `Runs`
* Move `make_run` to fixtures
* Fix run deletion
* Test candidates
* Made delete its own function
* Further simplifications
* Fixup test with simplification
* Test: `max_models` for `requires_deletion`
* Test: `memory_limit` for `requires_deletion`
* Test: Loss of runs
* Test: Delete runs
* Test: `fit_ensemble` of ensemble builder
* Add test for run time parameter
* Remove parameter `return_predictions`
* Add note about pickled arrays should not be supported
* Make cached automl instances copy backend
* Add valid static method to run
* Remove old test data
* Add filter for bad run dirs
* Made `main` args optional
* Fix check for updated runs
* Make `main` raise errors
* Fix default value for ensemble builder `main`
* Test valid ensemble with real runs
* Rename parameter for manager
* Add defaults and reorder parameters for EnsembleBuilderManager
* Fixup parameters in `fit_and_return_ensemble`
* Typing fixes
* Make `fit_and_return_ensemble` a staticmethod
* Add: `make_ensemble_builder_manager`
* Add: Test files for manager
* Add atomic rmtree
* Add: atomic rmtree now accepts where mv should go
* Make builder use atomic rmtree
* Fix import bugs, remove valid preds in builder
* Remove `np.inf` as valid arg for `read_at_most`
* Possible reproducible num_run, no predictions error
* Make automl caching robust to `pytest-xdist`
* Test fixes
* Extend interval for test on run caching
* Use pickle for resetting cache
* Fix test for caching mechanism to not rely on `stat`
* Move run deletion to the end of the builder `main`
* Remove `getattr` version of tae.client
* Remove `normalize`
* Extend not for `Run`
* Fix `__init__` of `Run`
* Parameter and comment fixes from feedback
* Change to `min(...)` instead of `sorted(...)[0]`
* Make default time `np.inf`
* Add test for safe deletion in builder
* Update docstring of `loss` for a run
* Remove stray print
* Minor feedback fixes
* Fix `_metric` to `_metrics`
* Fix `make_ensemble_builder`
* One more fix for multiple metrics
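The "atomic rmtree" entries in the commit message above suggest that a run directory is first moved away and only then deleted. A minimal sketch of that general idea follows; it is not the PR's implementation. The name `rmtree_atomic`, the `move_to` parameter, and the `.deleting_` prefix are all hypothetical, and the sketch assumes the source and destination sit on the same filesystem.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Optional, Union


def rmtree_atomic(
    path: Union[str, Path],
    move_to: Optional[Union[str, Path]] = None,
) -> None:
    """Hypothetical helper: rename a directory out of the way, then delete it.

    The rename makes the directory vanish from its original location in one
    step, so a concurrent reader never sees it half deleted.
    """
    path = Path(path)
    if move_to is None:
        # Stage inside a fresh sibling directory so the rename stays on the
        # same filesystem, which os.replace requires.
        cleanup = Path(tempfile.mkdtemp(prefix=".deleting_", dir=path.parent))
        target = cleanup / path.name
    else:
        # Caller-chosen destination (assumed not to exist yet).
        target = Path(move_to)
        cleanup = target

    os.replace(path, target)  # a single rename; atomic on POSIX filesystems
    shutil.rmtree(cleanup)    # the slow recursive delete happens out of sight
```

Calling `rmtree_atomic(run_dir)` removes `run_dir` and leaves no staging directory behind. If the two locations are on different filesystems, `os.replace` raises `OSError` and nothing is deleted.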

eddiebergman added 27 commits March 27, 2022 00:20

Move ensemble_builder test data to named folder

092985d

Update backend to take a template to copy from

83db9cf

Update tests to use new cases system

dc4585e

Update tests to be documented and cleaned up

f28c3e4

Switch to using cached automl backends

9613312

Readd missing file which failed test for case_3_models

a20150c

Separate out tests that rely on old toy data and those that don't

84d01e7

Setup test framework for ensemble builder on real situations

fcf6ad0

Formatting

951bb2e

Remove unit_test arg

5abf258

Remove SAVE2DISC

3e8ed92

Split builder and manager into separate files

5dd9832

Tidy up init of EnsembleBuilder

c0ebad5

Moved to cached properties

07d2c55

Change List to list

8ac8ffe

Move to solely using cached properties

6472714

Add disk util file with sizeof

36d7dd6

Update tests to use cached mechanism

5c9842f

Switch sizeof for disk consumption

1de376c

Remove disk consumption

23de0fb

Remove unneeded function

e34100d

Add type hints and documentation

2d90370

Simplify _read_np_fn

1fe4c61

Update get_valid_test_preds to use Pathlib

facbd7f

Add intersection to functional

9e6169b

Make functional take *args

ebb2c78

Further simplifications

d0f0980
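The commits "Add intersection to functional" and "Make functional take *args" point to a small set helper that accepts any number of iterables. The PR page does not show its code, so the following is only a sketch of what such a helper could look like. The signature and the empty-call behaviour are assumptions.

```python
from typing import Iterable, Set, TypeVar

T = TypeVar("T")


def intersection(*items: Iterable[T]) -> Set[T]:
    """Hypothetical sketch: the elements common to every iterable passed in."""
    if not items:
        # Assumption: calling with no arguments yields an empty set.
        return set()
    first, *rest = items
    # set.intersection accepts arbitrary iterables, not only sets.
    return set(first).intersection(*rest)
```

Taking `*args` lets callers write `intersection(a, b, c)` directly instead of wrapping the inputs in a list first.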

eddiebergman added 2 commits March 29, 2022 19:31

Add a dataclass to represent run information for builder

9903b74

Rename to Run

3ff5873
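The two commits above introduce a dataclass called `Run`, and other commit titles in this PR mention recording last-modified times in a dictionary. The page does not show the class itself, so the sketch below only illustrates the pattern. Every field and method name other than `Run` is hypothetical.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict


@dataclass
class Run:
    """Hypothetical sketch of a per-run record; all fields are assumptions."""

    path: Path                  # directory holding this run's files
    loss: float = float("inf")  # treated as unknown until computed
    recorded_mtimes: Dict[str, float] = field(default_factory=dict)

    def _current_mtimes(self) -> Dict[str, float]:
        # Map each file name in the run directory to its modification time.
        return {
            p.name: p.stat().st_mtime
            for p in self.path.iterdir()
            if p.is_file()
        }

    def record_modified_times(self) -> None:
        # Remember the state of the files as seen right now.
        self.recorded_mtimes = self._current_mtimes()

    def was_modified(self) -> bool:
        # True if any file was added, removed, or touched since recording.
        return self._current_mtimes() != self.recorded_mtimes
```

Keeping this state on one object, rather than in several parallel dictionaries keyed by path, is one way to make a builder loop easier to test in isolation.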

eddiebergman added 2 commits May 6, 2022 14:00

Use pickle for resetting cache

cc45300

Fix test for caching mechanism to not rely on stat

65ec881

eddiebergman mentioned this pull request May 7, 2022

Test for pytest -n 4 with pytest xdist with dask_client fixtures #1446

Closed

mfeurer reviewed May 10, 2022

eddiebergman added 7 commits May 11, 2022 12:44

Move run deletion to the end of the builder main

3c218e4

Remove getattr version of tae.client

0fc809e

Remove normalize

b175bb0

Extend not for Run

2bd0c01

Fix __init__ of Run

25defe8

Parameter and comment fixes from feedback

82c68f0

Change to min(...) instead of sorted(...)[0]

ef7848f
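The last commit in this group replaces `sorted(...)[0]` with `min(...)`. The changed code is not shown on this page, so the example below uses made-up data and hypothetical function names to show why the two are interchangeable when only the smallest element is needed.

```python
from typing import Dict


def best_by_sorting(losses: Dict[str, float]) -> str:
    # Sorts every key by loss and keeps only the first: O(n log n).
    return sorted(losses, key=lambda name: losses[name])[0]


def best_by_min(losses: Dict[str, float]) -> str:
    # One pass over the keys: O(n), and it states the intent directly.
    return min(losses, key=lambda name: losses[name])


# Illustrative values only; "run_b" and "run_c" tie on purpose.
losses = {"run_a": 0.31, "run_b": 0.12, "run_c": 0.12}
assert best_by_sorting(losses) == best_by_min(losses) == "run_b"
```

Both forms resolve ties the same way: `sorted` is stable and `min` returns the first minimal item it encounters, so the swap should not change which element is picked.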

mfeurer reviewed May 11, 2022

eddiebergman added 5 commits May 11, 2022 19:03

Make default time np.inf

c990e60

Add test for safe deletion in builder

6476856

Update docstring of loss for a run

936fba5

Remove stray print

8695049

Minor feedback fixes

c2111c2

mfeurer approved these changes May 12, 2022

mfeurer mentioned this pull request May 13, 2022

Refactor the concept of public and private test set #1474

Closed

eddiebergman added 4 commits May 13, 2022 15:21

Merge branch 'development' into update_ens_builder

a515016

Fix _metric to _metrics

b326cc9

Fix make_ensemble_builder

4e4ea64

One more fix for multiple metrics

92f59c2

eddiebergman merged commit 0b5fa19 into development May 13, 2022

mfeurer deleted the update_ens_builder branch May 13, 2022 16:31

github-actions bot pushed a commit that referenced this pull request May 13, 2022

Eddie Bergman: Update ens builder (#1434)

b9b23be

github-actions bot pushed a commit to automl-private/auto-sklearn that referenced this pull request May 16, 2022

Eddie Bergman: Update ens builder (automl#1434)

538c7ef
