Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make backend deletion robust after fit #1155

Merged
merged 10 commits into from
Aug 6, 2021

Conversation

franchuterivera
Copy link
Contributor

Addresses #1149 which is caused by destructor being garbage collected when the modules where already terminated, as explained here.

@codecov
Copy link

codecov bot commented Jun 14, 2021

Codecov Report

Merging #1155 (0aa71c9) into development (d761cb3) will decrease coverage by 0.12%.
The diff coverage is 100.00%.

Impacted file tree graph

@@               Coverage Diff               @@
##           development    #1155      +/-   ##
===============================================
- Coverage        88.12%   88.00%   -0.13%     
===============================================
  Files              139      139              
  Lines            10955    10959       +4     
===============================================
- Hits              9654     9644      -10     
- Misses            1301     1315      +14     
Impacted Files Coverage Δ
autosklearn/automl.py 85.13% <100.00%> (+0.11%) ⬆️
autosklearn/estimators.py 93.23% <100.00%> (-0.07%) ⬇️
...eline/components/feature_preprocessing/fast_ica.py 91.30% <0.00%> (-6.53%) ⬇️
...mponents/feature_preprocessing/nystroem_sampler.py 85.29% <0.00%> (-5.89%) ⬇️
autosklearn/util/backend.py 91.94% <0.00%> (-2.55%) ⬇️
autosklearn/evaluation/abstract_evaluator.py 92.12% <0.00%> (-0.79%) ⬇️
...ine/components/classification/gradient_boosting.py 93.91% <0.00%> (+0.86%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d761cb3...0aa71c9. Read the comment docs.

@franchuterivera franchuterivera marked this pull request as ready for review June 14, 2021 16:18
@franchuterivera franchuterivera requested a review from mfeurer June 14, 2021 16:19
@mfeurer
Copy link
Contributor

mfeurer commented Jun 30, 2021

Hey, could you please rebase this to the latest changes in the development branch?

@@ -175,6 +175,7 @@ def cli_start_worker(scheduler_file_name):
X_train, X_test, y_train, y_test = \
sklearn.model_selection.train_test_split(X, y, random_state=1)
automl = AutoSklearnClassifier(
delete_tmp_folder_after_terminate=False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, what's the reason for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very good question. I have made it so, that backend is created on a need basis and destroyed after it is no longer needed. We cannot relay on the del method.

If you look a couple of lines below, we do fit_ensemble. If we do not preserve the backend after fit, the backend directory will be removed.

@@ -123,6 +123,7 @@ async def do_work():
# 3. Start the client
with dask.distributed.Client(address=cluster.scheduler_address) as client:
automl = AutoSklearnClassifier(
delete_tmp_folder_after_terminate=False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here?

@@ -47,11 +47,10 @@ def __del__(self):
pass


def test_fit(dask_client, backend):
def test_fit(dask_client):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the overall difference here will be that the temporary directories will be at random while they were given by a pytest fixture before?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is anyone taking care that the temporary directories which are now no longer deleted by Auto-sklearn or the fixture are removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point. If we do delete_tmp_folder_after_terminate=False then they will remain on the temporary directory and will get deleted after the laptop is rebooted. This happens in some test, but not many.

I thought about creating a custom code for deleting this, but as the name is random and did not want to complicate the code.

However I can create a fixture at the begining/end that uses the temporary_directory and clean it if you think we should do this.

@@ -62,6 +62,7 @@ def __call__(self, *args, **kwargs):
get_smac_object_wrapper_instance = get_smac_object_wrapper()

automl = AutoSklearnClassifier(
delete_tmp_folder_after_terminate=False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this now necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because after fit there is no need for the backend to survive, yet the test still wants to go to the temporary directory and check for the correctness of some files.

@franchuterivera
Copy link
Contributor Author

Just to set some context in the PR:

The problem statement is that when calling __del__ the import os is no longer valid. That is, os has already been terminated.

The backend gets deleted as of now when the object is garbage collected. This does not work due to the point above. So this PR tried to move away from garbage collection deletion to delete the backend when is not needed, after fit().

Something that I did not explore is using the atexit from python which might help.

@eddiebergman eddiebergman merged commit d94ff6d into automl:development Aug 6, 2021
eddiebergman added a commit that referenced this pull request Aug 18, 2021
* Make backend deletion robust after fit

* [FIX] smbo unittest

* Output folder is needed for ensemble output

* fit_ensemble also runs in manual spawning

* fit_ensemble also runs in manual spawning

* Rebase artifacts

* Fix rebase

* Missed arg in examples

* Remove rebase artifacts

Co-authored-by: Eddie Bergman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants