Add faq #1109

mfeurer · 2021-03-26T16:50:26Z

No description provided.

codecov · 2021-03-26T17:25:38Z

Codecov Report

Merging #1109 (b675954) into development (79627e1) will decrease coverage by 0.01%.
The diff coverage is n/a.

@@               Coverage Diff               @@
##           development    #1109      +/-   ##
===============================================
- Coverage        85.82%   85.80%   -0.02%     
===============================================
  Files              137      137              
  Lines            10625    10625              
===============================================
- Hits              9119     9117       -2     
- Misses            1506     1508       +2

Impacted Files	Coverage Δ
...ature_preprocessing/select_rates_classification.py	`85.91% <0.00%> (-1.41%)`	⬇️
...ine/components/classification/gradient_boosting.py	`91.30% <0.00%> (-0.87%)`	⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 79627e1...b675954.

KEggensperger · 2021-04-13T10:26:42Z

doc/faq.rst

+``forkserver`` and ``spawn``. The default ``fork`` copies the whole process memory into the
+subprocess. If the main process already uses 1.5GB of main memory and we apply a 3GB memory
+limit to Auto-sklearn, it will only be able to use 1.5GB of that. We would have loved to use
+``forkserver`` or ``spawn`` instead, which both don't suffer from this issue (and have some


What is "this issue"? The linked page is about deadlocks when using multiprocessing and how to solve them.
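
For readers following the thread, here is a minimal sketch of how the three start methods named in the diff can be selected with the standard library. `pick_start_method` is a hypothetical helper written for illustration. It is not part of Auto-sklearn, and the preference order is an assumption.

```python
import multiprocessing


def pick_start_method(preferred=("forkserver", "spawn", "fork")):
    """Return the first preferred start method that this platform supports.

    Hypothetical helper for illustration only; not an Auto-sklearn API.
    """
    available = multiprocessing.get_all_start_methods()
    for method in preferred:
        if method in available:
            return method
    raise RuntimeError("no usable multiprocessing start method found")


method = pick_start_method()
# A context object is local to the caller: it does not change the
# process-wide default the way multiprocessing.set_start_method() does.
ctx = multiprocessing.get_context(method)
print(method, type(ctx).__name__)
```

Using a context instead of `multiprocessing.set_start_method` keeps the choice local, which matters when a library and the user's script want different start methods.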

KEggensperger · 2021-04-13T10:28:39Z

doc/faq.rst

+memory limit and a time limit. To start such a process, Python gives three options: ``fork``,
+``forkserver`` and ``spawn``. The default ``fork`` copies the whole process memory into the
+subprocess. If the main process already uses 1.5GB of main memory and we apply a 3GB memory
+limit to Auto-sklearn, it will only be able to use 1.5GB of that. We would have loved to use


Suggestion: replace "it" with an explicit subject, e.g. "executing a machine learning algorithm is limited to at most 1.5GB".
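
The arithmetic behind the quoted example can be sketched as follows. This assumes, as the FAQ text states, that with ``fork`` the memory already held by the main process counts against the limit; `usable_memory_mb` is a hypothetical name and not an Auto-sklearn function.

```python
def usable_memory_mb(memory_limit_mb: int, parent_usage_mb: int) -> int:
    """Memory left for the algorithm when the parent's usage counts against the limit.

    Illustrative assumption taken from the FAQ text, not a measurement.
    """
    return max(memory_limit_mb - parent_usage_mb, 0)


# The numbers from the FAQ: a 3GB limit with a main process already using 1.5GB.
headroom = usable_memory_mb(memory_limit_mb=3072, parent_usage_mb=1536)
print(f"{headroom} MB left for the machine learning algorithm")
```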

KEggensperger · 2021-04-13T10:29:26Z

doc/faq.rst

+
+There are now two possible solutions:
+
+1. Use parallel Auto-sklearn: if you use Auto-sklean in parallel, it defaults to ``forkserver``


Is there a link to an example, or a hint on how to do this? Is it just a matter of setting the n_jobs flag?

KEggensperger · 2021-04-13T10:29:45Z

doc/faq.rst

+
+We therefore suggest using one of the above settings by default.
+
+Auto-sklearn is extremely memory hungry in a sequential setting


This is the same title as the one above.

KEggensperger · 2021-04-13T10:31:28Z

doc/faq.rst

+
+When running Auto-sklearn in a parallel setting it starts new processes for evaluating machine
+learning models using the ``forkserver`` mechanism. If not all code in the main script is guarded
+by ``if __name__ == "__main__"`` it is executed for each subprocess. If now part of the code that


The sentence from "If now" to "your RAM" is hard to parse. Suggestion:
"If the code loading your dataset is not guarded, it is executed for every evaluation of a machine learning algorithm and thus blocks your RAM."
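
To illustrate the guard discussed here, the sketch below approximates what a ``spawn`` or ``forkserver`` child does when it re-imports the user's script under the module name `__mp_main__`. It does not start any processes: `runpy.run_path` stands in for the re-import, and the two script bodies and the helper `loads_on_reimport` are hypothetical.

```python
import os
import runpy
import tempfile

# Hypothetical user script WITHOUT a guard: the "load" line runs on every import.
UNGUARDED = """
calls = []
calls.append("load dataset")
"""

# Hypothetical user script WITH a guard: the "load" line runs only when the
# script is launched directly, i.e. when __name__ == "__main__".
GUARDED = """
calls = []
if __name__ == "__main__":
    calls.append("load dataset")
"""


def loads_on_reimport(source: str) -> int:
    """Count how often the 'load' line runs when the script is re-imported.

    The script is executed under the name "__mp_main__" to mimic the
    re-import performed in a child process.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "user_script.py")
        with open(path, "w") as fh:
            fh.write(source)
        namespace = runpy.run_path(path, run_name="__mp_main__")
    return len(namespace["calls"])


unguarded_loads = loads_on_reimport(UNGUARDED)
guarded_loads = loads_on_reimport(GUARDED)
print(unguarded_loads, guarded_loads)
```

In the unguarded script the dataset would be loaded again in each subprocess, which is the RAM problem the FAQ entry describes.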

KEggensperger · 2021-04-13T11:15:01Z

doc/faq.rst

+-----------------------------------
+
+In certain cases, for example for debugging, it can be helpful to limit the number of
+models to try. We do not provide this as an argument in the API as we believe that it


"models" -> "model evaluations"

KEggensperger · 2021-04-13T11:20:36Z

doc/manual.rst

@@ -77,6 +77,8 @@ For a full list please have a look at the source code (in `autosklearn/pipeline/
  * `Regressors <https://github.com/automl/auto-sklearn/tree/master/autosklearn/pipeline/components/regression>`_
  * `Preprocessors <https://github.com/automl/auto-sklearn/tree/master/autosklearn/pipeline/components/feature_preprocessing>`_

+We do also provide an example `on how to restrict the classifiers to search over <examples/80_advanced/example_interpretable_models.html>`_.


I believe this is in "40_advanced".

This was referenced Mar 26, 2021

White box only AutoML with auto-sklearn #1033

Closed

Document list of datasets meta learning datasets #502

Closed

KEggensperger approved these changes Apr 13, 2021

mfeurer mentioned this pull request Apr 14, 2021

ValueError: Dummy prediction failed with run state StatusType.MEMOUT #1120

Closed

mfeurer added 6 commits May 21, 2021 14:03

WIP: add FAQ

c111acb

undo accidental change

b7cd9ba

check in actual FAQ

c57b8bd

some more answers

2d94991

add section to FAQ

ab56170

Update FAQ based on feedback

75aed82

mfeurer force-pushed the ADD_FAQ branch from a5700b7 to 75aed82 Compare May 21, 2021 12:03

fix link

227365a

mfeurer requested a review from KEggensperger May 21, 2021 12:25

KEggensperger reviewed May 22, 2021

incorporate feedback

b891c22

mfeurer requested a review from KEggensperger May 23, 2021 15:20

KEggensperger approved these changes May 27, 2021

include feedback

b675954

mfeurer merged commit 5a72e52 into development May 27, 2021

mfeurer deleted the ADD_FAQ branch May 27, 2021 13:28

github-actions bot pushed a commit that referenced this pull request May 27, 2021

Matthias Feurer: Add faq (#1109)

ecf7527
