Document list of meta-learning datasets #502
Comments
They have indeed been updated. The list of 133 OpenML task IDs is in this file: https://github.com/automl/auto-sklearn/blob/master/autosklearn/metalearning/files/accuracy_binary.classification_dense/algorithm_runs.arff
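For anyone who wants to pull the task IDs out of such a file programmatically, here is a minimal sketch using only the standard library. It assumes the IDs live in a column named `instance_id` and that the data rows are plain comma-separated values. Both are assumptions about the file layout, not something stated in this thread, and the sample rows are invented for illustration.

```python
import csv
import io


def read_arff_column(arff_text, column="instance_id"):
    """Return the distinct values of one column from ARFF text, in file order.

    Simplified reader: handles unquoted attribute names and comma-separated
    data rows only. The default column name is an assumption.
    """
    attributes = []
    data_lines = []
    in_data = False
    for raw in arff_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        lowered = line.lower()
        if in_data:
            data_lines.append(line)
        elif lowered.startswith("@attribute"):
            attributes.append(line.split()[1])  # second token is the name
        elif lowered.startswith("@data"):
            in_data = True
    index = attributes.index(column)
    seen = []
    for row in csv.reader(io.StringIO("\n".join(data_lines))):
        value = row[index].strip()
        if value not in seen:
            seen.append(value)
    return seen


# Hypothetical file content, invented for this example.
SAMPLE = """\
@RELATION hypothetical_algorithm_runs
@ATTRIBUTE instance_id STRING
@ATTRIBUTE repetition NUMERIC
@ATTRIBUTE algorithm STRING
@ATTRIBUTE accuracy NUMERIC
@ATTRIBUTE runstatus {ok, timeout}
@DATA
2120,1,7,0.12,ok
2120,1,9,0.10,ok
75,1,7,0.30,ok
"""

task_ids = read_arff_column(SAMPLE)
```

To use it on the real file, read the downloaded `algorithm_runs.arff` into a string and pass it in; if the column is named differently there, adjust the `column` argument.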
Do you have a different set of datasets for meta-learning on regression problems?
Also, the folder seems to contain a different set of tasks for each metric. Was the meta-learner trained on all of these task sets?
There is currently no meta-data for regression.
No, we trained Auto-sklearn with balanced accuracy on each dataset separately. Then, for each combination of metric, target problem (binary, multiclass) and data structure (dense, sparse), we collected the legal configurations and, for each dataset, chose the one that performed best on the metric of interest.
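A rough sketch of that selection step, as I understand it from the description above, might look like the following. The data layout, the function name `best_legal_configs`, the configuration names, and all scores are hypothetical; this is not the code auto-sklearn uses to build its meta-data.

```python
def best_legal_configs(runs, legal, metric):
    """Pick, per dataset, the legal configuration with the best score.

    runs:  {dataset: {config: {metric: score}}}, where higher is better
    legal: set of configuration names valid for the target scenario
    """
    best = {}
    for dataset, configs in runs.items():
        # Keep only configurations that are legal and have a score for the metric.
        candidates = {
            name: scores[metric]
            for name, scores in configs.items()
            if name in legal and metric in scores
        }
        if candidates:
            best[dataset] = max(candidates, key=candidates.get)
    return best


# Hypothetical scores, invented purely for illustration.
RUNS = {
    "task_a": {
        "cfg_1": {"accuracy": 0.90, "f1_weighted": 0.70},
        "cfg_2": {"accuracy": 0.85, "f1_weighted": 0.80},
        "cfg_3": {"accuracy": 0.95, "f1_weighted": 0.90},
    },
    "task_b": {
        "cfg_1": {"accuracy": 0.60, "f1_weighted": 0.65},
        "cfg_2": {"accuracy": 0.75, "f1_weighted": 0.55},
    },
}
LEGAL_FOR_SCENARIO = {"cfg_1", "cfg_2"}  # cfg_3 is assumed not legal here

by_accuracy = best_legal_configs(RUNS, LEGAL_FOR_SCENARIO, "accuracy")
by_f1 = best_legal_configs(RUNS, LEGAL_FOR_SCENARIO, "f1_weighted")
```

In this toy example the two metrics cover the same tasks but end up with different configurations per task, and `cfg_3` is never chosen because it is filtered out as not legal.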
I'm still unclear. For example, if I tell auto-sklearn to use 'f1_weighted' on a multiclass problem, will it set the hyperparameters to those stored for the closest dataset in the file below? Assuming that is how it works, I am also confused about why the tasks seem to be the same for f1_weighted binary and f1_weighted multiclass: the tasks point to multiclass problems, yet they appear in the binary files as well. For example, task 2120, which points to dataset 182, is a multiclass dataset, yet the task is in both ARFF files. https://github.com/automl/auto-sklearn/blob/master/autosklearn/metalearning/files/f1_weighted_multiclass.classification_dense/algorithm_runs.arff
@mfeurer Sorry, do you mean there is currently no meta-learning for regression?
Yes (assuming that your data is dense).
The tasks are the same for each metric. The difference is in the configurations. Configurations are selected for each combination of target metric and dataset type. Also, only configurations valid for a given task are chosen.
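To make the "closest dataset" lookup discussed above concrete, here is a small sketch. It assumes each task is described by a numeric metafeature vector and that closeness is measured with L1 distance; the vectors, task names, and configuration names are all hypothetical, and the real distance computation in auto-sklearn may differ.

```python
def nearest_tasks(target, known, k=1):
    """Return the k task IDs whose metafeature vectors are closest to target.

    Uses L1 (Manhattan) distance, which is an assumption for this sketch.
    """
    def l1(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))

    return sorted(known, key=lambda task: l1(known[task], target))[:k]


# Hypothetical metafeatures and per-task best configurations for one
# (metric, problem type, data structure) combination.
METAFEATURES = {
    "task_a": [0.1, 3.0],
    "task_b": [0.9, 10.0],
    "task_c": [0.2, 4.0],
}
BEST_CONFIG = {"task_a": "cfg_2", "task_b": "cfg_1", "task_c": "cfg_5"}

# Metafeatures of the new dataset, also invented.
neighbours = nearest_tasks([0.15, 3.2], METAFEATURES, k=2)
suggested = [BEST_CONFIG[task] for task in neighbours]
```

Because the task set is shared across metrics, only the `BEST_CONFIG` table would change when a different metric is requested; the neighbour search itself stays the same in this sketch.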
This makes a lot of sense and would be a great addition to the documentation. Thank you!
This question will be documented in the upcoming FAQ (#1109).
Hello!
I was looking through the documentation and could not find the list of datasets that were used to train the meta-learning feature of auto-sklearn. The paper supplement lists a set of datasets but I was wondering if those have been updated. (http://ml.informatik.uni-freiburg.de/papers/15-NIPS-auto-sklearn-supplementary.pdf)
@mfeurer