First of all, thank you for your work and the really interesting autosklearn package.

In `AutoSklearnRegressor` (and possibly `AutoSklearnClassifier` too), when `memory_limit` is low enough to force auto-sklearn to decimate the training set, a resampling strategy like GroupKFold fails because the `groups` argument, a vector with one group index per example in the training set, is not decimated accordingly. In essence, the check in `auto-sklearn/autosklearn/evaluation/train_evaluator.py` (line 994 at 275d0d6) fails, because `y.shape[0]` refers to the decimated training set, while `np.shape(self.resampling_strategy_args['groups'])[0]` refers to the original (non-decimated) training set.

As a consequence, for large training sets this problem occurs essentially always, preventing the use of group-based resampling strategies.
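To make the failure mode concrete, here is a minimal, self-contained sketch (not auto-sklearn's actual code; `check_groups` is a hypothetical stand-in for the length check at the line above) showing how subsampling `y` without subsampling `groups` trips the consistency check:

```python
# Hypothetical stand-in for the consistency check in train_evaluator.py:
# the user-supplied `groups` vector must have one entry per training example.
def check_groups(y, groups):
    if len(y) != len(groups):
        raise ValueError(
            f"groups has length {len(groups)} but y has length {len(y)}"
        )

y_full = list(range(10_000))          # original training targets
groups = [i // 100 for i in y_full]   # one group id per example

# auto-sklearn decimates y when memory_limit is hit, but not groups:
y_sub = y_full[:2_000]

try:
    check_groups(y_sub, groups)
except ValueError as e:
    print(e)  # lengths no longer match
```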
While we haven't directly fixed this issue, you can now disable dataset compression or get finer-grained control over how it's done. This was implemented in #1341 and documented better in #1386. It is in the development branch and will be in our next release. I will close this for now, as keeping track of indices across the multitude of possible resampling implementations would be quite a difficult task and out of scope for what we can manage.