Scaling/Rescaling needed for regression outputs #73

EyalWirsansky · 2020-10-23T12:31:40Z

Is your feature request related to a problem? Please describe.
Currently, scaling of training data (via transformations such as MeanStdDevTransformation) applies only to the features, not to the outputs. However, regression outputs need to be scaled as well for some models to train properly. Specifically, this causes the RBF SVM to perform much worse than in scikit-learn, where it's easy to scale the entire dataset.

Describe the solution you'd like
Adding the option to scale the output of a training dataset in addition to the features when training a regressor. This also means that the output of the regressor will be inverse-scaled when performing predictions.
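
To illustrate the idea, here is a minimal sketch in plain Python rather than Tribuo's API. The helper names (`fit_target_scaler`, `scale`, `inverse_scale`) are hypothetical and only show the round trip: standardise the targets with statistics computed on the training set, train on the scaled values, then map predictions back.

```python
from statistics import fmean, pstdev


def fit_target_scaler(ys):
    """Compute the mean and population std-dev of the training targets.

    Falls back to a std-dev of 1.0 when all targets are identical,
    so that scaling never divides by zero.
    """
    mean = fmean(ys)
    std = pstdev(ys, mu=mean)
    return mean, (std if std > 0 else 1.0)


def scale(ys, mean, std):
    """Standardise targets to zero mean and unit variance."""
    return [(y - mean) / std for y in ys]


def inverse_scale(zs, mean, std):
    """Map scaled values (e.g. model predictions) back to the original units."""
    return [z * std + mean for z in zs]


# Illustrative values only, not taken from any real dataset.
train_y = [100.0, 200.0, 300.0]
mean, std = fit_target_scaler(train_y)

# The regressor would be trained on `scaled` instead of `train_y`.
scaled = scale(train_y, mean, std)

# Predictions made in the scaled space are converted back before being returned.
restored = inverse_scale(scaled, mean, std)
```

Keeping the mean and standard deviation alongside the trained model is what would let the inverse step happen automatically at prediction time.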

Describe alternatives you've considered
'Manually' scaling and inverse-scaling outside Tribuo's training/prediction flow. This is cumbersome, and the scaling would not be recorded in the provenance.

Additional context

Craigacp · 2020-10-27T17:53:59Z

Thanks for the report. There are a few ways we could integrate this support. Wrapping it via a StandardisingTrainer similar to the TransformTrainer would induce another dataset copy, whereas integrating it directly into the affected regression trainers would be a bunch more code. We'll have a look and figure out which way seems most efficient.

Craigacp · 2020-11-06T23:34:34Z

There's a prototype for LibSVM models here: https://github.com/oracle/tribuo/tree/regression-rescaling. We're currently trying to figure out if there is a way to build that into all regression models without too much repeated code (and whether it's even necessary for things like XGBoost).

Craigacp · 2021-03-04T16:27:46Z

After some checking it didn't seem necessary in models other than LibSVM regressors, so we just added an option to that trainer.

EyalWirsansky added the enhancement label Oct 23, 2020

Craigacp mentioned this issue Feb 8, 2021

Adds support for standardising regression values in LibSVM #113

Merged

Craigacp closed this as completed in #113 Mar 3, 2021
