Training Data Selection Logic changed #81
Comments
Hey, yes, you're right that we haven't updated the guide, sorry about that. We've currently removed the time-series validation in preparation for our upcoming release: the "rolling window update" for continuous reporting. Previously you could use, for example, 80% of the data to build the model and hold out 20% as an out-of-sample test set, and we reported the R-squared on that test set. However, when the results are updated continuously, we always want the latest data included in the model, which conflicts with time-series validation. Ultimately, we see Robyn as more of a decomposition tool than a time-series forecaster, and the potential overfitting issue is already addressed by ridge regression. We have therefore removed the time-series validation, so at the moment the model is built on the entire dataset and only the "train R-squared" is reported. With the upcoming update, users will be able to select the modelling window and the cadence of updates. Hope this helps.
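To make the difference concrete, here is a minimal sketch (not Robyn's actual code; the data and feature names are invented) contrasting the old chronological 80/20 holdout, which reports an out-of-sample R-squared, with the current behaviour of fitting a ridge model on the entire window and reporting only the train R-squared:

```python
# Hypothetical sketch, not Robyn's implementation: contrasts a chronological
# 80/20 holdout (old behaviour as described above) with fitting on the full
# dataset and reporting only the train R-squared (current behaviour).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n_weeks = 200
X = rng.normal(size=(n_weeks, 5))  # e.g. weekly media spend features (made up)
y = X @ np.array([3.0, 1.5, 0.0, 2.0, 0.5]) + rng.normal(scale=2.0, size=n_weeks)

# Old behaviour: chronological split, report both train and out-of-sample R^2
split = int(0.8 * n_weeks)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
model = Ridge(alpha=1.0).fit(X_train, y_train)
print("train R^2:", r2_score(y_train, model.predict(X_train)))
print("test  R^2:", r2_score(y_test, model.predict(X_test)))

# Current behaviour: fit on the entire window, report train R^2 only
full_model = Ridge(alpha=1.0).fit(X, y)
print("full-data train R^2:", r2_score(y, full_model.predict(X)))
```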
Yes, that helps. Thank you! Looking forward to the rolling window update feature!
Hi! We have created several Robyn models and our customers are a bit worried about the lack of a train/test split or cross-validation. I agree that ridge regression helps prevent overfitting, but I am not sure it is enough on its own. I wonder whether Nevergrad should optimize towards the R-squared on test data. Even though Robyn is a decomposition model, several functionalities involving future estimates are key and already implemented (e.g. the Budget Allocator), so there is predictive utility in the model. Customers prefer a model that can generalize, or at least one where we can show that it does. Are you considering including that functionality again, perhaps as an option? Thank you for your amazing work; we are really happy with Robyn and hope to start using it soon.
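For illustration only, the generic idea behind "optimize towards R-squared in test data" could look like the sketch below: candidate hyperparameters (here just the ridge penalty) are scored on a chronological holdout rather than on the training data. This is not Robyn's or Nevergrad's API, just a plain grid search standing in for whatever optimizer is used:

```python
# Hypothetical sketch, not Robyn's or Nevergrad's API: select a ridge penalty
# by out-of-sample R^2 on a chronological holdout, i.e. the commenter's
# "optimize towards test R-squared" idea in its simplest generic form.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)
n_weeks = 200
X = rng.normal(size=(n_weeks, 5))  # invented features for illustration
y = X @ np.array([3.0, 1.5, 0.0, 2.0, 0.5]) + rng.normal(scale=2.0, size=n_weeks)

split = int(0.8 * n_weeks)          # keep the split chronological, no shuffling
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

best_alpha, best_test_r2 = None, -np.inf
for alpha in [0.01, 0.1, 1.0, 10.0, 100.0]:
    model = Ridge(alpha=alpha).fit(X_train, y_train)
    test_r2 = r2_score(y_test, model.predict(X_test))
    if test_r2 > best_test_r2:
        best_alpha, best_test_r2 = alpha, test_r2

print(f"selected alpha={best_alpha} with out-of-sample R^2={best_test_r2:.3f}")
```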
Original issue
The training size logic in the code seems to have been modified and no longer matches the Step-by-Step guide: https://facebookexperimental.github.io/Robyn/docs/step-by-step-guide
Can you give some insight into how the training data selection works now?