
Fix/historical_forecasts callable retrain argument #1675

Merged
merged 36 commits into master on Apr 10, 2023

Conversation

madtoinou
Collaborator

@madtoinou madtoinou commented Mar 27, 2023

Fixes #1655.

Summary

The retrain argument of historical_forecasts() was not properly handled when it was a Callable. To support it appropriately, the following changes have been made:

  • the time indexes that can be used in "training" and "prediction" mode are computed separately; if retrain is a Callable, the longer one is used to build the iterator
  • the retrain function is systematically called before initiating each training round, using the "dataset that would be used for training"
  • the retrain function has a fixed signature with the following positional arguments: [counter, pred_time, train_series, past_covariates, future_covariates]

The reason any value <= 3 masked the problem is that model.min_train_series_length takes the maximum of this constant and the output_chunk_length. Since this value can be larger than model.min_predict_series_length, the Callable retrain was slicing the predictable time index too aggressively.
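As a sketch, a retrain callable matching the fixed signature above could look like the following. The every-fifth-iteration policy is only an illustration, not something prescribed by the PR:

```python
def retrain_func(counter, pred_time, train_series, past_covariates, future_covariates):
    """Example retrain policy for historical_forecasts().

    The five positional arguments follow the fixed signature described
    above; the modulo-based policy itself is just an illustration.
    Returns True when the model should be retrained at this iteration.
    """
    return counter % 5 == 0
```

Such a function would be passed directly as the retrain argument, e.g. `model.historical_forecasts(series, retrain=retrain_func)`.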

Other Information

I added some tests to cover the models with:

  • "lags + future covariates, output_chunk_length = 1"
  • "lags + future covariates, output_chunk_length > 3"
  • "future covariates, output_chunk_length > 3"

But I am not entirely sure of the expected forecast lengths for the last two scenarios; I need to run some additional tests.

Update 10-04-2023 (@dennisbader):

  • move around start and train_length logic to account for all inputs
  • improved error messages for start param sanity checks
  • raises warning when valid start param but not in predictable/trainable index
  • improvements to the way we handle train length, raises warning when too large
  • option to disable warnings
  • sets min samples required for torch models to 1 (only regression models require minimum of 2 samples)
  • update docs for historical_forecasting, backtest, gridsearch, residuals

@codecov-commenter

codecov-commenter commented Mar 28, 2023

Codecov Report

Patch coverage: 90.74% and project coverage change: -0.19 ⚠️

Comparison is base (e4908f7) 94.18% compared to head (e3e90fa) 93.99%.


Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1675      +/-   ##
==========================================
- Coverage   94.18%   93.99%   -0.19%     
==========================================
  Files         125      125              
  Lines       11393    11432      +39     
==========================================
+ Hits        10730    10746      +16     
- Misses        663      686      +23     
Impacted Files Coverage Δ
darts/utils/__init__.py 100.00% <ø> (ø)
darts/utils/timeseries_generation.py 96.15% <ø> (ø)
darts/models/forecasting/ensemble_model.py 90.56% <66.66%> (-1.44%) ⬇️
darts/models/forecasting/forecasting_model.py 94.53% <88.00%> (-1.82%) ⬇️
darts/models/forecasting/dlinear.py 100.00% <100.00%> (ø)
darts/models/forecasting/regression_model.py 97.15% <100.00%> (+0.02%) ⬆️
darts/utils/utils.py 91.24% <100.00%> (+0.72%) ⬆️

... and 10 files with indirect coverage changes



@dennisbader dennisbader marked this pull request as ready for review March 31, 2023 09:23
@dennisbader dennisbader self-requested a review as a code owner March 31, 2023 09:23
Collaborator

@dennisbader dennisbader left a comment


Hi @madtoinou and thanks for getting the retrain callable to work properly 🚀 🥳

I left some minor suggestions, mainly about adapting the error messages a bit, trying to avoid creating new time series as much as we can, and fixes to some potential issues.

Btw, @dumjax and I were working on the last historical_forecasts bug fixes, which should come in parallel with this PR. Then we're finally ready to put everything together for the release, and concentrate on optimizing the method for the next release :)

Thanks for all the great work so far!

dennisbader and others added 7 commits April 2, 2023 19:11
…pport of situation where training is impossible, updated documentation about the start argument of historical_forecasts, create dummy support ts when using only future covariate (first timestamp of the input series is predictable)
…ts outputs for both retrain=True and retrain=False
@madtoinou
Collaborator Author

After a lot of experimentation with the timestamps of the output of historical_forecasts, I identified some bugs:

  • for models using positive lags_future_covariates only: the first predictable timestamp is the first value of the input series, which requires a dummy support ts for the predict method.
  • the timestamp shift (min_timestamp_predict) should be affected only by the lags in the past
  • the unittest with the "manual" boundaries was not allowing enough granularity: the output_chunk_length is relevant only when retrain=True and cannot be used interchangeably with the largest positive lags_future_covariates
  • the sanity checks for historical_forecasts were not supporting models using only positive lags_future_covariates when retrain=False
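To illustrate the first two bullets, here is a minimal sketch (integer index, hypothetical helper name, not Darts' actual implementation) of why the shift to the first predictable timestamp should depend only on lags in the past, i.e. negative lags:

```python
def first_predictable_step(series_start, lags, lags_future_covariates):
    """Return the first step at which a prediction is possible.

    Only negative (past) lags force the prediction window to start later;
    a model with purely positive future-covariate lags can predict from
    the very first timestamp of the input series, as noted above.
    """
    negative_lags = [l for l in lags + lags_future_covariates if l < 0]
    shift = -min(negative_lags) if negative_lags else 0
    return series_start + shift
```

For example, a model with target lags [-3, -1] needs three past steps before its first prediction, while a model with only lags_future_covariates=[1, 2] can predict at the series start.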

@madtoinou madtoinou requested review from dennisbader and dumjax April 4, 2023 09:37
Collaborator

@dennisbader dennisbader left a comment


Thanks @madtoinou , that looks really good :)

I think we're in the final round of review and can merge soon! 🚀

Just had a couple of minor suggestions.

@@ -889,6 +984,25 @@ def historical_forecasts(
# use retrained model if `retrain` is not training every step
model = model if model is not None else self

# slice the series for prediction without retraining
Collaborator


A comment about lines 975 - 983 (the ValueError "retrain is False...")

With the current implementation, a prediction can happen before any training has occurred, even if the retrain callable always returns True.

So for this message to work, we would need a global counter _counter and a train-specific counter _counter_train. The train counter increments each time pred_time is in historical_forecasts_time_index_train, so it is only 0 during the prediction phase before the first possible train index. Then we can check something like below (I'm not sure it covers all the cases yet):

  • catch case when first iteration is prediction and it's before the first possible train index. -> suggest different start date (the first possible date that will work), or retrain=True | int, fit model before.
  • catch case when retrain_func actually returns false in the first possible train iteration -> suggest a different retrain value, changing the function so it returns True in first iteration, train model before
# use retrained model if `retrain` is not training every step
model = model if model is not None else self

# model must be fit before the first prediction
if not _counter and not model._fit_called:
    raise_log(
        ValueError(
            f"model has not been fit before in first predict iteration at prediction point (in time) `{pred_time}` "
            f"Either call `fit()` before `historical_forecasts()`, set `retrain=True`, or use a different "
            f"`start` value. The first possible start value is: {min_timestep_train????}"
        ),
        logger,
    )
if not _counter_train and not model._fit_called:
    raise_log(
        ValueError(
            f"`retrain` is `False` in first train iteration at prediction point (in time) `{pred_time}` "
            f"and the model has not been fit before. Either call `fit()` before "
            f"`historical_forecasts()`, or use a different `retrain` value / modify the function "
            f"to return `True` in first iteration."
        ),
        logger,
    )

So we would have to remove the parts about

Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reformulating the two cases for other reviewers:

  • the model is not trained for the first "predictable" timestamp -> user should either call fit() before historical_forecasts, modify the retrain argument so that it returns True at least once before/at this timestamp or change the start value to the suggested timestamp so that the model training can be triggered before the prediction round.
  • retrain is False for the first "trainable" timestamp and the model was not trained before historical_forecasts -> user must modify the retrain argument.
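A compact, hedged sketch of these two checks in plain Python (all names are hypothetical; the real implementation lives in forecasting_model.py and may differ):

```python
def check_fit_state(counter, counter_train, fit_called, pred_time):
    """Sketch of the two error cases above.

    counter counts every historical-forecast iteration; counter_train
    counts only iterations whose pred_time lies in the trainable time
    index, so it stays 0 during any prediction-only phase before the
    first trainable point.
    """
    if fit_called:
        return  # a model that was fit at some point can always predict
    if counter == 0:
        # first iteration is a prediction and the model was never fit
        raise ValueError(
            f"Model has not been fit before the first prediction at {pred_time}. "
            "Call fit() beforehand, set retrain=True, or use a later start value."
        )
    if counter_train == 0:
        # still in the prediction-only phase; no training was possible yet
        raise ValueError(
            f"No training was possible before prediction point {pred_time} and "
            "the model has not been fit. Use a different start or retrain value."
        )
```

This only sketches the gating logic; the suggested error messages in the review comment above carry the actual wording.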

Collaborator

@dennisbader dennisbader left a comment


Looks good 🚀 💯 Great job @madtoinou !

@dennisbader dennisbader merged commit ebb9eb6 into master Apr 10, 2023
@dennisbader dennisbader deleted the fix/hist-forecast-callable-retrain branch April 10, 2023 13:29
alexcolpitts96 pushed a commit to alexcolpitts96/darts that referenced this pull request May 31, 2023
* fix: better support for the Callable retrain argument in historical_forecast

* fix: remove unused util function

* fix: adding tests to cover historical forecast without lags but fut cov, and lags + fut cov + output_chunk_length > 3

* fix: updating the local model tests to run with the new constraints on the retrain argument of historical_forecasts

* Apply suggestions from code review

Co-authored-by: Dennis Bader <[email protected]>

* fix: addressing review comments

* feat: testing retrain_func returning str and int

* feat: adding utils method to get lags in the past only

* fix: properly computing the timestamp shift for prediction, better support of situation where training is impossible, updated documentation about the start argument of historical_forecasts, create dummy support ts when using only future covariate (first timestamp of the input series is predictable)

* feat: added unittest checking the start and end of historical_forecasts outputs for both retrain=True and retrain=False

* feat: addressing review comments, better handling of the potential shift between the predictable and trainable timestamps, more detailed error messages

* better handling of the TORCH_AVAILABLE variable (previously, environments without torch were not running any historical_forecasts tests)

* fix: better granularity in the unittests error catching

* add some comments to hist forecasts test for interpretability

* minor refactoring

* refactor start handling in historical forecasts

* fix residuals for models that do not require a minimum of two training samples

* fix an issue with new logic of retrain false at beginning

* fix expected forecasting lengths for TFMs which require only 1 train sample

* move from start error to warning

* handle train length better

* update documentation of hist fc, backtest, and gridsearch

* revert to old sanity checks for start

* improve start sanity check handling and error messages

* fix small condition type in unit tests

* move regression models out of flavor check in unit tests

* make warnings default and use default start if outside of hist fc index

* move start reset in historical forecasting

---------

Co-authored-by: dennisbader <[email protected]>

Successfully merging this pull request may close these issues.

[BUG] Historical forecast with retrain=Callable forecasts wrong dates
3 participants