
Refactored fit() interface #1039

Merged
merged 21 commits into from
Dec 7, 2022

Conversation

karl-richter
Collaborator

@karl-richter karl-richter commented Dec 5, 2022

🔬 Background

  • We want to provide a user with the ability to activate / deactivate some of the non-required features during training. This includes checkpointing, logging of progress via the progress bar as well as the calculation of metrics during training.
  • In the initial version of NeuralProphet, the minimal mode used a separate training function alongside _train(), called _train_minimal(), that did not contain any of the above-mentioned features and was therefore "minimal".
  • With the transition to Lightning, the training procedure is abstracted into the model and a lot simpler. Thus, we removed the _train_minimal() function and added minimal as a parameter to _train().
  • With the performance optimization of Lightning, the demand arose to control the training features individually. Thus, this PR supports controlling this behavior more granularly via the progress, metrics, and checkpointing parameters. The minimal mode persists, but only as a shortcut that sets the other parameters to false.
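The shortcut behavior described above can be sketched as follows. This is a minimal illustration only; resolve_training_features and its defaults are hypothetical, not NeuralProphet's actual internals:

```python
# Hypothetical sketch: `minimal=True` acts as a shortcut that disables the
# granular training features (progress bar, metrics, checkpointing).

def resolve_training_features(minimal=False, progress="bar",
                              metrics=None, checkpointing=False):
    """Resolve the granular flags, letting `minimal` turn all of them off."""
    if minimal:
        # minimal mode overrides the individual settings
        return {"progress": None, "metrics": False, "checkpointing": False}
    return {
        "progress": progress,
        # metrics default to enabled unless explicitly set
        "metrics": True if metrics is None else metrics,
        "checkpointing": checkpointing,
    }
```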

🔮 Key changes

  • Added a deprecation warning for the collect_metrics parameter in __init__, since going forward this information can only be provided to the fit() method
  • Added the metrics and checkpointing parameters to fit() and subsequent functions
  • Small refactorings and clean-ups
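The deprecation path for collect_metrics could look roughly like the sketch below. This is an illustrative stand-in only; the function name and exact warning message are hypothetical, not the PR's actual code:

```python
import warnings

def init_with_deprecated_collect_metrics(collect_metrics=None):
    """Toy stand-in for NeuralProphet.__init__ illustrating the deprecation."""
    if collect_metrics is not None:
        # warn but keep honoring the old parameter for now
        warnings.warn(
            "`collect_metrics` in __init__ is deprecated; "
            "pass `metrics` to fit() instead.",
            DeprecationWarning,
        )
    return collect_metrics
```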

📋 Review Checklist

  • I have performed a self-review of my own code.
  • I have commented my code, added docstrings and data types to function definitions.
  • I have added pytests to check whether my feature / fix works.

Please make sure to follow our best practices in the Contributing guidelines.

@karl-richter
Collaborator Author

Hey @ourownstory @noxan, this PR would bring some interface changes (remove minimal, add metrics and checkpointing). Should we provide the metrics and checkpointing args in the NeuralProphet constructor or in the fit() function?

@karl-richter karl-richter self-assigned this Dec 5, 2022
@noxan noxan added this to the Release 0.5.0 milestone Dec 5, 2022
@noxan
Collaborator

noxan commented Dec 5, 2022

@karl-richter I like your approach, and I agree with moving metrics and checkpointing to where they actually matter (the fit function). It would also make it very clear that if people disable metrics, they cannot expect them as a return value.

@karl-richter
Collaborator Author

karl-richter commented Dec 6, 2022

Interface Change
As discussed with @noxan and @ourownstory, we changed the interface of the fit() method to prepare for functionalities like continue_training that require changing the number of epochs etc. between training runs. Added a deprecation warning to collect_metrics in __init__.

# 0.4.2
def fit(
        self,
        df,
        freq="auto",
        validation_df=None,
        progress="bar",
        minimal=False
)

# 0.5.0
def fit(
        self,
        df: pd.DataFrame,
        freq: str = "auto",
        validation_df: pd.DataFrame = None,
        # Control the training process (required once we support continuous training, since we might just want to train few additional epochs)
        epochs: int = None,
        batch_size: int = None,
        learning_rate: float = None,
        early_stopping: bool = False,  # new with Lightning
        # Control the verbosity of the training, `minimal=True` deactivates all of the below        
        minimal: bool = False,
        metrics: bool = None,  # moved from __init__
        progress="bar", 
        checkpointing: bool = False,  # new with Lightning
        # Activate continue training
        continue_training: bool = False,  # new with Lightning
        num_workers=0,  # distributed training
    )
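To make continue_training possible, each fit() call needs to be able to override the stored training configuration. The sketch below shows that override pattern in a self-contained way; apply_fit_overrides is a hypothetical helper, not part of NeuralProphet:

```python
# Illustrative sketch: per-call overrides in fit() update the stored
# training config, the prerequisite for continuing training with e.g.
# a different number of epochs.

def apply_fit_overrides(config, epochs=None, batch_size=None,
                        learning_rate=None):
    """Return a new config dict with any non-None override applied."""
    updated = dict(config)
    overrides = {"epochs": epochs, "batch_size": batch_size,
                 "learning_rate": learning_rate}
    for key, value in overrides.items():
        if value is not None:
            updated[key] = value
    return updated
```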

@codecov-commenter

codecov-commenter commented Dec 6, 2022

Codecov Report

Merging #1039 (de8686d) into main (9ead864) will decrease coverage by 0.21%.
The diff coverage is 81.48%.

@@            Coverage Diff             @@
##             main    #1039      +/-   ##
==========================================
- Coverage   90.20%   89.98%   -0.22%     
==========================================
  Files          21       21              
  Lines        4757     4783      +26     
==========================================
+ Hits         4291     4304      +13     
- Misses        466      479      +13     
Impacted Files Coverage Δ
neuralprophet/configure.py 90.07% <ø> (-0.04%) ⬇️
neuralprophet/utils_metrics.py 65.21% <ø> (ø)
neuralprophet/forecaster.py 87.87% <74.41%> (-0.68%) ⬇️
neuralprophet/utils.py 81.21% <75.00%> (-1.53%) ⬇️
neuralprophet/logger.py 97.36% <100.00%> (-2.64%) ⬇️
neuralprophet/time_net.py 90.80% <100.00%> (+0.01%) ⬆️


@github-actions

github-actions bot commented Dec 6, 2022

cbca032

Model Benchmark

Benchmark Metric main current diff
AirPassengers MAE_val 15.2698 15.2698 0.0%
AirPassengers RMSE_val 19.4209 19.4209 0.0%
AirPassengers Loss_val 0.00195 0.00195 0.0%
AirPassengers MAE 9.82902 9.82902 0.0%
AirPassengers RMSE 11.7005 11.7005 0.0%
AirPassengers Loss 0.00056 0.00056 0.0%
AirPassengers time 5.63621 4.44 -21.22% 🎉
PeytonManning MAE_val 0.64636 0.64636 0.0%
PeytonManning RMSE_val 0.79276 0.79276 0.0%
PeytonManning Loss_val 0.01494 0.01494 0.0%
PeytonManning MAE 0.42701 0.42701 0.0%
PeytonManning RMSE 0.57032 0.57032 0.0%
PeytonManning Loss 0.00635 0.00635 0.0%
PeytonManning time 15.2456 12.11 -20.57% 🎉
YosemiteTemps MAE_val 1.72948 1.72949 0.0%
YosemiteTemps RMSE_val 2.27386 2.27386 0.0%
YosemiteTemps Loss_val 0.00096 0.00096 0.0%
YosemiteTemps MAE 1.45189 1.45189 0.0%
YosemiteTemps RMSE 2.16631 2.16631 0.0%
YosemiteTemps Loss 0.00066 0.00066 0.0%
YosemiteTemps time 118.759 104.7 -11.84% 🎉
Model training plots: [training plots for PeytonManning, YosemiteTemps, and AirPassengers omitted]

@karl-richter karl-richter changed the title Refactored minimal mode Refactored fit() interface Dec 6, 2022
@karl-richter karl-richter added the status: needs review PR needs to be reviewed by Reviewer(s) label Dec 7, 2022
@karl-richter
Collaborator Author

Added preparation for distributed training on request by @judussoari. This includes the num_workers param in fit()
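As a rough illustration of how such a parameter might flow through, the sketch below assembles torch-style DataLoader keyword arguments from a num_workers value. Both dataloader_kwargs and the persistent_workers choice are hypothetical, not NeuralProphet's actual code:

```python
# Hypothetical helper: forward a `num_workers` argument from fit() to the
# keyword arguments of a torch DataLoader for parallel data loading.

def dataloader_kwargs(batch_size, num_workers=0):
    """Assemble DataLoader keyword arguments from fit() parameters."""
    kwargs = {"batch_size": batch_size, "shuffle": True,
              "num_workers": num_workers}
    if num_workers > 0:
        # keep worker processes alive between epochs to avoid respawn cost
        kwargs["persistent_workers"] = True
    return kwargs
```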

@noxan
Collaborator

noxan commented Dec 7, 2022

@karl-richter Many thanks for the great work. I'm still working on the review. For future pull requests it would be great to split changes like these into several smaller PRs, as it is hard to review everything at once and especially to keep track of which changes relate to what:

  1. Replace minimal mode by checkpoints and metrics
  2. Refactor fit method to allow continue training with other parameters
  3. Prepare support for num_workers
  4. A changed checkpoints implementation
  5. Moving the progress bar implementation
  6. ...

Happy to have a chat with you to explain this in detail, I'll try to conclude the review soon :)

@noxan
Collaborator

noxan commented Dec 7, 2022

@karl-richter Overall a great set of changes. Might be best to have a short call on this. We should consider extracting changes into separate PRs and then review and merge them individually.

Especially with the release of 0.5.0 scheduled today we should focus on changes necessary to release the current version and introduce further changes only later.

@noxan noxan added status: needs update PR has outstanding comment(s) or PR test(s) that need to be resolved and removed status: needs review PR needs to be reviewed by Reviewer(s) labels Dec 7, 2022
@karl-richter
Collaborator Author

@noxan I addressed most of the comments:

  • renamed metrics.py to utils_metrics.py
  • renamed collect_metrics to metrics_enabled in time_net.py
  • renamed checkpointing to checkpointing_enabled (and similar for metrics, progress_bar etc.)
  • fixed non-pythonic return in _train() (no more if else for return)
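The last item refers to a pattern like the following; both toy functions are illustrative only, not the actual _train() code:

```python
# Before: branching returns make the function's shape harder to read.
def train_verbose(metrics_enabled):
    if metrics_enabled:
        return {"MAE": 1.0}
    else:
        return None

# After: one conditional expression, a single return statement.
def train_pythonic(metrics_enabled):
    return {"MAE": 1.0} if metrics_enabled else None
```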

@karl-richter karl-richter requested a review from noxan December 7, 2022 23:51
@karl-richter karl-richter added status: needs review PR needs to be reviewed by Reviewer(s) and removed status: needs update PR has outstanding comment(s) or PR test(s) that need to be resolved labels Dec 7, 2022
Collaborator

@noxan noxan left a comment


LGTM

@noxan noxan merged commit cbca032 into main Dec 7, 2022
@noxan noxan deleted the refactor/minimal branch December 7, 2022 23:55