
Allow unique values for future regressor of one time series in global modeling #1146

Merged
Merged 6 commits on Feb 8, 2023

Conversation

judussoari (Collaborator)

🔬 Background

Fixes #1130
Currently, NeuralProphet does not allow a future regressor to contain only a single unique value (i.e., a constant column). With global modeling, however, there are reasonable cases in which some IDs have only a single unique value while others don't.

🔮 Key changes

Instead of throwing an error immediately, we first check whether the regressor has only a single unique value across all IDs. If it does, we issue a warning and remove the variable; otherwise, the regressor is allowed.
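The distinction between "constant within one ID" and "constant across all IDs" can be sketched as follows. The helper name and column names are illustrative, not the actual NeuralProphet code:

```python
import pandas as pd

def constant_regressor_ids(df: pd.DataFrame, reg: str) -> tuple[list, bool]:
    """Return the IDs whose regressor column is constant, and whether the
    regressor is constant across *all* IDs combined (hypothetical helper)."""
    per_id_constant = [
        df_id for df_id, df_i in df.groupby("ID") if df_i[reg].nunique() <= 1
    ]
    constant_everywhere = df[reg].nunique() <= 1
    return per_id_constant, constant_everywhere

# Two series: the regressor is constant in series "a" but varies in series "b",
# so it is allowed overall; only a global constant would be removed.
df = pd.DataFrame(
    {
        "ID": ["a", "a", "a", "b", "b", "b"],
        "temperature": [5.0, 5.0, 5.0, 5.0, 6.0, 7.0],
    }
)
ids, everywhere = constant_regressor_ids(df, "temperature")
```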

📋 Review Checklist

  • I have performed a self-review of my own code.
  • I have commented my code, added docstrings and data types to function definitions.
  • I have added pytests to check whether my feature / fix works.

@judussoari judussoari added this to the Release 0.5.2 milestone Feb 4, 2023
@judussoari judussoari self-assigned this Feb 4, 2023
@codecov-commenter commented Feb 4, 2023

Codecov Report

Merging #1146 (b7d666d) into main (55c23bb) will increase coverage by 0.02%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##             main    #1146      +/-   ##
==========================================
+ Coverage   90.17%   90.19%   +0.02%     
==========================================
  Files          30       30              
  Lines        4966     4967       +1     
==========================================
+ Hits         4478     4480       +2     
+ Misses        488      487       -1     
Impacted Files               Coverage Δ
neuralprophet/df_utils.py    94.70% <100.00%> (+<0.01%) ⬆️
neuralprophet/np_types.py    100.00% <0.00%> (+8.33%) ⬆️


@github-actions bot commented Feb 4, 2023

Model Benchmark

Benchmark Metric main current diff
PeytonManning MAE_val 0.64636 0.64636 0.0%
PeytonManning RMSE_val 0.79276 0.79276 0.0%
PeytonManning Loss_val 0.01494 0.01494 0.0%
PeytonManning MAE 0.42701 0.42701 0.0%
PeytonManning RMSE 0.57032 0.57032 0.0%
PeytonManning Loss 0.00635 0.00635 0.0%
PeytonManning time 11.4659 11.44 -0.23%
YosemiteTemps MAE_val 1.72949 1.72949 0.0%
YosemiteTemps RMSE_val 2.27386 2.27386 0.0%
YosemiteTemps Loss_val 0.00096 0.00096 0.0%
YosemiteTemps MAE 1.45189 1.45189 0.0%
YosemiteTemps RMSE 2.16631 2.16631 0.0%
YosemiteTemps Loss 0.00066 0.00066 0.0%
YosemiteTemps time 91.0026 92.11 1.22%
AirPassengers MAE_val 15.4077 15.4077 0.0%
AirPassengers RMSE_val 19.5099 19.5099 0.0%
AirPassengers Loss_val 0.00196 0.00196 0.0%
AirPassengers MAE 9.86947 9.86947 0.0%
AirPassengers RMSE 11.7222 11.7222 0.0%
AirPassengers Loss 0.00057 0.00057 0.0%
AirPassengers time 4.02355 4.16 3.39% ⚠️
Model training plots for PeytonManning, YosemiteTemps, and AirPassengers (images not included).

@ourownstory (Owner) left a comment

LGTM! Thank you for the quick fix!
I just had a question about the change in one line, otherwise good to merge from my end.

@@ -217,7 +217,7 @@ def data_params_definition(
             raise ValueError(f"Regressor {reg} not found in DataFrame.")
         data_params[reg] = get_normalization_params(
             array=df[reg].values,
-            norm_type=config_regressors[reg].normalize,
+            norm_type=config_regressors[reg].normalize if len(df[reg].unique()) > 1 else "off",
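The guard above avoids a degenerate normalization: for a constant column, min == max and std == 0, so both min-max and z-score scaling would divide by zero. A minimal sketch of the idea (the real helper is `get_normalization_params`, but this body and its parameter names are illustrative, not NeuralProphet's implementation):

```python
import numpy as np

def get_norm_params_sketch(array: np.ndarray, norm_type: str) -> dict:
    """Illustrative normalization-parameter selection (not the real code)."""
    if norm_type == "off":
        return {"shift": 0.0, "scale": 1.0}  # identity transform
    if norm_type == "minmax":
        lo, hi = array.min(), array.max()
        return {"shift": lo, "scale": hi - lo}  # scale == 0 for a constant column!
    if norm_type == "standardize":
        return {"shift": array.mean(), "scale": array.std()}  # std == 0 for a constant column!
    raise ValueError(f"unknown norm_type: {norm_type}")

constant = np.array([5.0, 5.0, 5.0])
# Guard from the diff: fall back to "off" when the column has a single unique
# value, so downstream code never divides by a zero scale.
norm_type = "minmax" if len(np.unique(constant)) > 1 else "off"
params = get_norm_params_sketch(constant, norm_type)
```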
@ourownstory (Owner) commented:

are we operating over all panel time series or on a single one here?

@ourownstory (Owner) commented:

Do you mind explaining the reason for this change?
Thank you!

@judussoari (Collaborator, Author) commented:

@ourownstory Sure. Usually we would normalize the data in the future regressor column, which throws an error when the column contains only a single unique value. For a time series that contains only a single unique value (which we now allow, as long as that is not the case for every ID), we want to skip normalization. This operates on the level of a single time series here.

@ourownstory (Owner) commented:

Maybe I am just confused. Here are my thoughts:

If local normalization is enabled, this would still be an issue - we would need to ignore the variable on a local level.

If it's globally normalized, it would be ok, but then we would not need to switch it off on a local level?

@ourownstory (Owner) commented:

@judussoari if local normalization is enabled, even when doing global modeling, we still need to throw an error.
The only special case is global normalization with global modeling: then a series may have a single value as long as all series together have more than one value.
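The rule described in this comment can be sketched as a small decision function. The function name, parameter names, and string outcomes are all illustrative, not NeuralProphet API:

```python
def regressor_action(
    constant_for_some_ids: bool,
    constant_across_all_ids: bool,
    global_normalization: bool,
    global_modeling: bool,
) -> str:
    """Illustrative decision rule for a future regressor column:
    - constant across all IDs -> warn and drop the regressor
    - constant only for some IDs:
        * global modeling + global normalization -> allowed
        * local normalization -> still an error
    - otherwise -> normalize as usual
    """
    if constant_across_all_ids:
        return "drop_with_warning"
    if constant_for_some_ids:
        if global_modeling and global_normalization:
            return "allow"
        return "error"
    return "normalize_as_usual"

# The special case from the comment: global normalization + global modeling.
action = regressor_action(
    constant_for_some_ids=True,
    constant_across_all_ids=False,
    global_normalization=True,
    global_modeling=True,
)
```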

"Encountered future regressor with only unique values in training set across all IDs."
"Automatically removed variable."
)
regressors_to_remove.append(reg)
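The removal step shown in this fragment (warn, collect into `regressors_to_remove`, then drop) can be sketched end to end as follows. The function `drop_constant_regressors` and its signature are illustrative, assuming a panel DataFrame with an `ID` column; this is not the actual NeuralProphet code:

```python
import logging

import pandas as pd

log = logging.getLogger(__name__)

def drop_constant_regressors(df: pd.DataFrame, config_regressors: dict):
    """Warn about and drop regressors that are constant across all IDs
    (illustrative sketch of the removal loop in the diff above)."""
    regressors_to_remove = []
    for reg in list(config_regressors):
        if df[reg].nunique() <= 1:
            log.warning(
                "Encountered future regressor with only unique values in training set across all IDs. "
                "Automatically removed variable."
            )
            regressors_to_remove.append(reg)
    for reg in regressors_to_remove:
        config_regressors.pop(reg)
        df = df.drop(columns=reg)
    return df, config_regressors

df = pd.DataFrame({"ID": ["a", "b"], "flat": [1.0, 1.0], "varies": [1.0, 2.0]})
cfg = {"flat": None, "varies": None}
df2, cfg2 = drop_constant_regressors(df, cfg)
```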
@ourownstory (Owner) commented:
Looks good!

@ourownstory ourownstory added the status: needs update PR has outstanding comment(s) or PR test(s) that need to be resolved label Feb 8, 2023
judussoari and others added 5 commits February 8, 2023 14:13
*only if global modeling and global normalization

*and only if not all time series have the same unique future reg value
Labels
status: needs update PR has outstanding comment(s) or PR test(s) that need to be resolved
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Automatic Future Regressor Removal with Global Model
3 participants