many2one #36

Open

oneJue wants to merge 5 commits into master from qwh_dev

Collaborator

oneJue commented Oct 30, 2024

Attempt to include many other variables to predict the current multivariate data.

oneJue added 5 commits

October 30, 2024 18:59


          Multivariate prediction

98e00f8


          many2one


          many2one

de9836a


          Merge remote-tracking branch 'origin/master' into qwh_dev

7d0e623


          many2one

0ff2144

luckiezhou reviewed


ts_benchmark/baselines/fits/fits.py

@@ -161,7 +161,7 @@ def _padding_time_stamp_mark(
                       padding_mark = get_time_mark(whole_time_stamp, 1, self.config.freq)
                       return padding_mark
-                  def validate(self, valid_data_loader, criterion):
+                  def validate(self, valid_data_loader, covariate, criterion):

Collaborator

luckiezhou Oct 31, 2024

I suggest renaming this parameter to 'covariates', to be consistent with the corresponding parameter in ModelBase

ts_benchmark/baselines/fits/fits.py

+                              : -covariate["exog"].shape[1]
+                              if covariate["exog"].shape[1] > 0
+                              else None,
+                          ]

Collaborator

luckiezhou Oct 31, 2024

Are we assuming that both target and output contain the target series as well as exog? This looks odd, since exog should be passed in as part of 'covariates'. Currently, the 'covariates' parameter is not actually used...
In that case, you should at least keep the parameters precise and clear: pass in 'series_dim: int' rather than 'covariates: Dict'.
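
A minimal sketch of what passing 'series_dim' could look like, assuming the target channels come first along the last axis and the exog channels follow them (as the concat in forecast_fit suggests). The helper name `select_target_channels` is hypothetical and not part of this PR:

```python
import numpy as np


def select_target_channels(batch: np.ndarray, series_dim: int) -> np.ndarray:
    """Keep the first `series_dim` channels of a (batch, time, channel) array.

    Hypothetical helper: assumes target channels precede exog channels.
    """
    num_channels = batch.shape[-1]
    if not 0 < series_dim <= num_channels:
        raise ValueError(
            f"series_dim must be in [1, {num_channels}], got {series_dim}"
        )
    # A positive upper bound needs no special case when there is no exog,
    # unlike a negative bound such as `: -exog.shape[1]`.
    return batch[..., :series_dim]


batch = np.arange(24).reshape(2, 3, 4)  # 4 channels: 3 targets + 1 exog
targets_only = select_target_channels(batch, 3)
```

Slicing with a positive bound also removes the need for the `if ... > 0 else None` branch in the diff above.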

ts_benchmark/baselines/fits/fits.py

@@ -194,7 +206,7 @@ def validate(self, valid_data_loader, criterion):
                       return total_loss
                   def forecast_fit(
-                      self, train_valid_data: pd.DataFrame, train_ratio_in_tv: float
+                      self, train_valid_data: pd.DataFrame, covariate: dict, train_ratio_in_tv: float

Collaborator

luckiezhou Oct 31, 2024

The parameter name differs from the one defined in the base class; please do a global check to make sure the interface is consistent.

ts_benchmark/baselines/fits/fits.py

@@ -203,6 +215,9 @@ def forecast_fit(
                       :param train_ratio_in_tv: Represents the splitting ratio of the training set validation set. If it is equal to 1, it means that the validation set is not partitioned.
                       :return: The fitted model object.
                       """
+                      train_valid_data = pd.concat(
+                          [train_valid_data, covariate["exog"]], axis=1
+                      )

Collaborator

luckiezhou Oct 31, 2024

please consider the case when exog does not exist
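
One possible way to guard the concat, sketched under the assumption that 'covariates' may be None, may lack the "exog" key, or may hold an exog DataFrame with zero columns. The helper name `merge_exog` is hypothetical:

```python
from typing import Optional

import pandas as pd


def merge_exog(
    train_valid_data: pd.DataFrame, covariates: Optional[dict]
) -> pd.DataFrame:
    """Append exog columns to the target data only when exog is present."""
    exog = (covariates or {}).get("exog")
    if exog is None or exog.shape[1] == 0:
        # Nothing to append: return the target data unchanged.
        return train_valid_data
    return pd.concat([train_valid_data, exog], axis=1)


df = pd.DataFrame({"y": [1.0, 2.0]})
with_exog = merge_exog(df, {"exog": pd.DataFrame({"x": [3.0, 4.0]})})
without_exog = merge_exog(df, None)
```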

ts_benchmark/baselines/fits/fits.py

@@ -421,6 +448,9 @@ def batch_forecast(
                       input_data = batch_maker.make_batch(self.config.batch_size, self.config.seq_len)
                       input_np = input_data["input"]
+                      input_np = np.concatenate(
+                          (input_np, input_data["covariates"]["exog"]), axis=2
+                      )

Collaborator

luckiezhou Oct 31, 2024

please consider the case when exog does not exist
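
For the batch path, a similar guard could look like the sketch below. It assumes the batch dict may have no "covariates" entry or no "exog" inside it, and that arrays are shaped (batch, time, channel). The helper name `append_exog` is hypothetical:

```python
from typing import Optional

import numpy as np


def append_exog(input_np: np.ndarray, covariates: Optional[dict]) -> np.ndarray:
    """Concatenate exog along the channel axis only when exog is present."""
    exog = (covariates or {}).get("exog")
    if exog is None or exog.shape[-1] == 0:
        return input_np
    return np.concatenate((input_np, exog), axis=2)


input_np = np.zeros((2, 5, 3))
exog = np.ones((2, 5, 2))
merged = append_exog(input_np, {"exog": exog})
unchanged = append_exog(input_np, None)
```

With a batch dict named `input_data`, the call site would then read `append_exog(input_data["input"], input_data.get("covariates"))`.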

ts_benchmark/utils/data_splitter.py

+                  Splits a DataFrame into target and remaining parts based on the target_channel configuration.
+                  :param df: The input DataFrame to be split.
+                  :param target_channel: Configuration for selecting target columns. It can include integers (positive or negative) and lists of two integers representing slices. If set to None, all columns are selected as target columns, and the remaining DataFrame is empty.

Collaborator

luckiezhou Oct 31, 2024

This line is too long.
Please also provide some examples of what 'target_channel' can be.
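
To illustrate the kind of examples the docstring could carry, here is a simplified sketch of a parser that follows the description above. It is not the PR's implementation: the function name is hypothetical, and treating a two-element list as a Python-style slice (end exclusive) is an assumption.

```python
from typing import List, Optional


def parse_target_channel_sketch(
    target_channel: Optional[list], num_columns: int
) -> List[int]:
    """Resolve a target_channel configuration to column positions."""
    if target_channel is None:
        # None selects every column as a target.
        return list(range(num_columns))
    columns: List[int] = []
    for item in target_channel:
        if isinstance(item, int):
            if not -num_columns <= item < num_columns:
                raise IndexError(f"Column index {item} is out of range")
            columns.append(item % num_columns)  # normalize negative indices
        elif isinstance(item, list) and len(item) == 2:
            start, stop = item
            # Assumption: [start, stop] behaves like the slice start:stop.
            columns.extend(range(num_columns)[start:stop])
        else:
            raise ValueError(f"Unsupported target_channel item: {item!r}")
    # Remove duplicates while preserving order.
    return list(dict.fromkeys(columns))


# Examples for a DataFrame with 5 columns:
all_columns = parse_target_channel_sketch(None, 5)             # [0, 1, 2, 3, 4]
first_and_last = parse_target_channel_sketch([0, -1], 5)       # [0, 4]
one_slice = parse_target_channel_sketch([[1, 3]], 5)           # [1, 2]
mixed = parse_target_channel_sketch([0, [2, 4], -1], 5)        # [0, 2, 3, 4]
```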

ts_benchmark/utils/data_splitter.py

+                  def parse_target_channel(
+                      target_channel: Optional[List], num_columns: int
+                  ) -> List[int]:

Collaborator

luckiezhou Oct 31, 2024

if this function is independent of the outer scope, I suggest moving it to the global scope, while setting it to be private.

ts_benchmark/utils/data_splitter.py

+                      remaining_df = df.iloc[:, remaining_columns]
+                  else:
+                      # Create an empty DataFrame with the same index as df and zero columns
+                      remaining_df = pd.DataFrame(index=df.index)

Collaborator

luckiezhou Oct 31, 2024 (edited)

df.iloc[:, []] works as expected, so this if-else is not necessary.
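
A quick check of that behaviour on a small, made-up DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["r0", "r1"])

# An empty list of column positions selects zero columns
# and keeps the original index.
remaining_df = df.iloc[:, []]
```

Here `remaining_df` has two rows, no columns, and the same index as `df`, which matches what the else branch builds by hand.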

ts_benchmark/utils/data_splitter.py

+                                  raise IndexError(
+                                      f"target_channel configuration error: Column index {item} is out of range (total columns: {num_columns})."
+                                  )
+                          elif isinstance(item, list) and len(item) == 2:

Collaborator

luckiezhou Oct 31, 2024

I think we should also allow a tuple with two elements
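
A small sketch of such a check, assuming both bounds must be integers. The helper name `is_slice_pair` is hypothetical:

```python
def is_slice_pair(item: object) -> bool:
    """Return True for a list or tuple holding exactly two integers."""
    return (
        isinstance(item, (list, tuple))
        and len(item) == 2
        and all(isinstance(bound, int) for bound in item)
    )


accepts_list = is_slice_pair([1, 3])
accepts_tuple = is_slice_pair((1, 3))
rejects_triple = is_slice_pair((1, 2, 3))
```

Configurations loaded from JSON typically arrive as lists, while values written directly in Python are often tuples, so accepting both avoids a surprising error.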

ts_benchmark/utils/data_splitter.py

+                              )
+                      # Remove duplicates while preserving order
+                      target_columns_unique = list(dict.fromkeys(target_columns))

Collaborator

luckiezhou Oct 31, 2024

dict is only guaranteed to preserve insertion order from Python 3.7 onward (in CPython 3.6 it is an implementation detail). Should we consider compatibility with older Python versions?
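
If older interpreters do need to be supported, one option that does not rely on dict ordering is an explicit seen-set loop. This is a sketch; the helper name `unique_in_order` is hypothetical:

```python
from typing import Hashable, Iterable, List


def unique_in_order(items: Iterable[Hashable]) -> List[Hashable]:
    """Drop duplicates while keeping the first occurrence of each item."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


target_columns_unique = unique_in_order([3, 1, 3, 2, 1])  # [3, 1, 2]
```

On Python 3.7 and later this gives the same result as `list(dict.fromkeys(target_columns))`.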

Collaborator

luckiezhou commented Oct 31, 2024

@qiu69 Do we apply end-to-end tests before submitting new PRs now? Please add some new test cases to the script whenever a new feature is added.

luckiezhou reviewed


ts_benchmark/utils/data_splitter.py

		@@ -0,0 +1,91 @@
		import pandas as pd

Collaborator

luckiezhou Oct 31, 2024

please reformat this file
