Not Using the Entire Test Dataset Results in Abnormal Metrics #129

Open

Zian-Xu opened this issue Dec 19, 2024 · 0 comments
Zian-Xu commented Dec 19, 2024

In PatchTST_supervised\data_provider\data_factory.py, at line 17:

    if flag == 'test':
        shuffle_flag = False
        drop_last = True
        batch_size = args.batch_size
        freq = args.freq

Take the illness dataset as an example. With the train/val/test split used in this repository, the test set contains 170 samples.

However, because batch_size = args.batch_size (set to 16 in illness.sh) and drop_last = True, the test DataLoader yields only ⌊170 / 16⌋ = 10 full batches, i.e. 160 samples, and the last 10 samples of the test set are silently excluded from evaluation. This distorts the test metrics.
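
The behavior is easy to reproduce in isolation. Here is a minimal sketch in plain PyTorch (a hypothetical standalone example, not the repo's code), with a dummy 170-sample dataset standing in for the illness test split:

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # 170 dummy samples, mirroring the size of the illness test split.
    dataset = TensorDataset(torch.arange(170))

    for drop_last in (True, False):
        loader = DataLoader(dataset, batch_size=16, shuffle=False,
                            drop_last=drop_last)
        # Count how many samples the evaluation loop actually sees.
        evaluated = sum(batch[0].numel() for batch in loader)
        print(f"drop_last={drop_last}: evaluated {evaluated} of {len(dataset)} samples")

    # drop_last=True:  evaluated 160 of 170 samples  (last 10 dropped)
    # drop_last=False: evaluated 170 of 170 samples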

For example, in my reproduced results, the truncated evaluation looks significantly better than the full one:

  • Metrics excluding the last 10 samples: mse:1.389, mae:0.766, rse:0.569
  • Metrics including the last 10 samples: mse:1.945, mae:0.855, rse:0.674
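
A straightforward fix, assuming nothing downstream requires fixed-size test batches, would be to keep the trailing partial batch for the test split:

    if flag == 'test':
        shuffle_flag = False
        drop_last = False  # keep the final partial batch so every test sample is evaluated
        batch_size = args.batch_size
        freq = args.freq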

Could you please clarify the rationale behind this setting? Thank you very much.
