Not Using the Entire Test Dataset Results in Abnormal Metrics #129

Open

Zian-Xu opened this issue Dec 19, 2024 · 0 comments
Zian-Xu commented Dec 19, 2024

In PatchTST_supervised\data_provider\data_factory.py, at line 17:

    if flag == 'test':
        shuffle_flag = False
        drop_last = True
        batch_size = args.batch_size
        freq = args.freq

Take the illness dataset as an example. With the train/val/test split used in this repository, the test set contains 170 samples.

However, because batch_size = args.batch_size (set to 16 in illness.sh) and drop_last = True, the test DataLoader yields only ⌊170 / 16⌋ = 10 full batches, i.e. 160 samples, and the last 10 samples of the test set are silently excluded from evaluation. This distorts the test metrics.
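
The behavior is easy to reproduce in isolation. Here is a minimal sketch in plain PyTorch (a hypothetical standalone example, not the repo's code), with a dummy 170-sample dataset standing in for the illness test split:

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # 170 dummy samples, mirroring the size of the illness test split.
    dataset = TensorDataset(torch.arange(170))

    for drop_last in (True, False):
        loader = DataLoader(dataset, batch_size=16, shuffle=False,
                            drop_last=drop_last)
        # Count how many samples the evaluation loop actually sees.
        evaluated = sum(batch[0].numel() for batch in loader)
        print(f"drop_last={drop_last}: evaluated {evaluated} of {len(dataset)} samples")

    # drop_last=True:  evaluated 160 of 170 samples  (last 10 dropped)
    # drop_last=False: evaluated 170 of 170 samples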

For example, in my reproduced results, the truncated evaluation looks significantly better than the full one:

  • Metrics excluding the last 10 samples: mse:1.389, mae:0.766, rse:0.569
  • Metrics including the last 10 samples: mse:1.945, mae:0.855, rse:0.674
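
A straightforward fix, assuming nothing downstream requires fixed-size test batches, would be to keep the trailing partial batch for the test split:

    if flag == 'test':
        shuffle_flag = False
        drop_last = False  # keep the final partial batch so every test sample is evaluated
        batch_size = args.batch_size
        freq = args.freq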

Could you please clarify the rationale behind this setting? Thank you very much.
