Input data handling at inference time #20

L1c4AI · 2024-10-28T10:40:28Z

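Your work is very interesting and the experimental analysis is thorough. I have a few questions about some of the details:

At inference time, STD-MAE has to pass the long-range input through the T/S Encoder separately to generate the corresponding representations. I read this part of your code, https://github.com/Jimmy-7664/STD-MAE/blob/main/stdmae/stdmae_data/forecasting_dataset.py, which contains: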

    def __getitem__(self, index: int) -> tuple:
        """Get a sample.

        Args:
            index (int): the iteration index (not the self.index)

        Returns:
            tuple: (future_data, history_data), where the shape of each is L x N x C.
        """

        idx = list(self.index[index])

        history_data = self.data[idx[0]:idx[1]]     # 12
        future_data = self.data[idx[1]:idx[2]]      # 12
        if idx[1] - self.seq_len < 0:
            long_history_data = self.mask
        else:
            long_history_data = self.data[idx[1] - self.seq_len:idx[1]]     # 11

        return future_data, history_data, long_history_data

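This shows how long_history_data is computed. Are there any constraints on seq_len here? For example, does it have to match the value used in pre-training, or should it be larger than the length T of the original input sequence at inference time? Also, in the branch where idx[1] - self.seq_len < 0, long_history_data is set to self.mask, but that variable is initialized with torch.zeros. Does this mean the pre-trained model is not used when the data segment is not long enough? Finally, the other models you compare against presumably cannot take long_history_data directly, so is it fair to say that STD-MAE used more input data in the performance experiments?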
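To make the scale of the zero-mask branch concrete, here is a minimal sketch that counts how many samples would hit it. The sliding-window index construction, the function name `count_masked_samples`, and the value `num_steps=2000` are assumptions for illustration and are not taken from the repository; only the condition `idx[1] - seq_len < 0` mirrors the quoted code.

```python
def count_masked_samples(num_steps, history_len, future_len, seq_len):
    """Count samples whose long history would fall back to the zero mask.

    Assumes indices are positions in the array held by the dataset and that
    samples are built with a stride-1 sliding window (hypothetical setup).
    """
    # Each entry is (history start, history end / future start, future end).
    index = [
        (t - history_len, t, t + future_len)
        for t in range(history_len, num_steps - future_len + 1)
    ]
    # Same test as `idx[1] - self.seq_len < 0` in the quoted __getitem__.
    masked = sum(1 for _, mid, _ in index if mid - seq_len < 0)
    return masked, len(index)


masked, total = count_masked_samples(
    num_steps=2000, history_len=12, future_len=12, seq_len=864
)
print(f"{masked} of {total} samples would receive an all-zero long history")
```

Under these assumptions, every sample whose forecast origin lies within the first seq_len steps gets no long-range information at all, which is why the choice of seq_len matters for short evaluation splits.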


Jimmy-7664 · 2024-10-29T10:52:39Z

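Thank you for your support :)
seq_len is the length of long_history_data, i.e. the length of the historical information used in pre-training. It usually covers the past few days or weeks of data (see the paper for the specific values). The value used here matches the pre-training stage, but it is a tunable hyperparameter, so you can adjust it further if you want to experiment.
ForecastingDataset is the dataset used when running the forecasting task with the downstream predictor, and long_history_data here is what the pre-trained T-MAE and S-MAE use to generate representations. For the first samples of the test set, however, taking seq_len=864 as an example, the first 864 time steps of test data do not have complete long_history_data, so their historical information is set to 0. As you suggest, though, part of the history could still be included even when it is incomplete, and this should have a positive effect on the results.
As for whether STD-MAE uses more input data: although slicing is applied for alignment in the forecasting part, I think the answer is still yes.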
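The partial-history idea mentioned above could be sketched as follows. This is not the repository's implementation: the helper name `get_long_history`, the use of NumPy instead of torch, and the choice to left-pad with zeros are all assumptions made for illustration. Whether the pre-trained encoders handle a zero-padded prefix well would need to be checked experimentally.

```python
import numpy as np


def get_long_history(data, end, seq_len):
    """Return the seq_len steps before `end`, zero-padded on the left if short.

    data: array of shape (T, N, C); `end` plays the role of idx[1].
    Hypothetical helper, not part of the STD-MAE code base.
    """
    start = end - seq_len
    if start >= 0:
        # Enough history is available: same slice as the original else-branch.
        return data[start:end]
    # Not enough history: keep what exists and pad the missing prefix with zeros.
    available = data[:end]
    pad = np.zeros((seq_len - end,) + data.shape[1:], dtype=data.dtype)
    return np.concatenate([pad, available], axis=0)


# Toy data: 10 time steps, 1 node, 1 channel, values 1..10.
data = np.arange(1, 11, dtype=np.float32).reshape(10, 1, 1)
partial = get_long_history(data, end=3, seq_len=5)  # only 3 real steps exist
full = get_long_history(data, end=8, seq_len=5)     # complete window
print(partial[:, 0, 0], full[:, 0, 0])
```

The output always has length seq_len, so the downstream shapes stay unchanged; only the content of the early samples differs from the all-zero fallback.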
