
Why are mean_t and std_t for 7-Scenes set to zeros(3) and ones(3) in the SevenScenes dataset? #35

Closed
ZhouJiaHuan opened this issue Jul 14, 2020 · 6 comments

@ZhouJiaHuan

Hey, I noticed that the pose stats of 7-Scenes, namely mean_t and std_t, are simply set to zeros(3) and ones(3). But when I use pose stats computed by myself (not equal to zeros(3) and ones(3)) for training and evaluation, the accuracy decreases a lot.

samarth-robo commented Jul 14, 2020

Hi @ZhouJiaHuan. IIRC, mean_t and std_t are only used to avoid making the network predict very large values. This matters for RobotCar, which is a large environment, but not so much for 7-Scenes; hence they are set to the identity.
About your problem: did you verify that you can train and test properly when you use the identity pose stats?
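
(For readers landing here: a minimal sketch of how such pose stats are typically applied; the function names are illustrative, not the repository's exact code.)

import numpy as np

# Per-axis mean/std of the training-set translations. With identity stats
# (zeros and ones) both functions below become no-ops.
mean_t = np.zeros(3)
std_t = np.ones(3)

def normalize_t(t):
    # applied to ground-truth translations before they become training targets
    return (t - mean_t) / std_t

def denormalize_t(t_net):
    # applied to the network's translation output at inference time
    return t_net * std_t + mean_t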

ZhouJiaHuan commented Jul 15, 2020

Thanks for your reply, @samarth-robo.

Yes. Since my environment is Python 3 + PyTorch 1.5.1, I modified some code for compatibility and then trained first with all the default parameters (including the identity pose stats). The validation result is almost the same as in the paper. The experimental result is as follows:


  • model: MapNet
  • Dataset: 7-Scenes Heads
  • Error in translation: median 0.19 m (0.18 m in paper TABLE 3), mean 0.23 m
  • Error in rotation: median 13.23 degrees (13.25 degrees in paper TABLE 3), mean 13.26 degrees

I am not sure whether the different PyTorch version affects the result much (the pretrained ResNet-34 weights in PyTorch 1.5.1 differ from those in PyTorch 0.4.1).

However, when I substituted the identity pose stats with the values computed from the Heads training data, the accuracy dropped a lot. The experimental result is as follows:


  • model: MapNet
  • Dataset: 7-Scenes Heads
  • mean_t: (-0.01829027, -0.13756783, 0.09475311)
  • std_t: (0.42199861, 0.19139078, 0.13966574)
  • Error in translation: median 0.95 m (0.18 m in paper TABLE 3), mean 0.99 m
  • Error in rotation: median 14.47 degrees (13.25 degrees in paper TABLE 3), mean 13.91 degrees

As a reference, I also visualized the training process with tensorboardX.

[Figure: learned parameters of the MapNet loss function]
[Figure: training/validation loss]

The first figure shows the learned parameters in the MapNet loss function and the second shows the training/validation loss. As you can see, the learned parameters are almost the same throughout training, but the training loss with the computed pose stats (in blue) does not seem to converge well.

@samarth-robo

@ZhouJiaHuan OK. And just to confirm, you calculated pose stats similarly to RobotCar, by modifying L100 and L101 of seven_scenes.py as follows?

# ps maps sequence number -> array of flattened pose matrices (row-major);
# indices 3, 7, 11 pick out the x, y, z translation components
poses = np.vstack(list(ps.values()))  # list() keeps this Python 3-safe
mean_t = np.mean(poses[:, [3, 7, 11]], axis=0)
std_t = np.std(poses[:, [3, 7, 11]], axis=0)

@ZhouJiaHuan

@samarth-robo Here is my code to calculate the pose stats:

import os
import os.path as osp
import argparse

import numpy as np

DATA_DIR = "data/deepslam_data/"


def parse_args():
    parser = argparse.ArgumentParser(description='Dataset pose statistics')
    parser.add_argument('--dataset', type=str, choices=('SevenScenes', 'UnoRobot'),
                        help='Dataset', required=True)
    parser.add_argument('--scene', type=str,
                        help='Scene name', required=True)
    return parser.parse_args()


def parse_poses(dataset, scene):
    data_path = osp.join(DATA_DIR, dataset, scene)
    assert osp.exists(data_path)
    split_file = osp.join(data_path, 'TrainSplit.txt')
    assert osp.exists(split_file)
    # TrainSplit.txt lists the training sequences, one per line (e.g. "sequence1")
    with open(split_file, 'r') as f:
        seqs = [int(l.split('sequence')[-1]) for l in f if not l.startswith('#')]
    pss = []
    for seq in seqs:
        seq_dir = osp.join(data_path, 'seq-{:02d}'.format(seq))
        p_filenames = [n for n in os.listdir(seq_dir) if 'pose' in n]
        frame_idx = np.arange(len(p_filenames), dtype=int)
        # each frame-XXXXXX.pose.txt holds a 4x4 pose matrix; keep the translation
        ps = [np.loadtxt(osp.join(seq_dir, 'frame-{:06d}.pose.txt'.format(i)))[:3, 3]
              for i in frame_idx]
        pss.extend(ps)
    return np.array(pss)


if __name__ == "__main__":
    args = parse_args()
    pss_array = parse_poses(args.dataset, args.scene)
    print(pss_array.shape)
    pss_mean = np.mean(pss_array, axis=0)
    pss_std = np.std(pss_array, axis=0)
    print("pose mean = {}".format(pss_mean))
    print("pose std = {}".format(pss_std))

The code below is the part equivalent to the RobotCar computation:

ps = [np.loadtxt(osp.join(seq_dir, 'frame-{:06d}.pose.txt'.format(i)))[:3, 3]
      for i in frame_idx]
pss.extend(ps)

@ZhouJiaHuan

@samarth-robo Hey, I found the cause. It was a problem in my inference code: I forgot to change the pose stats during the inference phase, so they were still the identity. After fixing this, the error for MapNet on Heads is 0.20 m, 13.91 degrees. Pardon the bother and thanks for your kindness.
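
(In other words: the stats used to normalize the training targets must also be used to un-normalize the predictions at test time. A minimal sketch of that fix, with illustrative variable names:)

# Bug: inference implicitly kept identity stats (mean_t = zeros(3), std_t = ones(3))
# while the training targets had been normalized with the computed stats.
# Fix: un-normalize the network output with the same stats used in training.
t_pred = t_net * std_t + mean_t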

@samarth-robo

@ZhouJiaHuan I was just about to comment that! Glad you found the solution. I'll close this issue now.
