
dtam is producing poor reconstruction on ptam poses #21

Open
avanindra opened this issue Oct 16, 2014 · 10 comments

Comments

@avanindra

Hi Paul,

I have been trying to integrate DTAM with live PTAM frames, using the 2.4.9 experimental branch. I am noticing that DTAM produces a poor reconstruction: it is quite smooth, but the depths are not correct. When I apply OpenCV StereoBM or SGBM to the same PTAM keyframes, with the same poses, I get much better depth maps and reconstruction. I have tried 32 and 64 depth layers, with 30 images per cost volume, and I set the near and far values from the distances of PTAM's sparse points from the camera.
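For reference, the depth layers in a DTAM-style cost volume are usually sampled uniformly in inverse depth between the near and far planes. A minimal sketch of that sampling (the function name and the example near/far values are illustrative, not taken from OpenDTAM):

```python
import numpy as np

def inverse_depth_layers(near, far, n_layers):
    """Sample n_layers cost-volume planes uniformly in inverse depth.

    near/far are metric distances; far may be np.inf, which gives
    inverse depth 0 for the farthest layer.
    """
    inv_near = 1.0 / near
    inv_far = 0.0 if np.isinf(far) else 1.0 / far
    return np.linspace(inv_near, inv_far, n_layers)

# 32 layers between 0.5 m and infinity
layers = inverse_depth_layers(0.5, np.inf, 32)
```

Setting `near` and `far` from the spread of PTAM's sparse map points, as described above, is one plausible way to choose these bounds.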

I can upload the running code with the dataset, if you want to have a look at it.

@anuranbaka
Owner

Please do. I won't have time to look at it closely until tomorrow, but I'll try to see if I spot anything obvious today and look harder tomorrow.
Also, you should look at commit 638f119 on the experimental branch. It shows spinning views of something very close to the highest-quality reconstruction possible with the current DTAM implementation. If yours is much worse, you may have a problem in how the dataset was captured or preprocessed, such as not turning off auto exposure, not undistorting the images, or noise higher than the outlier tolerance set in the cost volume construction.

@avanindra
Author

I will check the commit you mentioned. The camera I used for my data is a Fujifilm FinePix 3D camera, which may not be appropriate for DTAM, but I was expecting DTAM to produce a better result than StereoBM. In a couple of days I should receive a Point Grey Flea global-shutter color camera that I ordered last week; I guess that would be ideal for testing the algorithm, since the authors themselves used a Point Grey Flea 2 camera.

For distortion, I have used the PTAM distortion model (a one-parameter barrel distortion), and I undistort the images before sending them into the DTAM pipeline.
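If I recall correctly, PTAM's one-parameter barrel model is the FOV (ATAN) model of Devernay and Faugeras, also mentioned later in this thread. A minimal numpy sketch of the radial undistortion it implies (the helper name is ours, and `w` is the model's single parameter):

```python
import numpy as np

def undistort_radius(r_d, w):
    """Invert the one-parameter FOV (ATAN) distortion model.

    r_d: distorted radius in normalized image coordinates
    w:   the single distortion parameter (radians)
    Distortion:   r_d = arctan(2 * r_u * tan(w / 2)) / w
    Undistortion: r_u = tan(r_d * w) / (2 * tan(w / 2))
    """
    return np.tan(r_d * w) / (2.0 * np.tan(w / 2.0))

# round trip: distorting a radius and undistorting it recovers the input
w = 0.9
r_u = 0.4
r_d = np.arctan(2.0 * r_u * np.tan(w / 2.0)) / w
assert np.isclose(undistort_radius(r_d, w), r_u)
```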

I have checked the code into the "ptam_poses" branch of a fork of the 2.4.9 experimental branch: https://github.com/avanindra/OpenDTAM/tree/ptam_poses

I have written the 3D viewer in GLFW. I have included that code and its libraries, as well as the PTAM libraries, so hopefully you will get no linking errors on the first build.

Once you run the program, press the space bar twice at some interval; the pose will then be initialized. Afterwards, press 'D' to start the DTAM reconstruction.

Press 'S' for the sparse point display, 'P' for the DTAM reconstruction display, 'O' for the OpenCV StereoBM reconstruction display, 'I' for the frame display, and 'G' for the OpenCV stereo depth map.

Following is the link to the video sequence (GitHub was not letting me upload it, so I put it on Dropbox):

https://www.dropbox.com/s/9eyf851qh0k2ifw/DSCF0159.AVI?dl=0

Put the video in the "Cpp/dataset" directory, as that is the path I have passed from CMake.

[Also, one more point: even though I have put the GLFW and PTAM code in separate threads, they are all consuming only one CPU, so the pose initialization may sometimes take 10 to 15 seconds.]

Thank you for looking into it.

@hustcalm
Contributor

@avanindra

Hi there, I am trying to get your "ptam_poses" branch running, but I got the errors below:

make[2]: *** No rule to make target `../link_libraries/glfw.so', needed by `opendtamdemo'.  Stop.
make[1]: *** [CMakeFiles/opendtamdemo.dir/all] Error 2
make: *** [all] Error 2

Any idea how to get this fixed?

Thanks in advance:-)

@anuranbaka
Owner

@hustcalm

You will need to get a copy of GLFW3 and edit the CMakeLists file to point to it. I used the version from http://www.opengl-tutorial.org/ since I already had it available. You should also make the changes below.

@avanindra
I finally figured it out. There are two problems:

  1. The colors were out of range. In opendtamdemo.cpp you need

         udImage.convertTo( image, CV_32FC3, 1.0/255.0 );

     instead of

         udImage.convertTo( image, CV_32FC3, 1.0/65535.0 );

  2. The computeAndDisplayPoints(...) function is using the wrong image. It should use

         Mat base;
         cost.baseImage.download(base);
         float* colorData = (float*)base.data;

     and

         colors.push_back( Eigen::Vector3f( colorData[2], colorData[1], colorData[0] ) );

     instead of

         uchar *colorData = colorFrame.data;

     and

         colors.push_back( Eigen::Vector3f( colorData[2]/255.0, colorData[1]/255.0, colorData[0]/255.0 ) );

Sorry I haven't had time to make a proper pull request; my version of your code has diverged a lot while trying to find the problem. I'm trying to get to that.
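The scale factor in the first fix matters because the pipeline expects float images in [0, 1]: an 8-bit frame must be divided by 255, not 65535 (the 16-bit scale), which would compress everything into [0, 0.004] and flatten the photometric costs. A minimal numpy illustration of the difference:

```python
import numpy as np

# an 8-bit BGR pixel, as OpenCV would deliver it
frame_8u = np.array([[[0, 128, 255]]], dtype=np.uint8)

# correct: 8-bit data spans [0, 1] after dividing by 255
good = frame_8u.astype(np.float32) / 255.0

# wrong: dividing by 65535 crushes the range, so costs become
# nearly indistinguishable
bad = frame_8u.astype(np.float32) / 65535.0

assert good.max() == 1.0
assert bad.max() < 0.004
```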

@hustcalm
Contributor

@avanindra

Got the problem fixed by replacing the last line of CMakeLists.txt with:

target_link_libraries( opendtamdemo  OpenDTAM ${OpenCV_LIBS} ${Boost_LIBRARIES} ${QT_LIBRARIES} visualization3d GLEW cvd ptam the_absolute_path_to_libglfw.so )

Hope it helps:-)

@hustcalm
Contributor

@anuranbaka

Thanks, I already got the problem fixed; the code compiles and runs.

For the initialization part, I'm more or less using the PTAM sparse initialization, as @avanindra did.

BTW, I found Newcombe's PhD thesis valuable as a reference; you may also be interested in it:

www.doc.ic.ac.uk/~ajd/Publications/newcombe_phd2012.pdf

@avanindra
Author

@anuranbaka

Thanks for the reply. I changed the color range as you mentioned, and it did improve the reconstruction somewhat, though I am still not getting an accurate reconstruction.

Also, I wanted one clarification regarding the cost layers of the volume. From the code, it seems you are assuming each cost layer is planar, since you assign the same z value to every pixel at a particular layer, whereas I think the cost layer should be spherical, with every pixel in one layer equidistant from the reference camera.

@hustcalm

Sorry I couldn't respond a bit earlier; I guess I missed the GitHub notifications. I somehow missed uploading the GLFW libraries in the link_libraries folder. Anyway, I am glad you got the code fixed and running.

@anuranbaka
Owner

@avanindra
Yes, you're right: the cost volume is divided into planes instead of spheres. This makes the cost calculations much easier. It also makes planes tend to remain planar rather than curved, since the regularization pulls toward the shape of the cost volume.

For a parallel stereo pair, the planar shape with an inverse depth parametrization is optimal in the sense that the sampling distribution matches the error distribution. For multiple nonparallel views the optimal sampling shape is intractable, but it should be close to both the planar and spherical forms, so I just use the easier planar one. Of course, none of this applies to things like ultra-fisheye lenses, where the pixel density is not constant in the pinhole model.
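The "optimal for a parallel pair" point follows because disparity is proportional to inverse depth (d = f*B/z), so layers sampled uniformly in inverse depth correspond to uniform disparity steps, which matches the roughly constant pixel-level matching error. A quick numeric check (the focal length and baseline are made-up values):

```python
import numpy as np

f = 500.0   # focal length in pixels (illustrative)
B = 0.1     # stereo baseline in meters (illustrative)

# layers uniform in inverse depth between 0.5 m and infinity
inv_depths = np.linspace(1.0 / 0.5, 0.0, 32)

# disparity = f * B / z = f * B * (1/z): linear in inverse depth
disparities = f * B * inv_depths

# so uniform inverse-depth layers give uniform disparity spacing
steps = np.diff(disparities)
assert np.allclose(steps, steps[0])
```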

Anyway, my experience was that the quality of the reconstruction had a lot to do with the quality of the PTAM tracking, which was finicky to get started well. In particular, PTAM likes to sample all of its points from a plane, which messes things up. In your sample video, I wait until the bear's nose comes into view and the camera starts to pan downward, which gives PTAM a number of points on both the floor and the desk.

Also, I think I turned off a line of code in CostVolume.cu that said something like

    del = fminf(del, .005f) * 1/.005f;

to make it less sensitive to camera auto exposure.
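That line is a truncated (robust) photometric cost: per-channel differences are clamped at a threshold and rescaled to [0, 1], so gross brightness changes (e.g. from auto exposure) saturate instead of dominating the cost. A numpy sketch of the same idea (the function name is ours; 0.005 is the threshold quoted above):

```python
import numpy as np

def truncated_cost(diff, tau=0.005):
    """Clamp an absolute photometric difference at tau, rescaled to [0, 1].

    Mirrors the CUDA snippet `del = fminf(del, .005f) * 1/.005f;`.
    """
    return np.minimum(np.abs(diff), tau) / tau

small = truncated_cost(0.002)   # below the threshold: scaled linearly
big = truncated_cost(0.5)       # an outlier / exposure jump: saturates at 1
assert np.isclose(small, 0.4)
assert big == 1.0
```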

You might want to force the far plane to be at infinity (i.e. inverse depth 0). PTAM sometimes seems to choose bad near and far planes. I haven't figured out a good heuristic for setting the near plane.

@avanindra
Author

@anuranbaka

Hi, a bit of a late reply from me. I had a point when I said the cost layers should be spherical. Its importance lies in computing the depth derivatives, in which you assume the inverse depth step to be constant, which would only be possible if the cost layers were spherical. With planar cost layers, the corner pixels would have a different inverse depth jump than the middle pixel. Correct me if I am wrong.

@anuranbaka
Owner

@avanindra
Well, that's right if the depth you're using is the literal distance from the camera's entrance pupil / virtual pinhole, but for most cameras that is not the natural depth measure. If you consider the way a pinhole camera projects onto a planar sensor, you see that:

    x_sensor = x_world * f / z_world
    y_sensor = y_world * f / z_world
    z_sensor = z_world * f / z_world = f   <-- usually ignored because it is constant

if the camera is facing down the z axis of the world. Notice that these equations are linear in x_world, y_world, and 1/z_world. When we refer to depth, we mean z_world, not the distance from the pinhole. By using 1/z_world, the "inverse depth", all of our equations are linear, and so are their derivatives. The depthStep in the code is actually in units of inverse depth, as are all the other depth measures. Sorry about that; it's a fairly common metonym.

In any case, this also means that the cost volume voxels are not cubes in real-world space, but frusta with two planar faces and slightly curved sides.
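The linearity in 1/z_world is easy to verify numerically: scaling a point's inverse depth scales its sensor coordinate by the same factor, exactly as a linear map would, so a constant inverse-depth step gives a constant step on the sensor. A small check with made-up values:

```python
import numpy as np

def x_sensor(x_world, inv_z, f=1.0):
    """Pinhole projection written in terms of inverse depth: x * f * (1/z)."""
    return x_world * f * inv_z

x_world = 2.0

# halving the inverse depth (doubling the distance) halves the projection
a = x_sensor(x_world, 1.0 / 4.0)
b = x_sensor(x_world, 1.0 / 8.0)
assert np.isclose(a, 2.0 * b)

# uniform inverse-depth steps give uniform sensor-coordinate steps
inv_depths = np.linspace(0.5, 0.0, 5)
xs = x_sensor(x_world, inv_depths)
assert np.allclose(np.diff(xs), np.diff(xs)[0])
```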

The depth measure you are proposing would be appropriate if the sensor were spherical, but practically all cameras have a planar sensor.

On the other hand, extreme fisheye cameras are often designed to approximate the ATAN model, which is what PTAM uses. The ATAN model makes the sensor act somewhat like it is spherical, but the math is really bad. In theory, DTAM would need some corrections to deal with that model directly. As it is, we just undistort the image to make it look like it came from an ideal planar pinhole camera.

-Paul

