Trying extrinsics optimization on a grid-based NeRF #142
After taking a look at the current main branch, I found that there's a recent CUDA implementation of the coordinate gradients computation. I modified the backward function so it would return the coordinate gradients as well. However, so far I only get even blurrier renders when I optimize the extrinsics. Has anyone already tried something similar, and would you have some suggestions on what I could be missing in the optimization process? Also, I'm not too familiar with the 6DoF rotation representation used here. Thanks in advance :)
Hi @LvisRoot! The 6DoF representation is from Zhou et al. 2019: https://arxiv.org/abs/1812.07035. The kaolin docs further elaborate on the difference between the two rotation representations. You can modify the camera example in kaolin to see the difference between the two; the backend representation can be picked when constructing the extrinsics.
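For reference, the 6D parameterization from Zhou et al. maps two unconstrained 3-vectors to a rotation matrix via Gram-Schmidt orthogonalization, which avoids the discontinuities of Euler angles and quaternions under optimization. A minimal sketch (my own function name; kaolin's actual implementation may differ in row/column conventions):

```python
import torch

def rotation_6d_to_matrix(d6: torch.Tensor) -> torch.Tensor:
    """Map a 6D rotation parameterization (Zhou et al. 2019) to a
    3x3 rotation matrix via Gram-Schmidt.  `d6` holds two 3-vectors
    that are orthonormalized into the first two basis vectors; the
    third is their cross product."""
    a1, a2 = d6[..., :3], d6[..., 3:]
    b1 = torch.nn.functional.normalize(a1, dim=-1)
    # Remove the component of a2 along b1, then normalize.
    b2 = torch.nn.functional.normalize(
        a2 - (b1 * a2).sum(dim=-1, keepdim=True) * b1, dim=-1)
    b3 = torch.cross(b1, b2, dim=-1)
    return torch.stack((b1, b2, b3), dim=-2)
```

Because the map is smooth and surjective onto SO(3), gradients from a photometric loss can flow directly into the six raw parameters.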
I'd actually suspect the hashgrid interpolation by coords. The first thing we can do is validate whether there is a potential bug here:
Hi @orperel, thanks for your answer! I saw the reference to the Matrix6DofRotation paper in the code. I've been trying the following grids lately:
I ran a bunch of experiments changing the pose noise strength.
With that pose noise the representation gets pretty cloudy and noisy, but it cleans up a lot when pose optimization is on. With hash-grids I spent even more time tuning parameters, and I always ended up seeing poses diverge, move into weird positions, and produce a super cloudy or overly smoothed representation. Do you have any insight into why this could be? I haven't tested other configurations yet.

The only thing I found regarding pose estimation with hash-grids that works is a comment in the Instant-NGP repo, where some details about how it tackles this are explained. As far as I understand, it's not just gradient propagation through the rotation: it takes the cross product of the ray directions with their gradients and uses that to rotate the extrinsics orientation. That's different from the method I'm currently using, and I wonder if adding it would have a big impact on the results.
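As far as I understand it, the cross-product trick from that comment can be sketched to first order like this (my own naming, not Instant-NGP's code; it assumes a single small axis-angle correction shared by all rays of a camera):

```python
import torch

def pose_rotation_grad(dirs: torch.Tensor, dirs_grad: torch.Tensor) -> torch.Tensor:
    """Sketch of the cross-product trick for the rotational part of a
    pose update.  For a small axis-angle rotation w applied to every
    ray direction, d' = d + w x d to first order, and therefore
        dL/dw = sum_i d_i x (dL/dd'_i).
    `dirs` are the (N, 3) ray directions and `dirs_grad` the loss
    gradients with respect to them."""
    return torch.cross(dirs, dirs_grad, dim=-1).sum(dim=0)
```

This turns per-ray direction gradients into a single rotational update for the camera orientation, instead of relying solely on the coordinate gradients of sampled points.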
I ran some more experiments using the triplanar grid. Here are some low-res renders without and with pose optimization, trained for 150 epochs (it usually takes ~400 epochs to get a good PSNR):

rgb_pose_noise.mp4
rgb_pose_opt.mp4

So in my case, for the data I'm using (the Replica dataset), the issue is in the hash-grids. I'll keep going with the triplanar grids for now, but it would be great if someone were able to make pose refinement work with hash-grids and could share some tips on how to run it. I'm still interested in using hash-grids.
@LvisRoot I've started a quick PR to fix this. I still need to test it more before we can merge it, but you're welcome to give it a try meanwhile (don't forget to rebuild).
Hi @orperel, thanks for following up on this. You're right, the gradients for the coordinates were not returned by the CUDA backend. For my experiments I had changed this from the beginning to test it out (but did not open a PR or anything).

However, that didn't fix the pose-opt issue. That's why I'm thinking there's an underlying issue with using the plain coordinate gradients of hash grids for pose optimization 🤔. It would be great to have some insight into why. I'm also wondering whether this would be the case for datasets with all cameras looking at a single object, in contrast to Replica, where you reconstruct rooms with cameras looking "from the inside".
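One way to see why plain coordinate gradients of grid interpolation might behave badly for pose optimization: linear interpolation is only piecewise linear, so its gradient with respect to the coordinates is piecewise constant and jumps at every cell boundary; with a fine multi-resolution hash grid those jumps can look like noise to the optimizer. A minimal 1D illustration (hypothetical, not wisp's actual kernel):

```python
import torch

def grid_lerp(codebook: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """Differentiable 1D linear interpolation into `codebook` (shape (R,))
    at continuous coords x in [0, R-1].  The gradient w.r.t. x is simply
    the difference of the two neighbouring entries: piecewise constant,
    discontinuous at integer cell boundaries."""
    x0 = x.floor().long().clamp(0, codebook.numel() - 2)  # left cell index
    t = x - x0.float()                                    # fractional offset
    return (1 - t) * codebook[x0] + t * codebook[x0 + 1]

codebook = torch.tensor([0.0, 1.0, 3.0, 2.0])
x = torch.tensor([0.25, 1.5], requires_grad=True)
grid_lerp(codebook, x).sum().backward()
# x.grad is codebook[x0 + 1] - codebook[x0] per sample: tensor([1., 2.])
```

With learned (and hash-collided) codebook entries, these cell-to-cell gradient jumps carry no smooth signal about where the camera should move, which could explain poses diverging instead of converging.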
Hello, sorry to bother you. I have some questions about camera pose optimization in a NeRF system: https://github.com/Totoro97/f2-nerf/issues/84. I want to add pose optimization to f2-nerf, but I've run into a problem similar to the one you mentioned. Could you give me some advice?
Hi @Bin-ze, I ended up not using hash grids, as I wasn't able to implement or find an implementation of the gradients that wouldn't blow up. I met some people from NVIDIA at a conference two months ago who had used their hash grid implementation for pose-opt, but they said some work had to be done to make it work. I'm not sure if they pushed those changes to their open-source repo, though.

For me, planar grid approaches worked just fine for pose-opt (triplanar, and TensoRF, which is not implemented in wisp but is easy to derive from triplanar). A nice hash-based approach that worked for me OOTB for pose-opt was https://github.com/RaduAlexandru/permutohedral_encoding, which uses permutohedral grids instead of cubic ones, making it faster (fewer interpolations) and more memory-efficient in higher dimensions.

In terms of pose representations, both tangent-space (se(3)) and the 6DoF representation discussed above are options.

Hope this helps. Best, Claucho
Thank you for your reply! I still have some questions to ask:
Best, Bin-ze |
Hi there. First of all, thank you for open-sourcing this super useful repo.

I wanted to do pose optimization within a wisp pipeline, leveraging the `kaolin.Camera` class, which is differentiable OOTB. I created a pipeline that transforms rays on each training step with updated extrinsics, but the gradients to the extrinsics parameters weren't propagating properly.

After some debugging, I found that when using a hash grid, the CUDA backward implementation of `interpolate` only computes the gradients for the `codebook` parameters:

kaolin-wisp/wisp/csrc/ops/hashgrid_interpolate.cpp, line 96 in cb47e10

Would it be possible to add the gradient computation for the coordinates as well? It would be a great enhancement to make codebook-based pipelines fully differentiable up to the camera poses.
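The ray-transform setup described above can be sketched as a small module holding a learnable SE(3) correction per camera (a hypothetical sketch under a first-order rotation assumption; the class and parameter names are mine, not wisp's or kaolin's API):

```python
import torch

class PoseRefinement(torch.nn.Module):
    """Hypothetical sketch: learnable per-camera SE(3) corrections,
    applied to ray origins/directions before volume rendering.
    Initialized to the identity so training starts from the noisy
    input poses."""
    def __init__(self, num_cams: int):
        super().__init__()
        self.dt = torch.nn.Parameter(torch.zeros(num_cams, 3))  # translation delta
        self.dr = torch.nn.Parameter(torch.zeros(num_cams, 3))  # axis-angle delta

    def forward(self, cam_idx: int, origins: torch.Tensor, dirs: torch.Tensor):
        # First-order rotation: d' = d + dr x d (valid for small corrections).
        dr = self.dr[cam_idx]
        dirs = dirs + torch.cross(dr.expand_as(dirs), dirs, dim=-1)
        origins = origins + self.dt[cam_idx]
        return origins, torch.nn.functional.normalize(dirs, dim=-1)
```

In a training loop, these parameters would be optimized jointly with the field, typically with a much smaller learning rate than the grid features.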