Question about Gaussian normalization in the paper and alpha blending implementation in the code #294
Comments
Dear @grgkopanas, could you please take a look at this question? Many thanks in advance! |
We have our best guy looking at it :) It's indeed an interesting observation. |
Normalization and alpha play the same role in the equations, so you can think of alpha as "normalization*the_real_alpha". I actually prefer not having the normalization term (as it is now), because the Gaussians are not the result of blurring a Dirac: I see them as a "mass of stuff". If there were a normalization term, a large Gaussian would need an alpha value far larger than 1, which makes little sense. I prefer to see the Gaussians as blobs, where alpha is the transparency at the center. |
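To make the "normalization*the_real_alpha" reading concrete, here is a short worked equation. The symbols are assumptions introduced for illustration, not notation from the paper or the code: \Sigma' is the projected 2D covariance, \mu the projected mean, and \alpha' the "real alpha" that would multiply a normalized Gaussian.

```latex
\alpha' \cdot \frac{1}{2\pi\sqrt{\det \Sigma'}}
  \exp\!\left(-\tfrac{1}{2}(x-\mu)^{T}\Sigma'^{-1}(x-\mu)\right)
\;=\;
\alpha \, \exp\!\left(-\tfrac{1}{2}(x-\mu)^{T}\Sigma'^{-1}(x-\mu)\right),
\qquad
\alpha = \frac{\alpha'}{2\pi\sqrt{\det \Sigma'}}
```

Under this reading, \alpha is the value at the center of the blob, and keeping \alpha \le 1 for a large Gaussian requires \alpha' = 2\pi\sqrt{\det \Sigma'}\,\alpha, which grows with the size of the Gaussian. Note that \Sigma' depends on the projection, which is the point raised in the following comment.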
I am not sure that the normalization term that @KaziiBotashev refers to can be absorbed by the alpha parameter (that would indeed be very convenient in terms of implementation simplicity). The reason for this, unless I am wrong, is that the opacity of the volume is independent of the camera view, while the normalization term directly depends on it through the Jacobian J_k, which internally involves the camera rotation and translation. @grgkopanas do you have any updates that you could share with us about this issue? |
@f-dy If there is a normalization term, a large Gaussian would have to have a normalization term value way larger than 1. That effect might be compensated by the learned "real alpha" value (for large Gaussians we would have a large normalization term and a small trained real alpha value). Can you please elaborate a bit more on why it makes little sense? |
regarding the normalisation of the gaussian based on the det(covariance): to my understanding, mathematically it makes no difference, it can be baked into alpha. numerically, it might make a difference, i don't know. performance wise it's faster to not compute the normalisation. but that's done in the preprocess phase, so it should not really matter. regarding the normalisation based on the Jacobian: |
OK I found one place where normalization consideration is missing: this is where the 2D Gaussian is convolved with an isotropic 2D Gaussian of sigma sqrt(0.3) to simulate pixel integration (this is not in the paper). Let us say you have a 2D Gaussian with an opacity of 1 at the center. When doing a convolution with another 2D Gaussian, if the opacity is left unchanged (as it currently is), the Gaussian will become larger while remaining opaque and may obscure Gaussians that are behind it (we observed that on grid patterns). Take an extreme case where the original Gaussian has sigma 0.1 and opacity 1, and we blur it with a Gaussian of sigma 10. The result is a Gaussian with sigma=sqrt(0.1^2+10^2), but the opacity shouldn't be 1! Instead, the opacity should be reduced so that the integrated opacity of the resulting Gaussian is the same as the original one. Thus in the 3DGS code the opacity should be multiplied by the factor sqrt(det(Sigma)/det(Sigma+diag(0.3,0.3))). @grgkopanas In the above example, the factor (and thus the final opacity at the center of the 2D Gaussian) would be sqrt(0.1^4/(0.1^2+10^2)^2) ≈ 0.0001. |
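The compensation factor described in this comment can be sketched in a few lines of Python. This is only an illustration of the formula sqrt(det(Sigma)/det(Sigma+diag(v,v))), not code from the 3DGS repository; the helper names `det2` and `opacity_compensation` are hypothetical.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c

def opacity_compensation(cov2d, blur_var=0.3):
    """Return sqrt(det(Sigma) / det(Sigma + blur_var * I)).

    cov2d is the 2D covariance before the low-pass filter is added;
    blur_var is the variance of the isotropic blur (0.3 in the comment above).
    Multiplying the opacity by this factor keeps the integrated opacity
    of the blurred Gaussian equal to that of the original one.
    """
    (a, b), (c, d) = cov2d
    blurred = ((a + blur_var, b), (c, d + blur_var))
    return math.sqrt(det2(cov2d) / det2(blurred))

# Extreme case from the comment: sigma = 0.1 blurred with sigma = 10,
# i.e. variance 0.01 plus a blur variance of 100.
sigma, blur_sigma = 0.1, 10.0
cov = ((sigma ** 2, 0.0), (0.0, sigma ** 2))
factor = opacity_compensation(cov, blur_var=blur_sigma ** 2)
print(factor)  # close to 1e-4
```

The factor follows from the fact that a 2D Gaussian with peak value a and covariance Sigma integrates to a * 2 * pi * sqrt(det(Sigma)), so preserving the integral means scaling the peak by the ratio of the two square-root determinants.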
That is entirely correct! We discovered this some time ago. We tested it with and without proper compensation, but we found it has no measurable impact on image quality according to standard metrics. So we left it the same way it was used for the paper evaluation. Hth, |
The impact on standard metrics may be small because the validation images are selected to render at similar distance from training images. If you captured data at distance 0.5m and render them at 2m, or using different focal length, the effect could be obvious. |
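A rough numeric illustration of why rendering distance matters, under simplifying assumptions that are not from the thread: an isotropic Gaussian, a pinhole camera where the projected sigma scales as focal length times world sigma divided by depth, and made-up values for the focal length and Gaussian size. The function names are hypothetical.

```python
def projected_sigma(world_sigma, depth, focal_px):
    """Approximate screen-space sigma (pixels) of an isotropic Gaussian
    under a pinhole camera model (an assumption for this sketch)."""
    return focal_px * world_sigma / depth

def isotropic_compensation(sigma_px, blur_var=0.3):
    """sqrt(det(S) / det(S + blur_var * I)) for S = sigma_px^2 * I,
    which simplifies to sigma^2 / (sigma^2 + blur_var)."""
    return sigma_px ** 2 / (sigma_px ** 2 + blur_var)

focal_px = 1000.0      # hypothetical focal length in pixels
world_sigma = 0.0005   # hypothetical 0.5 mm Gaussian

near = isotropic_compensation(projected_sigma(world_sigma, 0.5, focal_px))
far = isotropic_compensation(projected_sigma(world_sigma, 2.0, focal_px))
print(near, far, near / far)
```

With these made-up numbers the factor at 2 m is several times smaller than at 0.5 m. An opacity learned at the capture distance implicitly absorbs the near factor, so a mismatch appears only when the projected size changes, which is consistent with the observation that validation views close to the training views hide the effect.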
+1 we've seen a very visible impact when rendering from a different distance |
Hi Bernhard @Snosixtyboo, would it be possible to have that at least as an option? I'm not a CUDA expert, and find it difficult to compute a scalar here and use it somewhere else, but since you did it before, could you share the solution? |
Here are two videos demonstrating the effect mentioned by @f-dy. The first video applies the 2D convolution without opacity compensation: as the render camera moves from far to close (near the captured distance), the color of the grid pattern on the acoustic amplifier changes, creating an aliasing-like effect (though it is not aliasing). The second video applies the 2D convolution with opacity compensation. The demonstration was done using a third-party implementation (https://github.com/wanmeihuali/taichi_3d_gaussian_splatting/tree/main/taichi_3d_gaussian_splatting). Attached videos: aliasing_like.mp4, no_aliasing_like.mp4 |
Another question about alpha: I notice that the alpha formulation in equation (2) of the paper differs from the implementation in the code. It is 1 - exp in the paper but exp in the code. Can anyone explain the reason? @KaziiBotashev @Snosixtyboo @grgkopanas |
@tdzdog I believe the typo is not in the code, but in the paper, where in the Eq. 2 it should be |
@tdzdog @ys-koshelev Eq. 2 in the paper is definitely correct, |
Dear authors, thank you for this outstanding work.
I have some questions related to the alpha blending implementation in the code.
In lines 336-359 of forward.cu, we do alpha blending with the following procedure:
Following the EWA splatting paper, the final C[ch] is equivalent to this (omitting the low-pass filter):
with following:
and following:
It seems to me that in order to compute the final color value, we also need to multiply it by the normalization factor, which is the product of the determinants of the Jacobian, the camera rotation (the rotation determinant is 1 because of orthonormality), and the square root of the covariance matrix. If I do this, I will get just the square root of the Vk (world reference frame) matrix.
However, in the code, I can't find any of these determinants or related multiplications in either the forward or the backward pass. Only the exponential part is used, without normalization, which confuses me a lot. The Jacobian is not a constant; it depends on the positions (3D means) of our Gaussians, so we can't simply omit it, nor can we omit det(Vk), which depends directly on our optimization parameters.
I would be very grateful if you could clarify either where we do that part or why we don't need to do it.
Thank you in advance!
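For readers without the source at hand, the kind of blending the post describes (the exponential part only, with no determinant prefactor) can be sketched as follows. This is a simplified Python illustration, not the contents of forward.cu; the data layout, the function names, and the threshold values are assumptions made for the sketch.

```python
import math

def gaussian_weight(dx, dy, conic):
    """exp(-0.5 * d^T Sigma^{-1} d) with no 1/(2*pi*sqrt(det)) prefactor.

    conic = (a, b, c) encodes the inverse 2D covariance [[a, b], [b, c]].
    """
    a, b, c = conic
    power = -0.5 * (a * dx * dx + c * dy * dy) - b * dx * dy
    return math.exp(power)

def blend_pixel(pixel, splats, min_alpha=1.0 / 255.0, min_transmittance=1e-4):
    """Front-to-back alpha blending over depth-sorted splats.

    Each splat is a dict with 'mean' (x, y), 'conic', 'opacity', 'color'.
    Threshold values are illustrative assumptions.
    """
    px, py = pixel
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for s in splats:
        dx, dy = s["mean"][0] - px, s["mean"][1] - py
        # Opacity at the center times the unnormalized Gaussian falloff.
        alpha = min(0.99, s["opacity"] * gaussian_weight(dx, dy, s["conic"]))
        if alpha < min_alpha:
            continue  # negligible contribution
        if transmittance * (1.0 - alpha) < min_transmittance:
            break  # pixel is effectively opaque already
        for ch in range(3):
            color[ch] += s["color"][ch] * alpha * transmittance
        transmittance *= 1.0 - alpha
    return color, transmittance

# Two half-transparent splats centered on the pixel: red in front, green behind.
splats = [
    {"mean": (0.0, 0.0), "conic": (1.0, 0.0, 1.0), "opacity": 0.5, "color": (1.0, 0.0, 0.0)},
    {"mean": (0.0, 0.0), "conic": (1.0, 0.0, 1.0), "opacity": 0.5, "color": (0.0, 1.0, 0.0)},
]
result, remaining = blend_pixel((0.0, 0.0), splats)
print(result, remaining)
```

In this form the learned opacity is the alpha value at the center of each 2D Gaussian, which is the interpretation discussed in the comments.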