- YAOYI BAI (pennkey: byaoyi)
- Tested on: Windows 10 Professional, i7-6700HQ @2.60GHz 16GB, GTX 980M 8253MB (My own Dell Alienware R3)
- Browser: Google Chrome
- Resolution: 3840*2160 pixels
The view frustum is divided linearly in camera space, which means that every cluster is a small frustum (a truncated pyramid, or an irregular "cube"). The intersection test between a light and a cluster is therefore the most important part of this step. The test projects the light position onto the normals of four surfaces of the cluster: top, bottom, left, and right. For the near and far surfaces, we simply compare the z-value of the light with the z-values of the near and far surfaces.
To check for an intersection, there are a few steps:
- Calculate the z-value at which the light lies;
- Calculate the width and height of the frustum cross-section at that depth, and divide them by the number of slices along each axis;
- Calculate the top-left, bottom-left, top-right, and bottom-right corners of the cluster on the near plane;
- Calculate the normals of the left, right, top, and bottom surfaces;
- Calculate the projection of the vector from the camera to the light position (in camera space) onto each surface normal;
- If the projection is larger than 0, check whether its length is larger than the radius of the light. If it is, the light lies outside the cluster; if not, check the next plane.
After all these checks, we write the indices of the lights that lie inside the cluster to the _clusterTexture buffer.
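The steps above can be sketched in JavaScript as follows. This is a minimal illustration, not the project's actual code: the function name `lightIntersectsCluster`, the object layouts, and the parameters `halfW`/`halfH` (half-width and half-height of the frustum at unit depth) are all hypothetical, and the camera is assumed to sit at the origin looking down the negative z-axis. The test is conservative, so a light near a cluster corner may be accepted even though it does not truly overlap.

```javascript
// Assumed conventions (not from the project): camera at the origin,
// looking down -z; halfW/halfH are the frustum half-extents at depth 1.
function dot(a, b) {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

function normalize(v) {
  const len = Math.hypot(v[0], v[1], v[2]);
  return [v[0] / len, v[1] / len, v[2] / len];
}

// light:   { pos: [x, y, z] in camera space, radius }
// cluster: { i, j, zNear, zFar } with zNear/zFar as positive depths
// frustum: { halfW, halfH, nx, ny } with nx/ny slices in x and y
function lightIntersectsCluster(light, cluster, frustum) {
  const { pos, radius } = light;
  const depth = -pos[2];

  // Near and far surfaces: a plain depth comparison.
  if (depth + radius < cluster.zNear || depth - radius > cluster.zFar) {
    return false;
  }

  // Cluster bounds on the plane at depth 1.
  const stepX = (2 * frustum.halfW) / frustum.nx;
  const stepY = (2 * frustum.halfH) / frustum.ny;
  const x0 = -frustum.halfW + stepX * cluster.i;
  const x1 = x0 + stepX;
  const y0 = -frustum.halfH + stepY * cluster.j;
  const y1 = y0 + stepY;

  // Outward-facing normals of the four side planes, which all pass
  // through the camera position.
  const normals = [
    normalize([-1, 0, -x0]), // left
    normalize([1, 0, x1]),   // right
    normalize([0, -1, -y0]), // bottom
    normalize([0, 1, y1]),   // top
  ];

  // The light is outside if its center lies farther than one radius
  // beyond any side plane.
  for (const n of normals) {
    if (dot(n, pos) > radius) return false;
  }
  return true;
}
```

For example, with a 2 x 2 grid and `halfW = halfH = 1`, a light at `[-2.5, -2.5, -5]` with radius 0.5 falls inside cluster `(0, 0)`, while a light at `[2.5, 2.5, -5]` is rejected by the right plane.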
demo: lightNum = 300:
The clustering mechanism for deferred shading is basically the same. The difference between forward+ and deferred shading is the g-buffers: deferred shading first writes the geometry data into g-buffers and then reads them back in the lighting fragment shader, which should take less processing time than forward+.
demo: lightNum = 300 (low res version):
Reference from Wikipedia.
demo: deferred shading lightNum = 300:
The core idea here is to convert the normal from Cartesian coordinates into a polar-style representation. We only have to store the y-value of the normal and the angle theta of the normal's projection onto the x-z plane, measured around the y-axis from the positive x-axis. Therefore, we can compact the data into two buffers:
gl_FragData[0] = vec4(v_position[0],v_position[1],v_position[2],norm[1]);
gl_FragData[1] = vec4(col[0],col[1],col[2],theta);
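To illustrate why the y-value and theta are enough, here is a small JavaScript sketch of the encode and decode steps. The function names `encodeNormal` and `decodeNormal` are hypothetical, and the sketch assumes the normal is unit length; in the project itself the decode would happen in the lighting fragment shader.

```javascript
// Encode a unit normal [x, y, z] as its y-value plus the angle of its
// x-z projection, measured from the positive x-axis.
function encodeNormal(n) {
  return { y: n[1], theta: Math.atan2(n[2], n[0]) };
}

// Rebuild the normal. For a unit vector, the length of the x-z
// projection is sqrt(1 - y^2); max() guards against rounding error.
function decodeNormal(enc) {
  const r = Math.sqrt(Math.max(0, 1 - enc.y * enc.y));
  return [r * Math.cos(enc.theta), enc.y, r * Math.sin(enc.theta)];
}
```

The round trip reproduces the original normal up to floating-point error, which is what allows the g-buffer to drop one of the three normal components.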
demo: deferred with Phong lightNum = 300:
The result is actually pretty slow, and I think the reasons the clustered renderer is so slow are:
- The intersection check could be revised: for each light, first find the planes that bound ("cover") the area the light touches, then loop only over the clusters between those planes. There would be fewer clusters to check, and the check would be much faster;
- Use logarithmic division for the z slices. With the current linear division, nearly all of the lights and geometry end up inside the z=0 slice, yet we still need to loop through all the clusters, so looping over the z>=2 clusters is wasted work. In this project, I terminate the loop for z>=3. Although this is a hack, the project runs much faster.
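Both ideas can be sketched together in JavaScript. This is only an illustration of one common logarithmic slicing formula, not code from the project: the names `logSliceIndex` and `lightSliceRange` are hypothetical, and depths are assumed to be positive distances from the camera with `near > 0`.

```javascript
// Map a positive view-space depth to a z-slice index so that slice
// boundaries grow geometrically from near to far.
function logSliceIndex(depth, near, far, numSlices) {
  const d = Math.min(Math.max(depth, near), far); // clamp into [near, far]
  const t = Math.log(d / near) / Math.log(far / near); // 0 at near, 1 at far
  return Math.min(numSlices - 1, Math.floor(t * numSlices));
}

// Range of z-slices a light can touch, so the cluster loop only visits
// slices between lo and hi instead of every slice.
function lightSliceRange(depth, radius, near, far, numSlices) {
  return {
    lo: logSliceIndex(depth - radius, near, far, numSlices),
    hi: logSliceIndex(depth + radius, near, far, numSlices),
  };
}
```

With `near = 1`, `far = 1000`, and 3 slices, depths 5, 50, and 500 land in slices 0, 1, and 2 respectively, whereas a linear division would place both 5 and 50 in slice 0.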
According to the graph above, we can see that:
- Compacting the g-buffer gives a great performance improvement, especially when the number of lights is relatively high;
- Deferred shading is not always faster than forward+.
Still, the analysis above suggests several ways to improve the performance of this project.
- Three.js by @mrdoob and contributors
- stats.js by @mrdoob and contributors
- webgl-debug by Khronos Group Inc.
- glMatrix by @toji and contributors
- minimal-gltf-loader by @shrekshao