University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 3
- Qiaosen Chen
- LinkedIn, etc.
- Tested on: Windows 10, i5-9400 @ 2.90GHz 16GB, GeForce RTX 2060 6GB (personal computer).
- Ray sorting by material
- Ideal Diffuse Shading & Bounce
- Perfect Specular Reflection
- Stream Compaction
- Cache first bounce
- Refraction with Fresnel effects using Schlick's approximation
- Physically-based depth-of-field
- Stochastic Sampled Antialiasing
- Arbitrary mesh loading and rendering glTF files with toggleable bounding volume intersection culling
- Better hemisphere sampling methods
- Direct lighting
- Motion blur by averaging samples at different times in the animation (Extra Credit)
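The refraction feature above relies on Schlick's approximation to decide how much light is reflected versus refracted at a dielectric surface. The following is a minimal host-side C++ sketch of that formula, not the project's actual kernel code; the function names `schlickReflectance` and `shouldReflect` are hypothetical, and total internal reflection is assumed to be handled separately by the caller.

```cpp
#include <cmath>

// Schlick's approximation of the Fresnel reflectance.
// cosTheta: cosine of the angle between the incoming ray and the surface normal, in [0, 1].
// etaI, etaT: indices of refraction on the incident and transmitted sides.
inline float schlickReflectance(float cosTheta, float etaI, float etaT) {
    float r0 = (etaI - etaT) / (etaI + etaT);
    r0 = r0 * r0;                      // reflectance at normal incidence
    float m = 1.0f - cosTheta;
    return r0 + (1.0f - r0) * m * m * m * m * m;
}

// Stochastic choice between reflection and refraction for one bounce.
// u is a uniform random number in [0, 1) supplied by the caller.
inline bool shouldReflect(float cosTheta, float etaI, float etaT, float u) {
    return u < schlickReflectance(cosTheta, etaI, etaT);
}
```

For an air-to-glass interface (indices 1.0 and 1.5), the reflectance is 0.04 at normal incidence and rises to 1 at grazing angles, which is what produces the brighter rim on refractive spheres.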
- Specular surface reflection & Diffuse surface reflection & Refraction

Specular Surface | Diffuse Surface | Refraction |
---|---|---|

- glTF mesh loading

Venus | Spear Bearer | Sparta |
---|---|---|

- Physically-based depth-of-field (Focal Distance = 10)

No Depth-of-field, Lens Radius | With Depth-of-field, Lens Radius |
---|---|
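Depth of field with a lens radius and focal distance is commonly implemented with a thin-lens camera model. The sketch below shows one way this could work, assuming a camera space where the lens sits at the origin in the z = 0 plane and the camera looks down +z; `thinLensRay` and the `Vec3`/`Ray` types are hypothetical names for illustration, and the project's own kernel may differ.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Ray  { Vec3 origin, direction; };

inline Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

// pinholeDir: direction of the original pinhole camera ray (pinholeDir.z > 0).
// u1, u2: uniform random numbers in [0, 1) supplied by the caller.
inline Ray thinLensRay(Vec3 pinholeDir, float lensRadius, float focalDistance,
                       float u1, float u2) {
    const float kPi = 3.14159265358979f;

    // Point where the pinhole ray crosses the plane of focus z = focalDistance.
    float t = focalDistance / pinholeDir.z;
    Vec3 focus = { pinholeDir.x * t, pinholeDir.y * t, focalDistance };

    // Uniformly sample a point on the lens disk.
    float r = lensRadius * std::sqrt(u1);
    float phi = 2.0f * kPi * u2;
    Vec3 lensPoint = { r * std::cos(phi), r * std::sin(phi), 0.0f };

    // The new ray starts on the lens and still passes through the focus point.
    Vec3 dir = normalize({ focus.x - lensPoint.x,
                           focus.y - lensPoint.y,
                           focus.z - lensPoint.z });
    return { lensPoint, dir };
}
```

With a lens radius of 0 the function returns the original pinhole ray, so everything stays sharp; a larger radius blurs geometry that lies away from the plane of focus.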
- Stochastic Sampled Antialiasing

With Anti-Aliasing | Without Anti-Aliasing |
---|---|

- Stratified sampling method

Random sampling | Stratified sampling |
---|---|

It is hard to see any obvious difference between the naive random sampling and the stratified sampling results.
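For reference, stratified sampling divides the unit square into a grid and draws one jittered sample per cell before mapping it onto the hemisphere. The sketch below is an illustration under assumptions: it uses an n x n grid and a cosine-weighted mapping around a local +z normal, and the names `stratifiedSample2D` and `cosineHemisphere` are hypothetical rather than taken from the project.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Map sample index i (0 <= i < n * n) to a jittered point inside its own grid cell
// of the unit square. jitterU and jitterV are uniform random numbers in [0, 1).
inline void stratifiedSample2D(int i, int n, float jitterU, float jitterV,
                               float& u, float& v) {
    int cellX = i % n;
    int cellY = i / n;
    u = (cellX + jitterU) / n;
    v = (cellY + jitterV) / n;
}

// Cosine-weighted direction on the hemisphere around +z (local shading frame).
inline Vec3 cosineHemisphere(float u, float v) {
    const float kPi = 3.14159265358979f;
    float r = std::sqrt(u);
    float phi = 2.0f * kPi * v;
    float z = std::sqrt(std::fmax(0.0f, 1.0f - u));
    return { r * std::cos(phi), r * std::sin(phi), z };
}
```

Because each sample is confined to its own cell, the samples cannot clump the way purely random samples can; the benefit mainly shows up as lower noise at low sample counts, which may be why the converged images look so similar.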
- Direct Lighting

Indirect Lighting | Direct Lighting |
---|---|

- Motion Blur

Without Motion Blur | With Motion Blur |
---|---|
I used a Cornell box scene with a specular sphere for the following tests. Using the function `cudaEventElapsedTime()`, declared in `cuda_runtime_api.h`, I measured how long the GPU takes per path tracing iteration by creating two `cudaEvent_t` variables, `iter_event_start` and a second event for the end of the iteration, which record the start time and the end time of each iteration. After each iteration, I added its running time to a variable, `gpu_time_accumulator`, to accumulate the total time of all iterations. Finally, I divided by the number of iterations to get the average time per iteration.
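The accumulation step described above can be sketched as a small helper. This is a host-side C++ illustration only: the `IterationTimer` type is hypothetical, and the elapsed time is passed in as a plain number so the example runs without a GPU. In the real program that number would come from the CUDA event calls named in the comments.

```cpp
#include <cstddef>

// Accumulates per-iteration GPU times and reports the mean.
struct IterationTimer {
    float totalMs = 0.0f;          // plays the role of gpu_time_accumulator in the text
    std::size_t iterations = 0;

    // In the CUDA program, elapsedMs would be obtained roughly as follows:
    //   cudaEventRecord(start);  /* launch the iteration's kernels */  cudaEventRecord(stop);
    //   cudaEventSynchronize(stop);
    //   cudaEventElapsedTime(&elapsedMs, start, stop);
    void addIteration(float elapsedMs) {
        totalMs += elapsedMs;
        ++iterations;
    }

    // Average time per iteration in milliseconds (0 if nothing was recorded).
    float averageMs() const {
        return iterations == 0 ? 0.0f : totalMs / iterations;
    }
};
```

Synchronizing on the end event before reading the elapsed time matters because kernel launches are asynchronous; without it the measurement could be taken before the GPU has finished the iteration.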
- No Sorting vs. Sorting by Material

Running without sorting performs much better than running with sorting. In fact, using `thrust::sort_by_key` to make `pathSegments` with the same material contiguous in memory takes about twice as long per iteration. In my opinion, this is because very few materials are used in the scene. Only 5 materials are used in this test, so the cost of sorting `pathSegments` by material outweighs any benefit.
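To make the sorting step concrete, here is a host-side C++ analogue of a key-value sort that groups path segments by material. It uses `std::stable_sort` on an index permutation instead of `thrust::sort_by_key`, and the reduced `PathSegment` struct and the `sortByMaterial` function are hypothetical stand-ins for the project's own types.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Reduced, hypothetical version of a path segment; the real struct carries more state.
struct PathSegment { int pixelIndex; int remainingBounces; };

// Reorder both arrays so that segments sharing a material id become contiguous.
inline void sortByMaterial(std::vector<int>& materialIds,
                           std::vector<PathSegment>& segments) {
    // Sort a permutation of indices by material id (the "key").
    std::vector<std::size_t> order(materialIds.size());
    std::iota(order.begin(), order.end(), 0);
    std::stable_sort(order.begin(), order.end(),
        [&](std::size_t a, std::size_t b) { return materialIds[a] < materialIds[b]; });

    // Apply the permutation to the keys and to the values.
    std::vector<int> sortedIds;
    std::vector<PathSegment> sortedSegments;
    sortedIds.reserve(order.size());
    sortedSegments.reserve(order.size());
    for (std::size_t idx : order) {
        sortedIds.push_back(materialIds[idx]);
        sortedSegments.push_back(segments[idx]);
    }
    materialIds.swap(sortedIds);
    segments.swap(sortedSegments);
}
```

The sort runs over every active path on every bounce, so its cost is paid regardless of how many distinct materials exist. Sorting is more likely to pay off when the scene has many materials whose shading code differs substantially.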
- No Cache vs. First Bounce Cache

Caching the data computed in the first bounce for reuse in later iterations gives better performance, although it is only a few milliseconds faster than the version that does not store the first bounce. In addition, the running time of each iteration increases as the depth increases.
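The caching idea can be illustrated with a small helper that computes the first-bounce intersections once and reuses them afterwards. This is a simplified host-side sketch; the `FirstBounceCache` class and `Intersection` struct are hypothetical names, and the real implementation would keep the cached data in device memory.

```cpp
#include <vector>

// Reduced, hypothetical intersection record.
struct Intersection { float t; int materialId; };

class FirstBounceCache {
public:
    // computeFirstBounce is any callable returning the first-bounce intersections.
    // It is invoked only when no valid cached result exists.
    template <typename F>
    const std::vector<Intersection>& get(F computeFirstBounce) {
        if (!valid_) {
            cached_ = computeFirstBounce();
            valid_ = true;
        }
        return cached_;
    }

    // Call when the camera or scene changes so the next get() recomputes.
    void invalidate() { valid_ = false; }

private:
    std::vector<Intersection> cached_;
    bool valid_ = false;
};
```

Reusing the first bounce is only correct when the camera rays are identical in every iteration, so the cache has to be disabled when per-iteration jitter is in use, for example with stochastic antialiasing or depth of field.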
- No Bounding Box vs. Bounding Box Intersection Culling

For the Venus scene, rendering with bounding box intersection culling clearly outperforms rendering without it.
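Bounding box culling typically works by testing each ray against the mesh's axis-aligned bounding box first and skipping all triangle tests when the ray misses. The sketch below shows the standard slab test in host-side C++; the function name `rayHitsAabb` and the `Vec3` type are hypothetical, and the project's own test may be implemented differently.

```cpp
#include <algorithm>
#include <limits>
#include <utility>

struct Vec3 { float x, y, z; };

// Slab test: returns true if the ray origin + t * dir (t >= 0) intersects the box.
inline bool rayHitsAabb(Vec3 origin, Vec3 dir, Vec3 boxMin, Vec3 boxMax) {
    const float o[3]  = { origin.x, origin.y, origin.z };
    const float d[3]  = { dir.x, dir.y, dir.z };
    const float lo[3] = { boxMin.x, boxMin.y, boxMin.z };
    const float hi[3] = { boxMax.x, boxMax.y, boxMax.z };

    float tNear = 0.0f;
    float tFar = std::numeric_limits<float>::infinity();

    for (int axis = 0; axis < 3; ++axis) {
        if (d[axis] == 0.0f) {
            // Ray is parallel to this slab: it must start between the two planes.
            if (o[axis] < lo[axis] || o[axis] > hi[axis]) return false;
            continue;
        }
        float inv = 1.0f / d[axis];
        float t0 = (lo[axis] - o[axis]) * inv;
        float t1 = (hi[axis] - o[axis]) * inv;
        if (t0 > t1) std::swap(t0, t1);
        tNear = std::max(tNear, t0);   // latest entry across all slabs
        tFar  = std::min(tFar, t1);    // earliest exit across all slabs
        if (tNear > tFar) return false;
    }
    return true;
}
```

The test costs a handful of arithmetic operations per ray, while a mesh like Venus contains many triangles, so rays that miss the box avoid almost all of the intersection work.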
Strange direct lighting | Incorrect mesh surface normal calculation |
---|---|