giaosame/Project3-CUDA-Path-Tracer

 
 

University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 3

  • Qiaosen Chen
  • Tested on: Windows 10, i5-9400 @ 2.90GHz 16GB, GeForce RTX 2060 6GB (personal computer).

Summary

pathtracing first demo gif

Features

  • Ray sorting by material
  • Ideal Diffuse Shading & Bounce
  • Perfect Specular Reflection
  • Stream Compaction
  • Cache first bounce
  • Refraction with Fresnel effects using Schlick's approximation
  • Physically-based depth-of-field
  • Stochastic Sampled Antialiasing
  • Arbitrary mesh loading and rendering glTF files with toggleable bounding volume intersection culling
  • Better hemisphere sampling methods
  • Direct lighting
  • Motion blur by averaging samples at different times in the animation (Extra Credit)
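The refraction feature above relies on Schlick's approximation, R(θ) = R0 + (1 − R0)(1 − cos θ)^5 with R0 = ((n1 − n2) / (n1 + n2))^2. The following is a minimal host-side C++ sketch of that formula, not the code from this repository; the function names `schlickReflectance` and `chooseReflection` are hypothetical, and total internal reflection is assumed to be handled separately.

```cpp
#include <cassert>
#include <cmath>

// Schlick's approximation to the Fresnel reflectance.
// cos_theta: cosine of the angle between the incoming ray and the normal, in [0, 1].
// n1, n2:    indices of refraction on the incident and transmitted sides.
double schlickReflectance(double cos_theta, double n1, double n2) {
    assert(cos_theta >= 0.0 && cos_theta <= 1.0);
    assert(n1 > 0.0 && n2 > 0.0);
    double r0 = (n1 - n2) / (n1 + n2);
    r0 *= r0;                                  // reflectance at normal incidence
    const double m = 1.0 - cos_theta;
    return r0 + (1.0 - r0) * m * m * m * m * m;
}

// Stochastic choice between reflection and refraction for one path.
// xi is a uniform random number in [0, 1) supplied by the caller.
bool chooseReflection(double cos_theta, double n1, double n2, double xi) {
    return xi < schlickReflectance(cos_theta, n1, n2);
}
```

For an air-to-glass interface (n1 = 1.0, n2 = 1.5) this gives a reflectance of 4% at normal incidence, rising to 100% at grazing angles, which is why glass spheres look more mirror-like near their silhouette.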

Rendered Images

  • Specular surface reflection & Diffuse surface reflection & Refraction

    Specular Surface Diffuse Surface Refraction
  • glTF mesh loading

    Venus Spear Bearer Sparta
  • Physically-based depth-of-field (Focal Distance = 10)

    No Depth-of-field (Lens Radius) With Depth-of-field (Lens Radius)
  • Stochastic Sampled Antialiasing

    With Anti-Aliasing Without Anti-Aliasing
  • Stratified sampling method

    Random sampling Stratified sampling

    It is hard to see any obvious difference between naive random sampling and stratified sampling in these renders.

  • Direct Lighting

    Indirect Lighting Direct Lighting
  • Motion Blur

    Without Motion Blur With Motion Blur
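As a rough illustration of the stratified sampling compared above, the sketch below draws one jittered point per cell of an n × n grid and maps a point to a hemisphere direction. This is an assumption-laden C++ sketch rather than the repository's kernel code: the names `stratifiedSamples2D` and `cosineSampleHemisphere` are hypothetical, and the cosine-weighted mapping is only one of several mappings that could be paired with stratification.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>
#include <utility>
#include <vector>

struct Vec3 { double x, y, z; };

// One uniformly jittered point per cell of an n x n grid over [0,1)^2.
// Sample i lies in column (i % n) and row (i / n).
std::vector<std::pair<double, double>> stratifiedSamples2D(int n, std::mt19937& rng) {
    assert(n > 0);
    std::uniform_real_distribution<double> jitter(0.0, 1.0);
    std::vector<std::pair<double, double>> out;
    out.reserve(static_cast<std::size_t>(n) * n);
    for (int row = 0; row < n; ++row) {
        for (int col = 0; col < n; ++col) {
            const double u = (col + jitter(rng)) / n;
            const double v = (row + jitter(rng)) / n;
            out.emplace_back(u, v);
        }
    }
    return out;
}

// Map (u, v) in [0,1)^2 to a cosine-weighted direction in the local frame
// whose +z axis is the surface normal. The result has unit length.
Vec3 cosineSampleHemisphere(double u, double v) {
    const double kPi = 3.14159265358979323846;
    const double r = std::sqrt(u);
    const double phi = 2.0 * kPi * v;
    return { r * std::cos(phi), r * std::sin(phi), std::sqrt(std::max(0.0, 1.0 - u)) };
}
```

Because every grid cell receives exactly one sample, the directions cover the hemisphere more evenly than independent random draws, but the gain is small in a simple Cornell box scene, which is consistent with the near-identical images.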

Performance Analysis

I used a Cornell box scene with a specular sphere for the following tests. Using the function cudaEventElapsedTime(), declared in cuda_runtime_api.h, I measured how long the GPU takes per path tracing iteration by creating two cudaEvent_t variables, iter_event_start and a matching end event, one to record the start time of an iteration and the other to record the end time. After each iteration, I added its running time to a variable gpu_time_accumulator, which accumulates the total time of all iterations. Dividing that total by the number of iterations gives the average time per iteration.
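The averaging bookkeeping described above can be sketched as follows. So that the example runs without a GPU, it uses the host clock from `std::chrono` as a stand-in for CUDA events; the struct `IterationTimer` and the helper `timeOnceMs` are hypothetical names, not identifiers from this repository.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

// Accumulates per-iteration times and reports their mean.
// total_ms plays the role of gpu_time_accumulator in the text.
struct IterationTimer {
    double total_ms = 0.0;
    int iterations = 0;

    void addIteration(double elapsed_ms) {
        assert(elapsed_ms >= 0.0);
        total_ms += elapsed_ms;
        ++iterations;
    }

    double averageMs() const {
        return iterations > 0 ? total_ms / iterations : 0.0;
    }
};

// Times one call of `work` in milliseconds using the host clock.
// In a CUDA build, the equivalent measurement would bracket the kernel
// launches with cudaEventRecord(start) and cudaEventRecord(stop), call
// cudaEventSynchronize(stop), and read the result with
// cudaEventElapsedTime(&ms, start, stop).
template <typename F>
double timeOnceMs(F&& work) {
    const auto t0 = std::chrono::steady_clock::now();
    std::forward<F>(work)();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

A render loop would call `timer.addIteration(timeOnceMs(...))` once per iteration and print `timer.averageMs()` at the end. Host-clock timing also includes launch overhead, so events remain the better choice for measuring GPU work itself.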

  • Without Sorting vs. Sorting by Material

    Material Sort VS Not Sort Pic

    Rendering without sorting performs much better than rendering with sorting. In fact, using thrust::sort_by_key to make pathSegments with the same material contiguous in memory roughly doubles the time needed to finish each iteration. In my opinion, this is probably because very few materials are used in the scene. Only 5 materials are used in this test, so sorting pathSegments by material costs more time than it saves during shading.

  • Without Cache vs. First-Bounce Cache

    Cache VS Not Cache Pic

    As we can see, caching the data computed in the first bounce for reuse in later iterations performs better, although it saves only a few milliseconds per iteration compared with not storing the first bounce. It is also clear that the running time of each iteration increases as the depth increases.

  • Without vs. With Bounding Box Intersection Culling

    For the Venus scene, rendering with bounding box intersection culling clearly outperforms rendering without a bounding box.

    Not Bounding Box VS Bounding Box Intersection Culling
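Bounding box culling of the kind measured above usually comes down to a cheap ray/box test performed before any triangle of the mesh is examined. The following C++ sketch shows the common slab test for an axis-aligned box. It is an illustration under assumptions, not the repository's implementation; `Aabb` and `rayHitsAabb` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <utility>

struct Vec3 { double x, y, z; };
struct Aabb { Vec3 min, max; };   // axis-aligned bounding box of a mesh

// Slab test: true if the ray origin + t * dir, t >= 0, intersects the box.
bool rayHitsAabb(const Vec3& origin, const Vec3& dir, const Aabb& box) {
    const double o[3]  = { origin.x, origin.y, origin.z };
    const double d[3]  = { dir.x, dir.y, dir.z };
    const double lo[3] = { box.min.x, box.min.y, box.min.z };
    const double hi[3] = { box.max.x, box.max.y, box.max.z };

    double t_min = 0.0;                                    // ray starts at t = 0
    double t_max = std::numeric_limits<double>::infinity();
    for (int axis = 0; axis < 3; ++axis) {
        assert(lo[axis] <= hi[axis]);
        if (d[axis] == 0.0) {
            // Ray is parallel to this slab: it must already be inside it.
            if (o[axis] < lo[axis] || o[axis] > hi[axis]) return false;
            continue;
        }
        double t0 = (lo[axis] - o[axis]) / d[axis];
        double t1 = (hi[axis] - o[axis]) / d[axis];
        if (t0 > t1) std::swap(t0, t1);
        t_min = std::max(t_min, t0);
        t_max = std::min(t_max, t1);
        if (t_min > t_max) return false;                   // slab intervals do not overlap
    }
    return true;
}
```

With a test like this in front of the per-triangle loop, rays that miss the box skip every triangle of the mesh, which would explain the speedup on a dense model such as Venus.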

Bloopers

Strange direct lighting Incorrect mesh surface normal calculation
