Developer Info
The general layout of the source code is based on the following directories:
- `Math`: Linear algebra math classes as well as sampling functions and function integrators.
- `Base`: General purpose classes such as timing, fixed size strings, high performance file streams and random number generators; also the CUDA memory manager and the base classes for pseudo-virtual classes in CUDA.
- `Engine`: All basic components a rendering algorithm needs, such as a Scene, texture and mesh loading.
- `Kernel`: Ray tracing operations, buffer management on CPU and GPU, the ImagePipeline to tonemap and filter accumulated images, and all PixelDebugVisualizers to output debug information on pixels.
- `Integrators`: All rendering algorithms, some ported to "Wavefront Path Tracing" for efficiency on the GPU.
- `SceneTypes`: BSDFs, Emitters, Sensors, Image filters, Textures; all the polymorphic types present in a Monte Carlo path tracing library.
The host side overview shown above makes it clear that polymorphism would be a useful concept for such an implementation. However, due to technical necessities CUDA does not support creating objects of classes with virtual functions on the host and copying them to the device. To circumvent this issue, a small helper class `CudaVirtualAggregate` is used to store classes with virtual functions. For users of the library this adds the small inconvenience that new objects cannot be created like this:

```cpp
new PerspectiveSensor(fov)
```

Instead, one has to use:

```cpp
CreateAggregate<Sensor>(PerspectiveSensor(fov))
```

Here the template argument specifies the base class of the type to be constructed.
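The underlying reason is general CUDA behavior rather than anything specific to this library: a polymorphic object stores a vtable pointer into host memory, which becomes dangling when the object is copied to the device. A minimal illustration (not library code):

```cpp
#include <cuda_runtime.h>

struct Base {
    __host__ __device__ virtual float value() const { return 0.f; }
};

int main()
{
    Base h;                        // host object: its vtable pointer targets host memory
    Base* d = nullptr;
    cudaMalloc(&d, sizeof(Base));
    cudaMemcpy(d, &h, sizeof(Base), cudaMemcpyHostToDevice);
    // Calling d->value() in a kernel would dereference the copied host
    // vtable pointer, which is undefined behavior on the device.
    cudaFree(d);
    return 0;
}
```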
On the host, the static parts of the library (FreeImage, SpectrumHelper and RoughTransmittanceManager) have to be initialized and deinitialized in the following way:

```cpp
InitializeCuda4Tracer("path to ior/microfacet folder");
...
DeInitializeCuda4Tracer();
```
Creating a scene is done like this:

```cpp
Sensor camera = CreateAggregate<Sensor>(PerspectiveSensor(width, height, fov));
DynamicScene scene(&camera, SceneInitData::CreateForScene(10, 10, 1000), &fManager);
```
Note that some parts of the library do not need a scene object but still require the static initialization!
The `SceneInitData` object describes the size of the scene about to be created, so that enough storage is allocated on the GPU. `fManager` points to an object implementing `IFileManager`, which tells the scene where to store the temporary compiled mesh objects and textures.
Before tracing any rays, the kernel module has to be initialized with:

```cpp
void UpdateKernel(DynamicScene* a_Scene);
```
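Putting the pieces together, a minimal host-side setup might look like the following sketch. It only uses the calls shown on this page; `width`, `height`, `fov` and `MyFileManager` (an `IFileManager` implementation) are assumed to be provided by the application:

```cpp
// Hedged sketch of the setup sequence described above.
InitializeCuda4Tracer("path to ior/microfacet folder");
{
    MyFileManager fManager;  // hypothetical IFileManager implementation
    Sensor camera = CreateAggregate<Sensor>(PerspectiveSensor(width, height, fov));
    DynamicScene scene(&camera, SceneInitData::CreateForScene(10, 10, 1000), &fManager);
    UpdateKernel(&scene);    // initialize the kernel module with the scene
    // ... trace rays or run an integrator here ...
}   // scene and fManager are destroyed before the library is shut down
DeInitializeCuda4Tracer();
```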
Now `traceRay` can be used to trace rays through the scene and obtain `TraceResult`s. The diagram above shows what can be done with such a result: `DifferentialGeometry` describes the geometry at the intersection, and a `BSDFSamplingRecord` is necessary to sample `BSDF` objects. For algorithmic development the class `KernelDynamicScene` is helpful; it provides all the sampling strategies that commonly occur during Monte Carlo ray tracing.
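To illustrate that workflow, here is a hedged sketch of intersecting a single ray; the exact signatures of `Ray`, `traceRay` and the `TraceResult` accessors are assumptions based on this description, not verified API:

```cpp
// Hedged sketch: every signature below is an assumption; only the type and
// function names come from this page.
Ray r(origin, direction);              // assumed constructor
TraceResult res = traceRay(r);         // trace the ray through the scene
if (res.hasHit())                      // assumed hit query
{
    DifferentialGeometry dg;           // geometry at the intersection
    BSDFSamplingRecord bRec(dg);       // assumed to reference that geometry
    // fill bRec from res and sample the BSDF here ...
}
```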
It is easy to implement custom integrators by deriving from

```cpp
template<bool USE_BLOCKSAMPLER, bool PROGRESSIVE> class Tracer
```

The first argument specifies whether an image space sampler should be used to sample unconverged pixels more often. The second specifies whether the tracer progressively samples a frame until convergence or generates a new frame each time it is called. If the integrator can be implemented in terms of sampling pixels, it is sufficient to override

```cpp
virtual void RenderBlock(Image* I, int x, int y, int blockW, int blockH);
```

This method should add pixel samples for the specified area. Other integrators, such as a photon tracer, have to override

```cpp
virtual void DoRender(Image* I);
```

and sample all pixels. In these methods the pixel samples can be computed either on the GPU by launching CUDA kernels or on the CPU. Thanks to this design, large parts of the code can be kept independent of whether they run on the device or the host. This helps during debugging, since one can simply implement

```cpp
void DebugInternal(Image* I, const Vec2i& p);
```

and use the host debugger to figure out what is going on.
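As a sketch of what such a custom integrator might look like, assuming only the `Tracer` interface quoted above (the sampling logic itself is left as a placeholder):

```cpp
// Hedged sketch of a custom integrator; only the Tracer<...> template and the
// RenderBlock signature are taken from this page, the rest is illustrative.
class MyIntegrator : public Tracer<true, true>  // block sampler + progressive
{
protected:
    virtual void RenderBlock(Image* I, int x, int y, int blockW, int blockH)
    {
        // Add pixel samples for [x, x + blockW) x [y, y + blockH),
        // either by launching a CUDA kernel or by looping on the CPU.
    }
};
```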
A complete example of how to use the library can be found in the wiki.
Note: this is no longer possible due to changes in CUDA 8.0.
A note of caution: on Windows one must link against ALL of the separate `*.cu.obj` object files from the original library; otherwise the CUDA linker will not be able to link the device functions.
Here are some notes about the code and its issues:
- The CUDA documentation states that an object of a class derived from virtual base classes may not be used in a device function. This probably has to do with the memory layout of objects with virtual functions. This library currently operates on the assumption that doing so is acceptable as long as no virtual functions are used. Solving this problem correctly will take a considerable amount of time. (Side note: with multiple inheritance this no longer works, due to mismatching host/device class layouts.)
- `size_t` vs `unsigned int`: in some places `size_t` is used while in others `unsigned int` is used. Ideally, only `size_t` would be used on the host and `unsigned int` on the device to reduce register pressure. This design was not implemented completely, and in some places the host side works with `unsigned int`.
- Memory management is not done consistently. While `CUDA_FREE` provides some debug info on where the memory is kept, it would have been wiser to use memory classes that keep track of copies of the same data on the host and on the device.
- The Buffer module is not `const` correct; there is no notion of `const` iterators, and all methods return normal references.
- Constructors are commonly ignored; e.g. the buffer classes use `malloc` instead of `new` to avoid making guarantees about using the constructor. The problem here is that, due to technical necessities, CUDA does not allow constructors on symbol (`__device__`/`__constant__`) variables, which are used extensively. A minimal illustration of this constraint follows after this list.
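The following sketch shows that constraint in isolation; it is general CUDA behavior, not library code:

```cpp
#include <cuda_runtime.h>

// Types stored in __device__/__constant__ symbols must not require dynamic
// initialization, so no non-trivial constructor may need to run at load time.
struct KernelBuffer {
    float*       data;
    unsigned int length;
};

__device__ KernelBuffer g_buffer;   // OK: trivially constructible

// struct Fancy { Fancy(); };
// __device__ Fancy g_fancy;        // nvcc error: dynamic initialization is not
//                                  // supported for __device__ variables
```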