-
Notifications
You must be signed in to change notification settings - Fork 25
/
Copy pathreadme.html
199 lines (138 loc) · 10.1 KB
/
readme.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
<html>
<head>
<title>CUDA Cubic B-Spline Interpolation (CI) read me</title>
</head>
<body>
<h1>CUDA Cubic B-Spline Interpolation (CI)</h1>
<h2>Version 1.2</h2>
<p>
This <i>read me</i> serves as a quick guide to using the CUDA Cubic B-Spline Interpolation (abbreviated as CI) code.
The most recent version of CI and some background information can be found <a href="http://www.dannyruijters.nl/cubicinterpolation/">online</a>.
To execute and compile CI you need <a href="http://www.nvidia.com/object/cuda_get.html">CUDA and the CUDA SDK (2.0 or higher)</a>. Read <a href="examples/CUDA5_important_readme.txt">this</a>, if you are using CUDA 5.
This software has been released under a revised BSD style <a href="license.txt">license</a>.
</p>
<h2>Getting started</h2>
<p>
If you want to simply replace linear interpolation by cubic filtering, then all you need to do is to include the appropriate header and replace your <tt>tex1D</tt>, <tt>tex2D</tt> and <tt>tex3D</tt> calls by one of the following:
</p>
<p>
1D textures:<br>
<tt>cubicTex1D(texture tex, float x)</tt><br>
</p>
<p>
2D textures:<br>
<tt>cubicTex2D(texture tex, float x, float y)</tt><br>
<tt>cubicTex2D(texture tex, float2 coord)</tt>
</p>
<p>
3D textures:<br>
<tt>cubicTex3D(texture tex, float x, float y, float z)</tt><br>
<tt>cubicTex3D(texture tex, float3 coord)</tt>
</p>
<p>
Whereby the texture coordinates <tt>coord</tt> are expressed in absolute pixel respectively voxel coordinates (thus not in normalized coordinates).
</p>
<p>
It is also possible to query the cubicly filtered 1st order derivative of the texture at a coordinate <tt>coord</tt>; e.g. in the x-direction:
<br>
<tt>cubicTex3D_1st_derivative_x(texture tex, float3 coord)</tt>
</p>
<p>
The calls for the y- and z-direction are similar, and also the derivatives of 1D and 2D textures can be retrieved in the same way. The cost of querying the derivative in a single direction costs the same amount of time as a normal cubicly filtered lookup in the texture. The gradient of the texture at a specified coordinate can be composed by querying the derivatives in x-, y- and z-direction at that location.
</p>
<h2>Pre-filtering</h2>
<p>
When the approach described above is directly applied, it will result in smoothened images. This is caused by the fact that the cubic B-spline filtering yields a function that does not pass through its coefficients (i.e. texture values). In order to wind up with a cubic B-spline interpolated image that passes through the original samples, we need to pre-filter the texture, as is beautifully described by <a href="http://bigwww.epfl.ch/thevenaz/interpolation/">Philippe Thévenaz <i>et al.</i></a>
</p>
<p>
Luckily, CI also provides a CUDA implementation of this prefilter, and using it is rather simple. The interface for the 2D case is:<br>
<tt>CubicBSplinePrefilter2D(float* image, uint pitch, uint width, uint height);</tt>
</p>
<p>
and for the 3D case:<br>
<tt>CubicBSplinePrefilter3D(float* volume, uint pitch, uint width, uint height, uint depth);</tt>
</p>
<p>
The <tt>image</tt> and <tt>volume</tt> should point to the GPU memory containing the data samples. Note that the sample values will be replaced by the cubic B-spline coefficients. The <tt>pitch</tt> variable describes the width of a row in the image in bytes. <tt>width</tt>, <tt>height</tt>, and <tt>depth</tt> describe the extent of the data in pixels/voxels.
Instead of <tt>float*</tt>, it is also possible to pass <tt>float2*</tt>, <tt>float3*</tt>, or <tt>float4*</tt> data (e.g., for RGB color data).
</p>
<h2>Getting your data there</h2>
<p>
In order to make your even easier, CI also provides some routines to transfer your data to the GPU memory, and back. Copying your sample values to the GPU memory can be accomplished by this function:<br>
<tt>cudaPitchedPtr CopyVolumeHostToDevice(const float* host, uint width, uint height, uint depth);</tt><br>
</p>
<p>
The routine allocates GPU memory and copies the sample values from CPU to GPU memory. The pointer to the CPU memory is passed as <tt>host</tt>, and a pointer to the GPU memory is returned. The counterpart that copies data from the GPU memory to the CPU memory is called:<br>
<tt>void CopyVolumeDeviceToHost(float* host, cudaPitchedPtr device, uint width, uint height, uint depth);</tt>
</p>
<p>
Note that the <tt>host</tt> destination CPU memory should be allocated before <tt>CopyVolumeDeviceToHost</tt> is called. This routine will also free the GPU memory.
</p>
<p>
Often you will have your original data in a different format than <tt>float</tt>, and for large data sets it costs some time (and memory) to cast everything to <tt>float</tt>. Therefore CI also provides a number of functions that copy and cast your data immediately to <tt>float</tt> on the GPU, which is faster and easier. In C++ you can use the following template function:<br>
<tt>template<class T> extern cudaPitchedPtr CastVolumeHostToDevice(const T* host, uint width, uint height, uint depth);</tt>
</p>
<p>
The usage of the parameters is the same as for <tt>CopyVolumeHostToDevice</tt>, except that <tt>host</tt> can be of the type <tt>uchar</tt>, <tt>schar</tt>, <tt>ushort</tt> or <tt>short</tt>. Note that the sample values will be normalized, meaning that the maximum value (e.g. 255 for <tt>uchar</tt>) will be mapped to 1.0.
</p>
<p>
The following function can be used to copy the output of the pre-filter functions into a texture:<br>
<tt>void CreateTextureFromVolume(texture* tex, cudaArray** texArray, const cudaPitchedPtr volume, cudaExtent extent, bool onDevice);</tt>
</p>
<h2>Using layered textures</h2>
<p>
It is possible to use the cubic interpolation texture lookups with layered textures.
In order to do this, you need to add:<br>
<tt>#define cudaTextureType2DLayered</tt><br>
to your code, and replace the calls to:<br>
<tt>tex2DLayered(texture tex, float x, float y, int layer)</tt><br>
with:<br>
<tt>cubicTex2DLayered(texture tex, float x, float y, int layer)</tt><br>
For 1D it works similarly.
</p>
<h2>Example programs</h2>
<p>
Some example programs are provided along with the CI code, to illustrate the usage of the various routines.
In order to compile the example programs, you first need to make sure that the <a href="http://www.nvidia.com/object/cuda_get.html">CUDA SDK examples</a> can be compiled. Then it is simply a matter of opening the Visual Studio solution on Windows machines, or running <tt>make</tt> in the example folder on the Mac or Linux.
<ul>
<li> <b>cudaCubicRotate2D</b>
uses 2D cubic interpolation to apply a user defined rotation and translation to a given image. The code is based on the example program provided by
<a href="http://bigwww.epfl.ch/thevenaz/interpolation/">Philippe Thévenaz</a>, and is adapted to make use of the CI routines. The main program is in C rather than in C++, and therefore shows how to exploit CUDA in general and CI in particular from within C code. When comparing the resulting images with the counterparts generated with the original source code, it becomes immediately apparent that presently CI does not support mirroring periodicity when exceeding the image extent.
<li> <b>cudaCubicTexture3D</b> is an adapted version of the simpleTexture3D that can be found in the CUDA SDK 2.0. It shows how the CI code can be used to pre-filter the texture data, and how to perform the cubic interpolation. This example program also illustrates the cubic interpolation image quality compared to nearest neighbor, linear and non pre-filtered interpolation, since it can switch between those on the fly by pressing the 'f' key.
<li> <b>referenceCubicTexture3D</b>
performs the same task as simpleCubicTexture3D, but uses the CPU instead of the GPU. It is meant to generate a ground truth for performance measurements, and allows you to repeat those measurements on your own hardware.
<li> <b>cudaCubicRayCast</b> is a very simple CUDA raycasting program that demonstrates the merits of cubic interpolation (including prefiltering) in 3D volume rendering.
<li> <b>glCubicRayCast</b> shows raycasting with cubic interpolation using pure OpenGL, without CUDA. However, this example also lacks the prefiltering of the voxel data.
<li> <b>cudaCubicAviPlayback</b> prefilters every frame of an AVI movie on the fly. This example illustrates how the CI code can be used with vector data (such as RGBA color data).
<li> <b>cudaAccuracyTest</b> creates 1D, 2D and 3D prefiltered textures, and uses the cubic interpolation routines to reconstruct the values at the sample knots. The cubicly interpolated values are then compared to the original (randomly generated) sample values.
</ul>
</p>
<h2>Background</h2>
<p>
More background information to the CI code is provided
<a href="http://www.dannyruijters.nl/cubicinterpolation/">online</a>.
A comprehensive discussion of uniform B-spline interpolation and the pre-filter can be found in [1].
The GPU implementation is described in [2].
The fast cubic B-spline interpolation is an adapted version of the method introduced by Sigg and Hadwiger [3].
A description of the adapted algorithm, its merits and its drawbacks is given in [4].
</p>
<p>
<ol>
<li> Philippe Thévenaz, Thierry Blu, and Michael Unser,
"<a href="http://bigwww.epfl.ch/publications/thevenaz0002.pdf">Interpolation Revisited</a>,"
IEEE Transactions on Medical Imaging, vol. 19, no. 7, pp. 739-758, July 2000.<p>
<li> Daniel Ruijters and Philippe Thévenaz,
"<a href="http://dannyruijters.nl/docs/cudaPrefilter3.pdf">GPU Prefilter for Accurate Cubic B-Spline Interpolation</a>,"
The Computer Journal, vol. 55, no. 1, pp. 15-20, January 2012.<p>
<li> Christian Sigg and Markus Hadwiger,
"<a href="http://developer.download.nvidia.com/SDK/9.5/Samples/DEMOS/OpenGL/src/fast_third_order/docs/Gems2_ch20_SDK.pdf">Fast Third-Order Texture Filtering</a>," In GPU Gems 2: Programming Techniques for High-Performance Graphics and General-Purpose Computation, Matt Pharr (ed.), Addison-Wesley; chapter 20, pp. 313-329, 2005.<p>
<li> Daniel Ruijters, Bart M. ter Haar Romeny, and Paul Suetens,
"<a href="http://www.mate.tue.nl/mate/pdfs/10318.pdf">Efficient GPU-Based Texture Interpolation using Uniform B-Splines</a>,"
Journal of Graphics Tools, vol. 13, no. 4, pp. 61-69, 2008.
</ol>
</p>
<p>
Copyright 2008-2013 Danny Ruijters
</p>
</body>
</html>