This repo demonstrates methods for running well-known serial (CPU) image processing operations on CUDA-powered GPUs.
- Stencil: an important communication pattern among threads within a block of a grid. Each output location reads its input from a fixed neighborhood of an array.
This repo contains stencil operations and demonstrates:
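Consider applying a 1D stencil to a 1D array of elements. Each output element is the sum of the input elements within a given radius. If the radius is n, then each output element is the sum of 2n + 1 input elements (n on each side plus the element itself). For example, a radius of 3 sums 7 elements: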
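The 1D stencil described above can be sketched on the CPU in plain Python. This is a minimal reference sketch, not the repo's GPU kernel; the function name `stencil_1d` and the choice to treat out-of-range neighbors as zero are assumptions made for illustration.

```python
def stencil_1d(data, radius):
    """Sum each element with its neighbors within `radius` (hypothetical helper).

    Assumption: neighbors that fall outside the array contribute 0.
    """
    n = len(data)
    out = [0] * n
    for i in range(n):
        lo = max(0, i - radius)        # clip the window at the left edge
        hi = min(n, i + radius + 1)    # clip the window at the right edge
        out[i] = sum(data[lo:hi])      # up to 2 * radius + 1 elements
    return out


result = stencil_1d([1, 2, 3, 4, 5], radius=1)
print(result)  # [3, 6, 9, 12, 9]
```

On a GPU, each iteration of the outer loop would typically map to one thread, which is why neighboring threads end up reading overlapping input elements.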
Similarly, we create an apron around each image tile, so that:
- The image tile can be cached in shared memory
- Each output pixel has access to its neighboring pixels within a certain radius R
- This means tiles in shared memory must be expanded with an apron that contains the neighboring pixels
- Only threads for pixels inside the apron border (the tile itself) write results. The remaining threads, which only load apron pixels, do nothing further
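The tile-plus-apron idea can be emulated on the CPU to show which positions are loaded and which positions write. The sketch below is an assumption-laden illustration, not the repo's kernel: `process_tile` and `tiled_box_sum` are hypothetical names, a nested list stands in for shared memory, the operation is a plain box sum, and pixels outside the image are read as 0.

```python
def process_tile(image, ty, tx, tile, radius):
    """Emulate one thread block: stage a tile plus its apron, then compute."""
    h, w = len(image), len(image[0])
    side = tile + 2 * radius
    # Stand-in for shared memory: the tile expanded by the apron on every side.
    shared = [[0] * side for _ in range(side)]
    for sy in range(side):
        for sx in range(side):
            gy = ty * tile + sy - radius   # global row of this staged pixel
            gx = tx * tile + sx - radius   # global column of this staged pixel
            if 0 <= gy < h and 0 <= gx < w:
                shared[sy][sx] = image[gy][gx]
    results = {}
    # Only interior positions (not the apron) produce output.
    for sy in range(radius, radius + tile):
        for sx in range(radius, radius + tile):
            gy = ty * tile + sy - radius
            gx = tx * tile + sx - radius
            if gy < h and gx < w:
                results[(gy, gx)] = sum(
                    shared[y][x]
                    for y in range(sy - radius, sy + radius + 1)
                    for x in range(sx - radius, sx + radius + 1)
                )
    return results


def tiled_box_sum(image, tile=2, radius=1):
    """Run every tile in turn and assemble the full output image."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for ty in range((h + tile - 1) // tile):
        for tx in range((w + tile - 1) // tile):
            for (gy, gx), value in process_tile(image, ty, tx, tile, radius).items():
                out[gy][gx] = value
    return out


ones = [[1] * 4 for _ in range(4)]
out = tiled_box_sum(ones, tile=2, radius=1)
print(out[0][0], out[0][1], out[1][1])  # 4 6 9
```

Every neighbor read in the compute loop hits the staged copy rather than the original image, which is the point of caching the expanded tile.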
Apply some basic operations such as thresholding, image sharpening, and averaging.
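As a rough CPU illustration of these three operations, the sketch below uses a per-pixel threshold and a 3x3 neighborhood kernel. The helper names, the clamped border handling, and the specific kernels (a 3x3 mean and a commonly used sharpening kernel) are assumptions chosen for the example; the repo's own kernels may differ.

```python
def threshold(image, level):
    """Map each pixel to 255 if it exceeds `level`, otherwise 0."""
    return [[255 if p > level else 0 for p in row] for row in image]


def apply_kernel3x3(image, kernel):
    """Weighted sum over each pixel's 3x3 neighborhood.

    Assumption: coordinates outside the image are clamped to the nearest edge.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    sy = min(max(y + ky - 1, 0), h - 1)
                    sx = min(max(x + kx - 1, 0), w - 1)
                    acc += kernel[ky][kx] * image[sy][sx]
            out[y][x] = acc
    return out


# Illustrative kernels (assumed, not taken from the repo).
AVERAGE = [[1 / 9] * 3 for _ in range(3)]
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]

image = [[10, 10, 10], [10, 100, 10], [10, 10, 10]]
binary = threshold(image, 50)                # only the center pixel becomes 255
blurred = apply_kernel3x3(image, AVERAGE)    # the bright center is spread out
sharpened = apply_kernel3x3(image, SHARPEN)  # the bright center is emphasized
```

Thresholding needs no neighbors, while averaging and sharpening are radius-1 stencils and so benefit from the tile-and-apron scheme.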
It loads all the feature vectors from the file, together with the vector to be compared, onto the GPU, then calculates the Kullback-Leibler divergence (a fundamental quantity in information theory that measures how far one probability distribution is from another) between the vectors on the GPU.
A CPU version is also provided in the same file.
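For reference, the Kullback-Leibler divergence between two discrete distributions can be sketched on the CPU as follows. This is a minimal sketch under stated assumptions, not the repo's CPU version: the names `kl_divergence` and `closest_vector` are hypothetical, each vector is assumed to be a normalized probability distribution, `q` is assumed to be nonzero wherever `p` is nonzero, and the direction D(query || feature) is an arbitrary choice.

```python
import math


def kl_divergence(p, q):
    """D(p || q) = sum_i p_i * log(p_i / q_i), skipping terms where p_i == 0.

    Assumption: q_i > 0 wherever p_i > 0, and both vectors sum to 1.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def closest_vector(query, features):
    """Return the index of the feature vector with the smallest D(query || feature)."""
    scores = [kl_divergence(query, f) for f in features]
    return min(range(len(scores)), key=scores.__getitem__)


features = [[0.1, 0.9], [0.5, 0.5], [0.8, 0.2]]
query = [0.5, 0.5]
best = closest_vector(query, features)
print(best)  # 1, since the query matches the second vector exactly
```

Each divergence is independent of the others, so on a GPU one thread (or one block) per feature vector is a natural mapping.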
- Ubuntu 16.04 LTS
- GeForce GTX 750 Ti (compute capability 5.0) with the CUDA toolkit
- Anaconda Python 3
- Numba: a JIT compiler for Python
- CUDA Python
- Numba
- Manuel Ujaldon, NVIDIA CUDA Fellow