Winter 2016
Course: Programming-Massively-Parallel-Processors-with-CUDA, Northwestern University, Evanston, IL
This lab focuses on producing correct code. It reinforces basic GPU/CUDA programming skills and builds familiarity with the software interface and the basic architecture of the device.
This lab focuses on data layout and decomposition, and on making full use of shared memory and global memory bandwidth by avoiding bank conflicts and coalescing memory accesses.
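As an illustration of bank conflict avoidance and memory coalescing, here is a minimal sketch of a tiled matrix transpose. It is not the lab's assigned problem or reference solution. The names transposeTiled, runTranspose, and TILE are hypothetical, and the padding trick assumes a device with 32 shared memory banks of 4 bytes each.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define TILE 32  // tile width; assumed equal to the warp size and the bank count

// Transposes a rows x cols row-major matrix `in` into a cols x rows matrix `out`.
__global__ void transposeTiled(const float *in, float *out, int rows, int cols)
{
    // One extra column of padding: without it, the column-wise read below
    // would map all 32 threads of a warp to the same bank.
    __shared__ float tile[TILE][TILE + 1];

    int x = blockIdx.x * TILE + threadIdx.x;  // column in `in`
    int y = blockIdx.y * TILE + threadIdx.y;  // row in `in`
    if (x < cols && y < rows)
        tile[threadIdx.y][threadIdx.x] = in[y * cols + x];  // coalesced read

    __syncthreads();  // the whole tile must be loaded before it is read back

    x = blockIdx.y * TILE + threadIdx.x;  // column in `out` (out has `rows` columns)
    y = blockIdx.x * TILE + threadIdx.y;  // row in `out`
    if (x < rows && y < cols)
        out[y * rows + x] = tile[threadIdx.x][threadIdx.y];  // coalesced write
}

// Host helper (hypothetical): runs the kernel and checks the result on the CPU.
bool runTranspose(int rows, int cols)
{
    std::vector<float> h_in(rows * cols), h_out(rows * cols, 0.0f);
    for (int i = 0; i < rows * cols; ++i) h_in[i] = static_cast<float>(i);

    float *d_in = nullptr, *d_out = nullptr;
    size_t bytes = h_in.size() * sizeof(float);
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_in); return false; }
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((cols + TILE - 1) / TILE, (rows + TILE - 1) / TILE);
    transposeTiled<<<grid, block>>>(d_in, d_out, rows, cols);

    // The blocking copy also reports errors from the kernel launch above.
    bool ok = cudaMemcpy(h_out.data(), d_out, bytes,
                         cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(d_in);
    cudaFree(d_out);

    for (int r = 0; ok && r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            if (h_out[c * rows + r] != h_in[r * cols + c]) { ok = false; break; }
    return ok;
}

int main()
{
    printf("transpose %s\n", runTranspose(100, 70) ? "OK" : "FAILED");
    return 0;
}
```

Both the global read and the global write walk consecutive addresses along threadIdx.x, so each warp touches one contiguous segment of memory. The transposition itself happens in shared memory, where the padded row length of 33 floats spreads a tile column across distinct banks.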
In this lab you are asked to define optimization goals and a strategy, implement them, and keep a research lab journal in which you report statistics and analyze every optimization you tried, including those that did not work or that degraded performance. For this assignment you will need to read recent research papers that outline some of the best-known ways to solve this problem.
This lab focuses on the application of efficient parallel algorithms that utilize shared memory and synchronization and minimize path divergence.
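As one example of such an algorithm, the sketch below sums an array with a per-block tree reduction in shared memory. It is an illustration under stated assumptions, not the lab's assigned problem: the names blockSum, gpuSum, and BLOCK are hypothetical, and the kernel assumes it is launched with exactly BLOCK threads per block, where BLOCK is a power of two.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define BLOCK 256  // threads per block; must be a power of two for the loop below

// Each block reduces BLOCK consecutive elements to one partial sum.
__global__ void blockSum(const float *in, float *partial, int n)
{
    __shared__ float s[BLOCK];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;
    s[tid] = (i < n) ? in[i] : 0.0f;  // out-of-range threads contribute zero
    __syncthreads();

    // Sequential addressing: the active threads are always 0..stride-1, so
    // whole warps are either fully active or fully idle until stride < 32.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride)
            s[tid] += s[tid + stride];
        __syncthreads();  // outside the branch, so every thread reaches it
    }
    if (tid == 0) partial[blockIdx.x] = s[0];
}

// Host helper (hypothetical): launches the kernel and adds the partial sums.
// Error checking is omitted to keep the sketch short.
double gpuSum(const std::vector<float> &h_in)
{
    int n = static_cast<int>(h_in.size());
    if (n == 0) return 0.0;
    int blocks = (n + BLOCK - 1) / BLOCK;
    std::vector<float> h_partial(blocks, 0.0f);

    float *d_in = nullptr, *d_partial = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_partial, blocks * sizeof(float));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    blockSum<<<blocks, BLOCK>>>(d_in, d_partial, n);

    cudaMemcpy(h_partial.data(), d_partial, blocks * sizeof(float),
               cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_partial);

    double total = 0.0;
    for (int b = 0; b < blocks; ++b) total += h_partial[b];
    return total;
}

int main()
{
    std::vector<float> data(1000000, 1.0f);
    printf("sum = %.1f\n", gpuSum(data));
    return 0;
}
```

An interleaved scheme that tests tid % (2 * stride) == 0 would compute the same result but leave active and idle threads mixed within every warp at each step, which is the kind of path divergence this design avoids.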