An optimized fine-grained implementation of the Particle Swarm Optimization algorithm on GPUs using CUDA programming.
📜Here's the paper with our findings
First, load cuda-12.4 using the command: module load cuda-12.4
There are 4 files; one of them requires the common.h header file. All of them have default arguments set, so you don't need to pass any.
If you'd like to override the defaults, the flags and sample arguments for each file are listed below.
Compile using GCC - gcc -o seq cpu_pso.c
Sample command - ./seq -m 10000 -d 4 -c 10
- -m (max iterations)
- -c (particle count)
- -d (dimensions)
- -v (verbose, to output argument values while running the code)
- -m (max iterations)
- -c (particle count)
- -d (dimensions)
- -t (threads per block)
- -b (blocks)
- -v (verbose, to output argument values when running the code)
Compile this file via nvcc -o coarse
Sample command - ./coarse -m 10000 -d 4 -t 64 -c 4096 -v 1
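If you want to choose -b yourself for a given -t and -c, one common CUDA convention is to use just enough blocks to cover every work item. The helper below is a sketch of that convention only; blocks_for is a hypothetical name, and it assumes one thread per particle, which the kernels in this repository may or may not do.

```c
#include <assert.h>

/* Hypothetical helper: smallest block count whose threads cover all work items. */
int blocks_for(int work_items, int threads_per_block)
{
    assert(work_items > 0 && threads_per_block > 0);
    /* Ceiling division, so a partially filled last block is still counted. */
    return (work_items + threads_per_block - 1) / threads_per_block;
}
```

With the sample values, blocks_for(4096, 64) gives 64. When the particle count is not a multiple of the threads per block, the last block has spare threads, so a kernel launched this way would normally check its index against the particle count before doing any work.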
Compile this file via nvcc -o finegrained
Sample command - ./finegrained -m 10000 -d 4 -c 4096
- -m (max iterations)
- -c (particle count)
- -d (dimensions)
- -t (threads per block)
- -b (blocks)
- -v (verbose, to output argument values when running the code)
Compile this file via nvcc -o adv
Sample command - ./adv -m 10000 -c 4096 -d 4 -t 4 -b 2048 -v 1
- -m (max iterations)
- -c (particle count)
- -d (dimensions)
- -t (threads per block)
- -b (blocks)
- -v (verbose, to output argument values when running the code)