This project involves implementing various versions of a Multilayer Perceptron (MLP) kernel to improve execution time while maintaining accuracy.

    resources/
    scripts/
        infer.sh
        profile.sh
        pytorch_train.sh
    src/
        application_training/
            train.py
        kernels/
            __init__.py
            improved_kernel.py
            naive_kernel.py
            tensor_kernel.py
        model_weights/
            model.pt
        results/
            batch_times_mem.png
            batch_times_proc.png
            report_1024.ncu-rep
            report.ncu-rep
            time_taken_hidden_proc.png
            time_taken_hidden.png
        analysis.py
        config.yaml
        inference.py
        model.py
        profiler.py
        utils.py
    .gitignore
    ReadME.md

The goal is to optimize different implementations of MLP kernels to achieve better execution times without sacrificing accuracy.
- scripts/: Contains shell scripts for training, inference, and profiling.
  - pytorch_train.sh: Script to train the PyTorch MLP model to test performance.
  - infer.sh: Script to perform inference using the trained model.
  - profile.sh: Script to profile the MLP kernels.
- src/: Source code and related files.
  - application_training/: Training application scripts.
    - train.py: Training script for the PyTorch MLP model to test performance.
  - kernels/: Different MLP kernel implementations.
    - naive_kernel.py: Basic MLP kernel implementation.
    - improved_kernel.py: Optimized MLP kernel.
    - tensor_kernel.py: MLP kernel using tensor operations.
  - model_weights/: Contains saved model weights.
    - model.pt: Trained model file.
  - results/: Profiling reports and result images.
    - batch_times_mem.png, batch_times_proc.png: Batch time analysis plots.
    - report.ncu-rep, report_1024.ncu-rep: Profiling reports.
    - time_taken_hidden.png, time_taken_hidden_proc.png: Hidden layer time analysis plots.
  - analysis.py: Script for analyzing results and plotting graphs.
  - config.yaml: Configuration file for the project.
  - inference.py: Script to run inference.
  - model.py: Defines the MLP model architecture.
  - profiler.py: Profiling utilities.
  - utils.py: Helper functions.
- tensor_kernel_correctness_check: Verifies that the output of tensor_kernel matches a reference NumPy implementation.
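
A correctness check of this kind can be sketched as follows. This is only an illustration, not the project's actual check: the function names `mlp_reference` and `outputs_match`, the ReLU activation, the `(in_features, out_features)` weight layout, and the tolerances are all assumptions made for the example. The kernel output is stood in for by a float64 run of the same computation.

```python
import numpy as np

def mlp_reference(x, weights, biases):
    """Reference forward pass (assumed: ReLU on hidden layers, linear output)."""
    h = x
    for i, (w, b) in enumerate(zip(weights, biases)):
        h = h @ w + b
        if i < len(weights) - 1:  # no activation after the last layer
            h = np.maximum(h, 0.0)
    return h

def outputs_match(candidate, reference, rtol=1e-3, atol=1e-4):
    """True when shapes agree and values match within the given tolerances."""
    if candidate.shape != reference.shape:
        return False
    return bool(np.allclose(candidate, reference, rtol=rtol, atol=atol))

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16)).astype(np.float32)
weights = [
    rng.standard_normal((16, 32)).astype(np.float32),
    rng.standard_normal((32, 4)).astype(np.float32),
]
biases = [np.zeros(32, dtype=np.float32), np.zeros(4, dtype=np.float32)]

reference = mlp_reference(x, weights, biases)

# Stand-in for a kernel result: same math in float64, cast back to float32.
candidate = mlp_reference(
    x.astype(np.float64),
    [w.astype(np.float64) for w in weights],
    [b.astype(np.float64) for b in biases],
).astype(np.float32)

print("match:", outputs_match(candidate, reference))
```

Comparing with a tolerance rather than exact equality matters here because a GPU kernel may accumulate sums in a different order or precision than NumPy does.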
- Python 3.x
- PyTorch
- CUDA Toolkit (if using GPU acceleration)
Clone the repository and navigate to the project directory:

    git clone https://github.com/yourusername/e4750-2024fall-project-nerf.git
    cd e4750-2024fall-project-nerf

Install the required Python packages:

    pip install -r requirements.txt

- Training:

      bash scripts/pytorch_train.sh

- Inference:

      bash scripts/infer.sh

- Profiling:

      bash scripts/profile.sh

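
For a rough idea of what a batch-size timing sweep looks like, here is a minimal CPU-only sketch. It does not reproduce `profile.sh` or `profiler.py`, whose contents are not shown here: the helper `time_forward`, the layer sizes, and the batch sizes are hypothetical, and a NumPy forward pass stands in for the kernels.

```python
import time
import numpy as np

def time_forward(forward, x, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    forward(x)  # warm-up run, excluded from the measurement
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        forward(x)
        best = min(best, time.perf_counter() - start)
    return best

rng = np.random.default_rng(0)
w1 = rng.standard_normal((64, 128)).astype(np.float32)  # hypothetical sizes
w2 = rng.standard_normal((128, 10)).astype(np.float32)

def forward(x):
    # Two-layer MLP with ReLU, used only as a stand-in workload.
    return np.maximum(x @ w1, 0.0) @ w2

timings = {}
for batch_size in (32, 256, 1024):
    x = rng.standard_normal((batch_size, 64)).astype(np.float32)
    timings[batch_size] = time_forward(forward, x)

for batch_size, seconds in timings.items():
    print(f"batch={batch_size:5d}  best={seconds * 1e3:.3f} ms")
```

When timing GPU kernels, the device must be synchronized before reading the clock (for example with `torch.cuda.synchronize()` in PyTorch), because kernel launches return before the work has finished.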
Results and analysis can be found in the src/results/ directory.
This project is part of the EECSE4750 course.