Optimizing MLP Kernels

This project involves implementing various versions of a Multilayer Perceptron (MLP) kernel to improve execution time while maintaining accuracy.

Project Structure

resources/
scripts/
- infer.sh
- profile.sh
- pytorch_train.sh
src/
- application_training/
  - train.py
- kernels/
  - __init__.py
  - improved_kernel.py
  - naive_kernel.py
  - tensor_kernel.py
- model_weights/
  - model.pt
- results/
  - batch_times_mem.png
  - batch_times_proc.png
  - report_1024.ncu-rep
  - report.ncu-rep
  - time_taken_hidden_proc.png
  - time_taken_hidden.png
- analysis.py
- config.yaml
- inference.py
- model.py
- profiler.py
- utils.py
.gitignore
ReadME.md
presentation.pdf
report.pdf
requirements.txt
tensor_kernel_correctness_check.ipynb

Description

The goal is to reduce execution time across successive MLP kernel implementations (naive, improved, and tensor-based) without sacrificing accuracy.

Directories and Files

scripts/: Contains shell scripts for training, inference, and profiling.
- pytorch_train.sh: Script to train the PyTorch MLP model used for performance testing.
- infer.sh: Script to perform inference using the trained model.
- profile.sh: Script to profile the MLP kernels.
src/: Source code and related files.
- application_training/: Training application scripts.
  - train.py: Training script for the PyTorch MLP model used for performance testing.
- kernels/: Different MLP kernel implementations.
  - naive_kernel.py: Basic MLP kernel implementation.
  - improved_kernel.py: Optimized MLP kernel.
  - tensor_kernel.py: MLP kernel using tensor operations.
- model_weights/: Contains saved model weights.
  - model.pt: Trained model file.
- results/: Profiling reports and result images.
  - batch_times_mem.png, batch_times_proc.png: Batch time analysis plots.
  - report.ncu-rep, report_1024.ncu-rep: NVIDIA Nsight Compute profiling reports.
  - time_taken_hidden.png, time_taken_hidden_proc.png: Hidden layer time analysis plots.
- analysis.py: Script for analyzing results and plotting graphs.
- config.yaml: Configuration file for the project.
- inference.py: Script to run inference.
- model.py: Defines the MLP model architecture.
- profiler.py: Profiling utilities.
- utils.py: Helper functions.
tensor_kernel_correctness_check.ipynb: Notebook that verifies the output of tensor_kernel against a reference NumPy implementation.
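
The correctness check is not reproduced in this README. As a minimal sketch of how such a check might look, the example below compares a kernel's output against a NumPy reference forward pass within a floating-point tolerance. The function names, the ReLU activation, the layer widths, the (in, out) weight layout, and the tolerances are all assumptions made for illustration, not the project's actual code.

```python
import numpy as np


def reference_mlp(x, weights, biases):
    """NumPy reference forward pass (assumed: ReLU on hidden layers, linear output)."""
    h = x
    for i, (w, b) in enumerate(zip(weights, biases)):
        h = h @ w + b  # assumed weight layout: (in_features, out_features)
        if i < len(weights) - 1:  # no activation on the final layer
            h = np.maximum(h, 0.0)
    return h


def outputs_match(candidate, reference, rtol=1e-4, atol=1e-5):
    """Return (ok, max_abs_err) for a kernel output against the reference."""
    candidate = np.asarray(candidate, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    if candidate.shape != reference.shape:
        return False, float("inf")
    max_abs_err = float(np.max(np.abs(candidate - reference)))
    ok = bool(np.allclose(candidate, reference, rtol=rtol, atol=atol))
    return ok, max_abs_err


rng = np.random.default_rng(0)
sizes = [8, 16, 4]  # hypothetical layer widths: input, hidden, output
weights = [
    rng.standard_normal((m, n)).astype(np.float32)
    for m, n in zip(sizes[:-1], sizes[1:])
]
biases = [rng.standard_normal(n).astype(np.float32) for n in sizes[1:]]
x = rng.standard_normal((32, sizes[0])).astype(np.float32)

reference = reference_mlp(x, weights, biases)
# Stand-in for a GPU kernel result: the reference plus a tiny offset.
kernel_like = reference + 1e-6
ok, err = outputs_match(kernel_like, reference)
print(ok, err)
```

In practice, `kernel_like` would be replaced by the array copied back from the GPU kernel. Reporting the maximum absolute error alongside the pass/fail flag makes it easier to tell rounding differences from real bugs.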

Getting Started

Prerequisites

Python 3.x
PyTorch
CUDA Toolkit (if using GPU acceleration)

Installation

Clone the repository and navigate to the project directory:

git clone https://github.com/yourusername/e4750-2024fall-project-nerf.git
cd e4750-2024fall-project-nerf

Install the required Python packages:

pip install -r requirements.txt

Running the Project

Training:
```
bash scripts/pytorch_train.sh
```
Inference:
```
bash scripts/infer.sh
```
Profiling:
```
bash scripts/profile.sh
```
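
The contents of profile.sh and profiler.py are not shown in this README. As a rough host-side sketch of how per-batch timings could be collected, the example below takes the median wall-clock time of a callable over several runs after a warm-up. `time_callable` and `dummy_forward` are hypothetical names, and the batch sizes are illustrative. Because CUDA kernel launches are asynchronous with respect to the host, timing a real GPU kernel this way would also need a device synchronization before the timer is stopped.

```python
import statistics
import time


def time_callable(fn, *args, warmup=3, repeats=10):
    """Return the median wall-clock time of fn(*args) in milliseconds."""
    for _ in range(warmup):  # warm-up runs absorb one-time setup costs
        fn(*args)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


def dummy_forward(batch_size, hidden=64):
    """Hypothetical stand-in workload; replace with a real kernel call."""
    return sum(i * j for i in range(batch_size) for j in range(hidden))


batch_times = {b: time_callable(dummy_forward, b) for b in (16, 64, 256)}
for batch_size, ms in batch_times.items():
    print(f"batch={batch_size:4d}  median={ms:.3f} ms")
```

The median is used rather than the mean so that a single slow run (for example, one interrupted by the scheduler) does not skew the result.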

Results

Results and analysis can be found in the src/results/ directory.

License

This project is part of the EECSE4750 course.
