PyTorch C++ and CUDA extension for PACE's Piecewise Polynomial Approximation (PwPA), a Transformer non-linearities acceleration engine.
This extension integrates PwPA CUDA kernels for both AoS and SoA coefficient data layouts, using a simple unrolling technique.
More details here.
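To make the layouts concrete: AoS (array-of-structures) stores each partition's polynomial coefficients contiguously, while SoA (structure-of-arrays) groups all coefficients of the same degree together, which favors coalesced reads on the GPU. The sketch below is a plain-PyTorch reference of what PwPA computes with AoS coefficients; the tensor shapes, the lowest-degree-first ordering, and the helper name are assumptions for illustration, not the extension's actual API.

```python
import torch

def pwpa_reference(x, coeffs, partition_points):
    # Hypothetical PwPA reference with AoS coefficients:
    # coeffs has shape (num_partitions, degree + 1), lowest degree
    # first (ordering assumed). Each element of x is routed to its
    # partition and evaluated with Horner's rule.
    idx = torch.bucketize(x, partition_points).clamp(0, coeffs.shape[0] - 1)
    c = coeffs[idx]                          # (N, degree + 1)
    y = c[:, -1]
    for j in range(coeffs.shape[1] - 2, -1, -1):
        y = y * x + c[:, j]                  # Horner step
    return y

# SoA is the same data transposed: coeffs.t().contiguous() stores all
# degree-j coefficients together, one contiguous run per degree.
```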
Built with PyPA/build, but you can use pip or similar.
To build:
```
python -m build -n
```
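If you prefer not to install PyPA/build, pip can produce the wheel as well; this should be equivalent, though it is untested here:

```
pip wheel . -w dist
```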
To install:
```
pip install dist\<built_extension_file.whl>
```
To test:
```
python test\extension_test.py
python test\approximation_test.py
```
To use:
```python
import torch_pace
...
# base kernel
y = torch_pace.ops._pwpa(x, coeffs, partition_points, AoS=True)
# optimized kernel
y = torch_pace.ops.pwpa(x, coeffs, partition_points, AoS=True)
# AoS to SoA coefficients rearrangement
coeffs_soa = torch_pace.ops.aos2soa(coeffs, degree)
# optimized kernel with SoA coefficients' data structure
y = torch_pace.ops.pwpa(x, coeffs_soa, partition_points, AoS=False)
```
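Putting it together, here is a minimal end-to-end sketch that least-squares-fits a degree-2 polynomial per partition to torch.tanh and runs it through the kernel. The flattened AoS coefficient shape, the lowest-degree-first ordering, and torch.tanh as the target function are assumptions for illustration; adapt them to the layout the extension actually expects.

```python
import torch
import torch_pace

degree, parts = 2, 8
partition_points = torch.linspace(-4, 4, parts + 1, device="cuda")

# Fit one degree-2 polynomial per partition (AoS: each partition's
# coefficients stored contiguously -- layout assumed).
coeffs = []
for lo, hi in zip(partition_points[:-1], partition_points[1:]):
    xs = torch.linspace(lo.item(), hi.item(), 64)
    A = torch.stack([xs**j for j in range(degree + 1)], dim=1)
    c = torch.linalg.lstsq(A, torch.tanh(xs).unsqueeze(1)).solution
    coeffs.append(c.squeeze(1))
coeffs = torch.cat(coeffs).cuda()

x = torch.randn(1 << 20, device="cuda")
y = torch_pace.ops.pwpa(x, coeffs, partition_points, AoS=True)
print((y - torch.tanh(x)).abs().max())  # max approximation error

# Same computation with the SoA layout
coeffs_soa = torch_pace.ops.aos2soa(coeffs, degree)
y_soa = torch_pace.ops.pwpa(x, coeffs_soa, partition_points, AoS=False)
```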
> [!IMPORTANT]
> Requirements:
> - torch>=2.4 with CUDA enabled (mine is 2.5.1+cu118)
> - CUDA toolkit (mine is 11.7)
> - Python>=3.8 (mine is 3.12.8)
This is the output of running approximation_test.py:
> [!NOTE]
> approximation_test.py uses a simple uniform partitioning, which divides the X-value range into equal parts.
> More sophisticated partitioning strategies may account for slope trends, yielding more accurate approximations where the function changes more rapidly.
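As a sketch of what such a strategy could look like (not part of the extension; the function name is hypothetical), boundaries can be placed so that each partition covers an equal share of the function's total absolute slope:

```python
import torch

def slope_aware_partitions(f, lo, hi, n, samples=1024):
    # Hypothetical alternative to uniform partitioning: put more
    # (narrower) partitions where |f'| is large, fewer where f is flat.
    xs = torch.linspace(lo, hi, samples)
    slope = torch.gradient(f(xs), spacing=(xs,))[0].abs()
    cdf = torch.cumsum(slope, dim=0)
    cdf = cdf / cdf[-1]                     # normalized slope "mass"
    targets = torch.linspace(0.0, 1.0, n + 1)
    idx = torch.searchsorted(cdf, targets).clamp(max=samples - 1)
    return xs[idx]                          # n + 1 boundary points
```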
A brief list of things to do or fix in this extension:
- PyTorch Half type support
- Extension Benchmark on non-linearities in plain CUDA code
- Extension Benchmark on PyTorch non-linearities
- ILP (Instruction-Level Parallelism) integration
- aos2soa function
- soa2aos function
- CUDA SIMD intrinsics analysis for float16 (PyTorch Half) type
- PyTorch neural net example
Extension backbone inspired by this tutorial.
Marco Sangiorgi
© 2025