PULP-NN Mixed is an optimized library for sub-byte operands, typically used where the hardware natively supports operands no smaller than INT8. It is explained in detail in Bruschi et al. [arXiv:2007.07759]. If you intend to use or reference PULP-NN Mixed in an academic publication, please consider citing it:
@inproceedings{10.1145/3387902.3394038,
author = {Bruschi, Nazareno and Garofalo, Angelo and Conti, Francesco and Tagliavini, Giuseppe and Rossi, Davide},
title = {Enabling Mixed-Precision Quantized Neural Networks in Extreme-Edge Devices},
year = {2020},
isbn = {9781450379564},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3387902.3394038},
doi = {10.1145/3387902.3394038},
booktitle = {Proceedings of the 17th ACM International Conference on Computing Frontiers},
pages = {217–220},
numpages = {4},
keywords = {embedded systems, quantized neural network, low power architectures},
location = {Catania, Sicily, Italy},
series = {CF ’20}
}
The library is organized as follows:
- The 32bit and 64bit directories refer to the precision of the batch normalization parameters;
- To use the library, include the header files under the include directory in your QNN inference code. They are pulp_nn_kernels.h and pulp_nn_utils.h, which contain every kernel and utility function of the PULP-NN Mixed library. This directory and the files it contains are generated by pulp_nn_kernels_generator.py;
- The src directory contains every computational kernel and is generated by pulp_nn_kernels_generator.py;
- The scripts directory contains the templates and the files needed to generate the code of every kernel, header and example;
- The test directory is generated by pulp_nn_examples_generator.py and contains a complete setup to run a test with some kernels of the library.
If you want to use the pre-generated src and include files, you do not need any other installation.
If you want to use the features described above and contained in the scripts directory, the following are strictly required:
- python3
- torch
- numpy
- Mako
If you have not installed them yet, please do so in order to get more out of PULP-NN, such as generating tests for every kernel and modifying the templates to generate your own custom kernels.
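Before running the generators, a quick check that the required packages are importable can save time. The following is a minimal sketch, not part of PULP-NN: the helper name `missing_modules` is hypothetical, and it assumes the packages are importable under the names `torch`, `numpy` and `mako`.

```python
import importlib.util

# Import names of the required packages (Mako is imported as "mako").
REQUIRED_MODULES = ("torch", "numpy", "mako")


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names in `modules` that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are available.")
```

If any package is reported as missing, install it with your usual Python package manager before running the scripts.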
To use the library in an existing project, you can copy the sources and headers that are already generated in the src and include directories.
If you want to test the library sources, you can generate the whole setup (pulp-sdk based) and the golden models by running, from the repository root:
> cd scripts
> python3 pulp_nn_examples_generator.py
To select the kernels to test, open scripts/setup.py and follow the instructions. You can test either a single kernel per type or the whole set of kernels per type (pointwise convolution, depthwise convolution, linear with 32-bit output precision and linear with sub-byte output precision).
Then, you can run the simulation on your favorite target architecture by running, from the repository root:
> cd test
> make clean all run cores=NUM_CORES kernel=KERNEL platform=PLATFORM
Here, NUM_CORES is the number of cores that you want to use (1 by default) and KERNEL is the precision configuration of the kernel that you want to test (888 or 88 by default; every permutation is already included).
For example, with make clean all run cores=8 kernel=888 (and pointwise selected in scripts/pulp_nn_examples_generator.py) you will see the results of the pointwise kernel with 8-bit inputs, 8-bit outputs and 8-bit weights (in this order), computed in a cluster execution with 8 cores. Note that for linear kernels with 32-bit output precision, KERNEL only gives the input and weight precisions, so it can be 88, 84, 82 and so on.
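The KERNEL naming convention can be illustrated with a short Python sketch. It is not part of PULP-NN: the helper names `decode_kernel` and `all_kernels` and the dictionary keys are hypothetical, and the sketch assumes that the available precisions are 8, 4 and 2 bits.

```python
from itertools import product

# Assumption: operands come in 8-, 4- or 2-bit precision.
PRECISIONS = (8, 4, 2)


def decode_kernel(kernel):
    """Split a KERNEL string such as '888' or '84' into per-operand bit widths.

    Three digits: inputs, outputs, weights (in this order).
    Two digits: inputs, weights (linear kernels with 32-bit outputs).
    """
    bits = [int(ch) for ch in kernel]
    if any(b not in PRECISIONS for b in bits):
        raise ValueError(f"unsupported precision in {kernel!r}")
    if len(bits) == 3:
        return {"inputs": bits[0], "outputs": bits[1], "weights": bits[2]}
    if len(bits) == 2:
        return {"inputs": bits[0], "outputs": 32, "weights": bits[1]}
    raise ValueError(f"expected 2 or 3 digits, got {kernel!r}")


def all_kernels(n_operands):
    """List every precision combination for `n_operands` operands."""
    return ["".join(str(b) for b in combo)
            for combo in product(PRECISIONS, repeat=n_operands)]
```

Under these assumptions, `decode_kernel("842")` describes a kernel with 8-bit inputs, 4-bit outputs and 2-bit weights, and `all_kernels(2)` starts with 88, 84, 82, matching the sequence given for linear kernels with 32-bit outputs.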
You can modify either the generated kernel sources or the templates used to generate them, which are in scripts/templates. If you change the templates, you can regenerate the kernels by running, from the repository root:
> cd scripts
> python3 pulp_nn_kernels_generator.py
First, clone the repository on your workstation:
> git clone https://github.com/pulp-platform/pulp-nn.git
Now you have your local copy of the repository.
Then, you should build the SDK (as described in its README), targeting an architecture and a platform.
For example, if you want to try the pulp-open architecture on the virtual platform (gvsoc), type the following from the pulp-sdk root:
> export PULP_RISCV_GCC_TOOLCHAIN=<toolchain_path>
> source configs/pulp.sh
> source configs/platform-gvsoc.sh
> make all
> source pkg/sdk/dev/sourceme.sh
Now you can compile and run applications on your favorite platform.
For example, if you want to try convolutional kernels, modify the layer parameters in scripts/setup.py (such as H, W and channels), set the type of kernel to 'pointwise', and then generate the test folder by running, from the scripts folder:
> python3 pulp_nn_examples_generator.py
> cd ../32bit/test
> make clean all run cores=NUM_CORES kernel=KERNEL platform=PLATFORM
These are the same commands described above.
The test folder contains everything you need to run the example: headers and sources are copied into it, and the Makefile and main are generated.
- Nazareno Bruschi, University of Bologna, email
- Angelo Garofalo, University of Bologna, email
- Alessio Burrello, University of Bologna, email
- Francesco Conti, University of Bologna, email
- Giuseppe Tagliavini, University of Bologna, email
- Manuele Rusci, University of Bologna, email
- Davide Rossi, University of Bologna, email
- Some kernels are missing from this version compared to the 8bit directory (add, maxpool and avgpool, pointwise);
- Tests for 64bit batch normalization parameters are not supported yet;
- The golden models generator is a first version and could print strange (but not wrong) output results. If they do not suit your purpose, you can tune the seed parameter in the golden function, which is defined in scripts/test_gen.py, and repeat the example generation step;
- The number of channels of input/output feature maps must be a multiple of 2 for INT4 precision and of 4 for INT2.
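The channel constraint in the last item can be checked before generating a test. The following is a minimal sketch, not part of PULP-NN: the helper name `channels_supported` is hypothetical, and it only covers the two precisions for which a rule is stated here.

```python
# Required channel multiples stated above: 2 for INT4, 4 for INT2.
# No rule is given here for other precisions, so they are rejected.
CHANNEL_MULTIPLE = {4: 2, 2: 4}


def channels_supported(channels, bits):
    """Return True if `channels` satisfies the constraint for `bits`-bit operands."""
    if bits not in CHANNEL_MULTIPLE:
        raise ValueError(f"no channel rule known for {bits}-bit precision")
    return channels % CHANNEL_MULTIPLE[bits] == 0
```

For instance, 6 channels satisfy the INT4 rule but not the INT2 one, so a layer with 6 channels would need to be padded to 8 before using INT2 feature maps.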