This repository contains the code accompanying the paper:
Paolo Gorlani, Christian Plessl, High Level Synthesis Implementation of a Three-dimensional Systolic Array Architecture for Matrix Multiplications on Intel Stratix 10 FPGAs. (http://arxiv.org/abs/2110.11521)
- `fpga-src` contains the FPGA code.
- `host-src` and `host-includes` contain the host code.
- `design-*` contains
  - `config.h`, the design configuration file,
  - `aocr_report.tar.gz`, the report generated by the creation of the `.aocr`,
  - `acl_quartus_report_s<seed>.txt`, the report containing the fmax obtained by the design synthesis, or
  - `fitter-failed-output-s<seed>`, the output in case of failed fitting,
  - `multiplication-output-<dim2i>-<dim2j>-<dim2k>`, the output of a matrix multiplication for the best fmax design.
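Since each design directory can hold several `acl_quartus_report_s<seed>.txt` files, a small helper can pick out the report with the highest fmax. The following is a minimal sketch, not part of the repository: the function name `best_fmax` is hypothetical, and it assumes that each report contains a line of the form `Kernel fmax: <number>`. Check one of your reports and adjust the pattern if the format differs.

```shell
# Hypothetical helper (not part of the repository).
# Prints "<fmax> <report file>" for the report with the highest fmax in directory $1.
# ASSUMPTION: each report contains a line of the form "Kernel fmax: <number>".
best_fmax() {
    for f in "$1"/acl_quartus_report_s*.txt; do
        # Skip the unexpanded glob when no report exists.
        [ -f "$f" ] || continue
        # The fmax value is the last field of the matching line.
        awk -v file="$f" '/Kernel fmax:/ { print $NF, file }' "$f"
    done | sort -k1,1nr | head -n 1
}

# Example: inspect the reports in the current design directory.
best_fmax .
```

Failed fits produce `fitter-failed-output-s<seed>` instead of a report, so those seeds are simply skipped by the glob.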
NOTE: set `PLATFORM_ID` and `DEVICE_ID` in `host-src/host.cpp` in order to target your FPGA accelerator.
$ # enter the desired design directory
$ cd design-<id>
$ # generate krnl_systo.aocr
$ make aocr
... (can take several hours)
$ # generate the aocx file using the seed that gave the best fmax on our system (see below)
$ make aocx
$ # or generate the aocx file using <seed> as the seed
$ make aocx SEED=<seed>
... (can take several hours)
$ # compile the host code
$ make host
$ # use the design to multiply two matrices of sizes (dim2i,dim2k) and (dim2k,dim2j)
$ ./host <dim2i> <dim2j> <dim2k> krnl_systo-s<seed>.aocx
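Because the achieved fmax depends on the seed, it can be convenient to try several seeds in a row. The sketch below is only an illustration: the function name `seed_sweep` and the seed values are hypothetical, and the only thing taken from this repository is the `make aocx SEED=<seed>` target. It prints the commands as a dry run rather than executing them, since each build can take several hours.

```shell
# Hypothetical helper (not part of the repository).
# Prints one build command per seed given as an argument (dry run).
seed_sweep() {
    for seed in "$@"; do
        echo "make aocx SEED=${seed}"
    done
}

# Example seeds chosen arbitrarily for illustration.
seed_sweep 1 2 3
```

To actually run the builds from inside a design directory, pipe the output to a shell, for example `seed_sweep 1 2 3 | sh`.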
The fmax results in the paper/repository were obtained with a BittWare 520N Stratix 10 accelerator card and the following software versions:
$ uname --all
Linux fpga-0002 3.10.0-1160.15.2.el7.x86_64 #1 SMP Wed Feb 3 15:06:38 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
$ quartus_cmd -v
Quartus Prime
Version 19.4.0 Build 64 12/04/2019 SC Pro Edition
Copyright (C) 2019 Intel Corporation. All rights reserved.
$ aoc -version
Intel(R) FPGA SDK for OpenCL(TM), 64-Bit Offline Compiler
Version 20.4.0 Build 72 Pro Edition
Copyright (C) 2020 Intel Corporation
$ aocl version
aocl 20.4.0.72 (Intel(R) FPGA SDK for OpenCL(TM), Version 20.4.0 Build 72 Pro Edition, Copyright (c) 2020 Intel Corporation)