Title | LUT/ALM | DSP | Year | Platform | Frequency (MHz) | Throughput (GOPs) | Power (W) | Energy Efficiency (GOPs/W) | Model | W_precision | A_precision |
---|---|---|---|---|---|---|---|---|---|---|---|
Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks | 186251 | 2240 | 2015 | VC707 | 100 | 61.62 | 18.61 | 3.31112 | AlexNet | FP-32 | FP-32 |
Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks | -- | -- | 2016 | P395-D8 | 120 | 117.8 | 19.1 | 6.16754 | VGG-16 | INT-8 | INT-16 |
Automatic Code Generation of Convolutional Neural Networks in FPGA Implementation | -- | -- | 2016 | VC709 | 100 | 222.1 | 24.8 | 8.95565 | AlexNet | INT-8 | INT-16 |
Going Deeper with Embedded FPGA Platform for Convolutional Neural Network | 182616 | 780 | 2016 | ZC706 | 150 | 136.97 | 9.63 | 14.2233 | VGG-16 | INT-16 | INT-16 |
A High Performance FPGA-based Accelerator for Large-Scale Convolutional Neural Network | -- | -- | 2016 | VC709 | 156 | 565.94 | 30.2 | 18.7397 | AlexNet | INT-16 | INT-16 |
Energy-Efficient CNN Implementation on a Deeply Pipelined FPGA Cluster | -- | -- | 2016 | ZC706+6*VC709 | 150 | 1280.3 | 160 | 8.00188 | VGG-16 | INT-16 | INT-16 |
Energy-Efficient CNN Implementation on a Deeply Pipelined FPGA Cluster | -- | -- | 2016 | ZC706+4*VC709 | 150 | 825.6 | 126 | 6.55238 | AlexNet | INT-16 | INT-16 |
Energy-Efficient CNN Implementation on a Deeply Pipelined FPGA Cluster | -- | -- | 2016 | ZC706+4*VC709 | 150 | 128.8 | 126 | 1.02222 | AlexNet | INT-16 | INT-16 |
Energy-Efficient CNN Implementation on a Deeply Pipelined FPGA Cluster | -- | -- | 2016 | ZC706+VC709 | 150 | 290 | 35 | 8.28571 | VGG-16 | INT-16 | INT-16 |
Energy-Efficient CNN Implementation on a Deeply Pipelined FPGA Cluster | -- | -- | 2016 | ZC706+VC709 | 150 | 203.9 | 35 | 5.82571 | VGG-16 | INT-16 | INT-16 |
CirCNN: Accelerating and Compressing Deep Neural Networks Using Block-Circulant Weight Matrices | -- | -- | 2017 | Cyclone V 5CEA9 | 100 | 400 | 0.44 | 909.091 | AlexNet | INT-16 | INT-16 |
F-C3D: FPGA-based 3-Dimensional Convolutional Neural Network | -- | -- | 2017 | ZC706 | 176 | 144.5 | 9.7 | 14.8969 | C3D | INT-16 | INT-16 |
A Fully Connected Layer Elimination for a Binarized Convolutional Network on an FPGA | -- | -- | 2017 | Zedboard | 143 | 329.47 | 2.3 | 143.248 | VGG11 | Binary | Binary |
Accelerating Low Bit-Width Convolutional Neural Networks With Embedded FPGA | -- | -- | 2017 | Zynq XC7Z020 | 200 | 410.22 | 2.26 | 181.513 | DoReFa-Net | Binary | INT-2 |
FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates | 42349 | 1036 | 2017 | Stratix-V GSMD5 | 150 | 364.36 | 25 | 14.5744 | VGG-19 | INT-16 | INT-16 |
FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates | 164100 | 264 | 2017 | Stratix-V GSMD5 | 150 | 81 | 25 | 3.24 | VGG-19 | FxP-32 | FxP-32 |
FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates | 42349 | 1036 | 2017 | Stratix-V GSMD5 | 150 | 315.85 | 25 | 12.634 | LSTM-LM | INT-16 | INT-16 |
FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates | 164100 | 264 | 2017 | Stratix-V GSMD5 | 150 | 86 | 25 | 3.44 | LSTM-LM | FxP-32 | FxP-32 |
FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates | 42349 | 1036 | 2017 | Stratix-V GSMD5 | 150 | 226.47 | 25 | 9.0588 | ResNet-152 | INT-16 | INT-16 |
FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates | 164100 | 264 | 2017 | Stratix-V GSMD5 | 150 | 73 | 25 | 2.92 | ResNet-152 | FxP-32 | FxP-32 |
Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs | 600000 | 2520 | 2017 | ZCU102 | 200 | 2940.7 | 23.6 | 124.606 | VGG-16 | INT-16 | INT-16 |
Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs | 600000 | 2520 | 2017 | ZCU102 | 200 | 854.6 | 23.6 | 36.2119 | AlexNet | INT-16 | INT-16 |
Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks | 161000 | 1518 | 2017 | Arria 10 GX1150 | 150 | 645.25 | 21.2 | 30.4363 | VGG-16 | INT-8 | INT-16 |
Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network | -- | 2756 | 2017 | Arria 10 GX1150 | 385 | 1790 | 37.46 | 47.7843 | VGG-16 | INT-16 | INT-16 |
Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network | -- | 1320 | 2017 | Arria 10 GX1150 | 370 | 866 | 41.73 | 20.7525 | VGG-16 | FxP-32 | FxP-32 |
ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA | 293920 | 1504 | 2017 | KU060 | 200 | 282.2 | 41 | 6.88293 | LSTM | INT-12 | INT-16 |
Exploring Heterogeneous Algorithms for Accelerating Deep Convolutional Neural Networks on FPGAs | 155886 | 824 | 2017 | ZC706 | 100 | 229.5 | 9.4 | 24.4149 | VGG-16 | INT-16 | INT-16 |
An OpenCL Deep Learning Accelerator on Arria 10 | 246000 | 1476 | 2017 | Arria 10 GX1150 | 303 | 1382 | 45 | 30.7111 | AlexNet | FxP-16 | FxP-16 |
Fast and Efficient Implementation of Convolutional Neural Networks on FPGA | 196370 | 256 | 2017 | E5-2600+Stratix V | 200 | 229 | 8.04 | 28.4826 | VGG-16 | INT-32 | INT-32 |
Frequency Domain Acceleration of Convolutional Neural Networks on CPU-FPGA Shared Memory System | 200522 | 224 | 2017 | E5-2600+Stratix V | 200 | 123.5 | 13.18 | 9.37026 | VGG-16 | FP-32 | FP-32 |
Frequency Domain Acceleration of Convolutional Neural Networks on CPU-FPGA Shared Memory System | 200522 | 224 | 2017 | E5-2600+Stratix V | 200 | 83 | 13.18 | 6.29742 | AlexNet | FP-32 | FP-32 |
FPGA-based Accelerator for Long Short-Term Memory Recurrent Neural Networks | 198280 | 1176 | 2017 | VC707 | 150 | 7.26 | 19.63 | 0.369842 | LSTM-RNN | FP-32 | FP-32 |
Maximizing CNN accelerator efficiency through resource partitioning | 133854 | 3494 | 2017 | Xilinx Virtex-7 FPGA 690T | 170 | 909.7 | 7.2 | 126.347 | SqueezeNet | INT-16 | INT-16 |
Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs | 46900 | 3 | 2017 | Zynq-7000 XC7Z020 | 143 | 207.8 | 4.7 | 44.2128 | BNN | Binary | Binary |
A 7.663-TOPS 8.2-W Energy-efficient FPGA Accelerator for Binary Convolutional Neural Networks | 342126 | 1096 | 2017 | Virtex 7 | 90 | 7663 | 8.2 | 934.512 | BNN | Binary | Binary |
Algorithm-Hardware Co-Design of Single Shot Detector for Fast Object Detection on FPGAs | 175000 | 1518 | 2018 | Arria 10 GX1150 | 240 | 1032 | 40 | 25.8 | SSD300 | INT-8 | INT-16 |
Algorithm-Hardware Co-Design of Single Shot Detector for Fast Object Detection on FPGAs | 532000 | 4363 | 2018 | Arria 10 GX2800 | 300 | 2178 | 100 | 21.78 | SSD300 | INT-8 | INT-16 |
VIBNN: Hardware Acceleration of Bayesian Neural Networks | 98006 | 342 | 2018 | Cyclone V | -- | 127.8 | 6.1 | 20.9508 | MNIST (RLF-Based) | Binary | Binary |
VIBNN: Hardware Acceleration of Bayesian Neural Networks | 91126 | 342 | 2018 | Cyclone V | -- | 127.8 | 8.52 | 15 | MNIST (BNNWallace) | Binary | Binary |
FBNA: A Fully Binarized Neural Network Accelerator | 29600 | 0 | 2018 | ZC702 | -- | 2236 | 3.2 | 698.75 | SVHN | Binary | Binary |
FBNA: A Fully Binarized Neural Network Accelerator | 29600 | 0 | 2018 | ZC702 | -- | 722 | 3.3 | 218.788 | CIFAR-10 | Binary | Binary |
RNA: An Accurate Residual Network Accelerator for Quantized and Reconstructed Deep Neural Networks | 203000 | 0 | 2018 | ZC706 | 150 | 687.78 | 10.56 | 65.1307 | AlexNet | INT-4 | INT-8 |
RNA: An Accurate Residual Network Accelerator for Quantized and Reconstructed Deep Neural Networks | 203000 | 0 | 2018 | ZC706 | 150 | 878.11 | 10.56 | 83.1544 | VGG-16 | INT-4 | INT-8 |
RNA: An Accurate Residual Network Accelerator for Quantized and Reconstructed Deep Neural Networks | 203000 | 0 | 2018 | ZC706 | 150 | 804.03 | 10.56 | 76.1392 | ResNet-50 | INT-4 | INT-8 |
A Novel Low-Communication Energy Efficient Reconfigurable CNN Acceleration Architecture for Embedded Systems | 156859 | 612 | 2018 | ZC706 | 150 | 1249.7 | 9.82 | 127.261 | VGG-16 | INT-8 | INT-8 |
A Novel Low-Communication Energy Efficient Reconfigurable CNN Acceleration Architecture for Embedded Systems | 156859 | 612 | 2018 | ZC706 | 150 | 685.6 | 9.76 | 70.2459 | AlexNet | INT-8 | INT-8 |
A Novel Low-Communication Energy Efficient Reconfigurable CNN Acceleration Architecture for Embedded Systems | 156859 | 612 | 2018 | ZC706 | 150 | 507.2 | 9.72 | 52.1811 | ResNet-50 | INT-8 | INT-8 |
A Design Flow of Accelerating Hybrid Extremely Low Bit-width Neural Network in Embedded FPGA | 105673 | 880 | 2018 | ZC706 | 200 | 1972 | 4.2 | 469.524 | AlexNet | hybrid int | INT-4 |
A Design Flow of Accelerating Hybrid Extremely Low Bit-width Neural Network in Embedded FPGA | 103505 | 550 | 2018 | ZC706 | 200 | 1233 | 4.1 | 300.732 | AlexNet | hybrid int | INT-8 |
A Design Flow of Accelerating Hybrid Extremely Low Bit-width Neural Network in Embedded FPGA | 124317 | 783 | 2018 | ZC706 | 200 | 2530 | 4.8 | 527.083 | AlexNet | hybrid int | INT-4 |
Towards a Uniform Template-based Architecture for Accelerating 2D and 3D CNNs on FPGA | 242000 | 1536 | 2018 | VC709 | 150 | 430.7 | 25 | 17.228 | C3D | INT-16 | INT-16 |
Towards a Uniform Template-based Architecture for Accelerating 2D and 3D CNNs on FPGA | 209000 | 1536 | 2018 | VUS440 | 200 | 784.7 | 26 | 30.1808 | C3D | INT-16 | INT-16 |
Angel-Eye: A Complete Design Flow for Mapping CNN onto Embedded FPGA | 29867 | 190 | 2018 | XC7z020 | 214 | 84.3 | 3.5 | 24.0857 | VGG-16 | INT-8 | INT-8 |
Angel-Eye: A Complete Design Flow for Mapping CNN onto Embedded FPGA | 85172 | 900 | 2018 | XC7z045 | 150 | 137 | 9.63 | 14.2264 | VGG-16 | INT-16 | INT-16 |
Exploration of Low Numeric Precision Deep Learning Inference Using Intel FPGAs | -- | -- | 2018 | Arria 10 GX 1150 | 275 | 7000 | 75 | 93.3333 | ResNet-34 1x-wide | FP-32 | FP-32 |
Exploration of Low Numeric Precision Deep Learning Inference Using Intel FPGAs | -- | -- | 2018 | Arria 10 GX 1150 | 275 | 8000 | 75 | 106.667 | ResNet-34 1x-wide | INT-8 | INT-8 |
Exploration of Low Numeric Precision Deep Learning Inference Using Intel FPGAs | -- | -- | 2018 | Arria 10 GX 1150 | 275 | 43000 | 75 | 573.333 | ResNet-34 1x-wide | Ternary | INT-8 |
Exploration of Low Numeric Precision Deep Learning Inference Using Intel FPGAs | -- | -- | 2018 | Arria 10 GX 1150 | 275 | 52000 | 75 | 693.333 | ResNet-34 1x-wide | Binary | INT-8 |
Exploration of Low Numeric Precision Deep Learning Inference Using Intel FPGAs | -- | -- | 2018 | Arria 10 GX 1150 | 275 | 18000 | 75 | 240 | ResNet-34 1x-wide | INT-4 | INT-4 |
Exploration of Low Numeric Precision Deep Learning Inference Using Intel FPGAs | -- | -- | 2018 | Arria 10 GX 1150 | 275 | 51000 | 75 | 680 | ResNet-34 1x-wide | INT-3 | INT-3 |
Exploration of Low Numeric Precision Deep Learning Inference Using Intel FPGAs | -- | -- | 2018 | Arria 10 GX 1150 | 275 | 85000 | 75 | 1133.33 | ResNet-34 1x-wide | INT-2 | INT-2 |
Exploration of Low Numeric Precision Deep Learning Inference Using Intel FPGAs | -- | -- | 2018 | Arria 10 GX 1150 | 275 | 98000 | 75 | 1306.67 | ResNet-34 1x-wide | Ternary | INT-2 |
Exploration of Low Numeric Precision Deep Learning Inference Using Intel FPGAs | -- | -- | 2018 | Arria 10 GX 1150 | 275 | 267000 | 75 | 3560 | ResNet-34 1x-wide | Binary | Binary |
An Asynchronous Energy-Efficient CNN Accelerator with Reconfigurable Architecture | -- | -- | 2018 | VC707 | -- | 20.3 | 0.676 | 30.0296 | LeNet-5 | INT-16 | INT-16 |
DeltaRNN: A Power-efficient Recurrent Neural Network Accelerator | 277440 | 2020 | 2018 | Zynq-7100 XC7Z100 | 125 | 1198 | 7.3 | 164.11 | GRU-RNN | INT-16 | INT-16 |
Accelerator Design with Effective Resource Utilization for Binary Convolutional Neural Networks on an FPGA | 61000 | 0 | 2018 | XCVU190 | 240 | 3756 | 5.9 | 636.61 | BNN | Binary | Binary |
A PYNQ-based Framework for Rapid CNN Prototyping | -- | -- | 2018 | XC7Z020 | -- | 2.56 | 1.896 | 1.35021 | CNN | INT-8 | INT-8 |
Shortcut Mining: Exploiting Cross-layer Shortcut Reuse in DCNN Accelerators | 261096 | 2800 | 2019 | Virtex-7 485T | 150 | 608.28 | 21.64 | 28.1091 | ResNet-152 | INT-16 | INT-16 |
E-RNN: Design Optimization for Efficient Recurrent Neural Networks in FPGAs | 496958 | 3435 | 2019 | ADM-PCIE-7V3 | 200 | 34529 | 24 | 1438.71 | LSTM on TIMIT (FFT8) | INT-12 | INT-12 |
E-RNN: Design Optimization for Efficient Recurrent Neural Networks in FPGAs | 411714 | 2866 | 2019 | ADM-PCIE-7V3 | 200 | 54943 | 25 | 2197.72 | LSTM on TIMIT (FFT16) | INT-12 | INT-12 |
Synetgy: Algorithm-hardware Co-design for ConvNet Accelerators on Embedded FPGAs | 24130 | 37 | 2019 | Ultra96 | 250 | 47.09 | 5.5 | 8.56182 | DiracDeltaNet | INT-4 | INT-4 |
REQ-YOLO: A Resource-Aware, Efficient Quantization Framework for Object Detection on FPGAs | 637671 | 3456 | 2019 | ADM-PCIE-7V3 | 200 | 1967 | 21 | 93.6667 | YOLO tiny v2 | FxP-6 | FxP-6 |
Efficient and Effective Sparse LSTM on FPGA with Bank-Balanced Sparsity | 289000 | 1518 | 2019 | Arria 10 GX1150 | 200 | 2432.8 | 19.1 | 127.372 | LSTM | INT-16 | INT-16 |
Cloud-DNN: An Open Framework for Mapping DNN Models to Cloud FPGAs | 1512810 | 5349 | 2019 | VCU118 | 214 | 1828.61 | 49.25 | 37.1291 | VGG16 | INT-16 | INT-16 |
Automatic Compiler Based FPGA Accelerator for CNN Training | 208000 | 1699 | 2019 | Stratix 10 GX | 240 | 163 | 20.64 | 7.89729 | CIFAR-10; '1X'CNN | INT-16 | INT-16 |
Automatic Compiler Based FPGA Accelerator for CNN Training | 415000 | 3363 | 2019 | Stratix 10 GX | 240 | 282 | 32.83 | 8.5897 | CIFAR-10; '2X'CNN | INT-16 | INT-16 |
Automatic Compiler Based FPGA Accelerator for CNN Training | 720000 | 5760 | 2019 | Stratix 10 GX | 240 | 479 | 50.47 | 9.49079 | CIFAR-10; '4X'CNN | INT-16 | INT-16 |
Towards an Efficient Accelerator for DNN-based Remote Sensing Image Segmentation on FPGAs | 170906 | 1665 | 2019 | Intel Arria10 660 | 200 | 1578 | 32 | 49.3125 | U-Net | INT-8 | INT-8 |
An Efficient Hardware Accelerator for Sparse Convolutional Neural Networks on FPGAs | 132344 | 364 | 2019 | ZCU102 | 200 | 291 | 23.6 | 12.3305 | ResNet | INT-16 | INT-16 |
FPGA-Based Sparsity-Aware CNN Accelerator for Noise-Resilient Edge-Level Image Recognition | -- | -- | 2019 | Intel Stratix-V | 100 | 57.6 | 2.03 | 28.3744 | VGG-16 | INT-13 | INT-13 |
Zac: Towards Automatic Optimization and Deployment of Quantized Deep Neural Networks on Embedded Devices | 122500 | 793 | 2019 | ZC706 | 166 | 167.58 | 6.08 | 27.5625 | VGG16 | INT-8 | INT-16 |
Zac: Towards Automatic Optimization and Deployment of Quantized Deep Neural Networks on Embedded Devices | 55500 | 84 | 2019 | ZC706 | 200 | 405.82 | 5.6 | 72.4679 | DoReFa-Net | Binary | INT-2 |
Zac: Towards Automatic Optimization and Deployment of Quantized Deep Neural Networks on Embedded Devices | 51300 | 31 | 2019 | ZC706 | 200 | 441.95 | 4.88 | 90.5635 | XNOR-Net | Binary | Binary |
Zac: Towards Automatic Optimization and Deployment of Quantized Deep Neural Networks on Embedded Devices | 100200 | 818 | 2019 | ZC706 | 200 | 124.9 | 7.31 | 17.0862 | ResNet-18 | INT-8 | INT-8 |
Sparse Winograd Convolutional Neural Networks on Small-scale Systolic Arrays | 241202 | 768 | 2019 | V-Ultra XCVU095 | 150 | 460.8 | 8.24 | 55.9223 | VGG16 | INT-8 | INT-8 |
Sparse Winograd Convolutional Neural Networks on Small-scale Systolic Arrays | 241202 | 768 | 2019 | V-Ultra XCVU095 | 150 | 230.4 | 4.12 | 55.9223 | VGG16 | INT-16 | INT-16 |
Sparse Winograd Convolutional Neural Networks on Small-scale Systolic Arrays | 241202 | 768 | 2019 | V-Ultra XCVU095 | 150 | 921.6 | 16.49 | 55.8884 | VGG16 | INT-8 | INT-8 |
A Fine-Grained Sparse Accelerator for Multi-Precision DNN | -- | -- | 2019 | Xilinx XCKU115 | 200 | 574.2 | 13.42 | 42.7869 | CNN | INT-4 | INT-4 |
A Fine-Grained Sparse Accelerator for Multi-Precision DNN | -- | -- | 2019 | Xilinx XCKU115 | 200 | 110.4 | 13.39 | 8.24496 | RNN | INT-4 | INT-4 |
A Fine-Grained Sparse Accelerator for Multi-Precision DNN | -- | -- | 2019 | Xilinx XCKU115 | 200 | 571.1 | 13.41 | 42.5876 | CNN+RNN | INT-4 | INT-4 |
InS-DLA: An In-SSD Deep Learning Accelerator for Near-Data Processing | 93232 | 0 | 2019 | Zynq XC7Z045 | 100 | 44.8 | 9.621 | 4.65648 | CNN | INT-8 | INT-8 |
A 307-fps 351.7-GOPs/W Deep Learning FPGA Accelerator for Real-Time Scene Text Recognition | -- | -- | 2019 | Virtex Ultrascale+ | 100 | 11973 | 34.04 | 351.733 | BSEG | Binary | Binary |
A High Energy-Efficiency FPGA-Based LSTM Accelerator Architecture Design by Structured Pruning and Normalized Linear Quantization | -- | -- | 2019 | Arria 10 | 150 | 2220 | 1.679 | 1322.22 | LSTM | INT-4 | INT-8 |
A 112-765 FPGA-based CNN Accelerator using Importance Map Guided Adaptive Activation Sparsification for Pix2pix Applications | -- | -- | 2020 | Zynq XC7Z035 | 100 | 2525 | 3.3 | 765.152 | SRResNet | INT-16 | INT-16 |
NeuroMAX: A High Throughput, Multi-Threaded, Log-Based Accelerator for Convolutional Neural Networks | 20600 | -- | 2020 | Zynq7020 SoC | 200 | 324 | 2.72 | 119.118 | VGG16 | FP-32 | FP-32 |
When massive GPU parallelism ain't enough: A Novel Hardware Architecture of 2D-LSTM Neural Network | 191449 | 440 | 2020 | ZCU102 | 300 | 5255.66 | 13.2 | 398.156 | 2D-LSTM | Binary | Binary |
When massive GPU parallelism ain't enough: A Novel Hardware Architecture of 2D-LSTM Neural Network | 93324 | 234 | 2020 | ZCU102 | 240 | 3071.79 | 15.47 | 198.564 | 2D-LSTM | INT-4 | INT-8 |
Light-OPU: An FPGA-based Overlay Processor for Lightweight Convolutional Neural Networks | 173522 | 704 | 2020 | Xilinx XC7K325T | 200 | 295.68 | 16.518 | 17.9005 | MobileNetV1 | INT-8 | INT-8 |
Light-OPU: An FPGA-based Overlay Processor for Lightweight Convolutional Neural Networks | 173522 | 704 | 2020 | Xilinx XC7K325T | 200 | 197.12 | 17.14 | 11.5006 | MobileNetV2 | INT-8 | INT-8 |
Light-OPU: An FPGA-based Overlay Processor for Lightweight Convolutional Neural Networks | 173522 | 704 | 2020 | Xilinx XC7K325T | 200 | 168.96 | 17.07 | 9.89807 | MobileNetV3-Large | INT-8 | INT-8 |
Light-OPU: An FPGA-based Overlay Processor for Lightweight Convolutional Neural Networks | 173522 | 704 | 2020 | Xilinx XC7K325T | 200 | 352 | 6.7 | 52.5373 | DenseNet-161 | INT-8 | INT-8 |
Light-OPU: An FPGA-based Overlay Processor for Lightweight Convolutional Neural Networks | 173522 | 704 | 2020 | Xilinx XC7K325T | 200 | 267.52 | 16.9 | 15.8296 | SqueezeNetV1.1 | INT-8 | INT-8 |
End-to-End Optimization of Deep Learning Applications | 1111980 | 3420 | 2020 | VCU1525 | 242.9 | 117 | 10 | 11.7 | OpenPose-V2 | FP-32 | FP-32 |
FTDL: An FPGA-tailored Architecture for Deep Learning Applications | -- | -- | 2020 | UltraScale | 650 | 1272.22 | 46.1 | 27.5969 | GoogLeNet, ResNet50 | INT-16 | INT-16 |
High-Throughput Convolutional Neural Network on an FPGA by Customized JPEG Compression | 274795 | 2370 | 2020 | VirtexUS+XCVU9P | 300 | 2419.2 | 75 | 32.256 | CNN | Binary | INT-8 |
Optimizing Reconfigurable Recurrent Neural Networks | 487232 | 4368 | 2020 | Stratix10 GX2800 | 260 | 8015 | 62.13 | 129.004 | LSTM | INT-8 | INT-8 |
A High Throughput MobileNetV2 FPGA Implementation Based on a Flexible Architecture for Depthwise Separable Convolution | 145000 | 1220 | 2020 | Arria 10 | 200 | 693 | 34 | 20.3824 | MobileNet-V2 | INT-16 | INT-16 |
A Reconfigurable Multithreaded Accelerator for Recurrent Neural Network | 522852 | 4368 | 2020 | Stratix 10 2800 | 260 | 7810 | 125 | 62.48 | LSTM | INT-8 | INT-8 |
Memory-Efficient Dataflow Inference Acceleration for Deep CNNs on FPGA | 1027000 | 1611 | 2020 | Alveo U250 | 195 | 18300 | 71 | 257.746 | ResNet-50 | Binary | INT-2 |
FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations | 50656 | 224 | 2021 | ZYNQ ZU3EG | 250 | 702 | 6.1 | 115.082 | ReActNet (ImageNet) | Binary | Binary |
FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations | 51444 | 126 | 2021 | ZYNQ ZU3EG | 250 | 401 | 4.1 | 97.8049 | ReActNet (CIFAR-10) | Binary | Binary |
Optimized FPGA-based Deep Learning Accelerator for Sparse CNN using High Bandwidth Memory | 334000 | 1442 | 2021 | Intel Stratix 10 MX2100 | 257 | 980.344 | 79.98 | 12.2574 | MobileNet | FxP-16 | FxP-16 |
Optimized FPGA-based Deep Learning Accelerator for Sparse CNN using High Bandwidth Memory | 334000 | 1442 | 2021 | Intel Stratix 10 MX2100 | 257 | 5071.24 | 79.99 | 63.3985 | ResNet-50 | FxP-16 | FxP-16 |
ESCA: Event-Based Split-CNN Architecture with Data-Level Parallelism on UltraScale+ FPGA (short) | 469288 | 2100 | 2021 | Virtex UltraScale+ xcvu9p | 320 | 49.92 | 10.68 | 4.67416 | VGG16 | INT-14 | INT-14 |
3D-VNPU: A Flexible Accelerator for 2D/3D CNNs on FPGA (short) | -- | 1024 | 2021 | Xilinx ZCU102 | 200 | 1353 | 10.2 | 132.647 | C3D | INT-8 | INT-8 |
3D-VNPU: A Flexible Accelerator for 2D/3D CNNs on FPGA (short) | -- | 1024 | 2021 | Xilinx ZCU102 | 200 | 1150 | 10.2 | 112.745 | VGG16 | INT-8 | INT-8 |
3D-VNPU: A Flexible Accelerator for 2D/3D CNNs on FPGA (short) | -- | -- | 2021 | Xilinx ZCU102 | 200 | 1210 | 10.2 | 118.627 | 3D ResNet-18 | INT-8 | INT-8 |
Eciton: Very Low-Power LSTM Neural Network Accelerator for Predictive Maintenance at the Edge | 4987 | 6 | 2021 | Lattice iCE40 UP5K | 17 | 0.067 | 0.017 | 3.94118 | LSTM | INT-8 | INT-8 |
FixyFPGA: Efficient FPGA Accelerator for Deep Neural Networks with High Element-Wise Sparsity and without External Memory Access | 1078800 | 1730 | 2021 | Stratix 10 GX 10M FPGA | 169.2 | 3990 | 28.06 | 142.195 | MobileNet-V1 (1.0) | INT-4 | INT-4 |
FixyFPGA: Efficient FPGA Accelerator for Deep Neural Networks with High Element-Wise Sparsity and without External Memory Access | 993800 | 1730 | 2021 | Stratix 10 GX 10M FPGA | 196.89 | 2650 | 27.41 | 96.68 | MobileNet-V1 (0.75) | INT-4 | INT-4 |
FixyFPGA: Efficient FPGA Accelerator for Deep Neural Networks with High Element-Wise Sparsity and without External Memory Access | 804500 | 1730 | 2021 | Stratix 10 GX 10M FPGA | 200.76 | 1240 | 27.08 | 45.7903 | MobileNet-V1 (0.5) | INT-4 | INT-4 |
An FPGA-based MobileNet Accelerator Considering Network Structure Characteristics | 308449 | 2160 | 2021 | Xilinx Virtex-7 XC7V690t | 150 | 181.8 | 11.35 | 16.0176 | MobileNet | INT-8 | INT-8 |
Leveraging Fine-grained Structured Sparsity for CNN Inference on Systolic Array Architectures | 336000 | 1352 | 2021 | Intel Arria 10 GX1150 | 242 | 1662 | 27.8 | 59.7842 | VGG-16 | INT-8 | INT-8 |
Leveraging Fine-grained Structured Sparsity for CNN Inference on Systolic Array Architectures | 336000 | 1352 | 2021 | Intel Arria 10 GX1150 | 242 | 495 | 22.6 | 21.9027 | ResNet-50 | INT-8 | INT-8 |
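The Energy Efficiency column appears to be derived from the two preceding columns as Throughput (GOPs) divided by Power (W). A minimal Python sketch of that relation, checked against the first table row (the `energy_efficiency` helper name is our own, not from the source papers):

```python
def energy_efficiency(throughput_gops: float, power_w: float) -> float:
    """Return energy efficiency in GOPs/W as throughput divided by power."""
    return throughput_gops / power_w

# First table row: 61.62 GOPs at 18.61 W on the VC707 accelerator.
print(round(energy_efficiency(61.62, 18.61), 5))  # -> 3.31112, matching the table
```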
AnouarITI/FPGA-based-DNN-Accels