This benchmark performs a series of dense matrix-matrix multiplications. It simulates one of the main operations carried out in lattice QCD applications.
The operation is carried out by several different implementations, which use SSE3 and MIC (Intel Xeon Phi) intrinsics. On Xeon Phi the program runs in native mode, that is, directly on the coprocessor rather than offloaded from the host.
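As a rough illustration of the operation being benchmarked, the following is a minimal scalar sketch in plain C, not the benchmark's own code. The function names `matmul_ref` and `matmul_series`, the use of real `double` entries, and the row-major contiguous layout are all assumptions made for this example. A kernel of this kind could serve as a reference for checking the results of an intrinsics-based implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference kernel (not taken from the benchmark):
 * computes c = a * b for n x n matrices stored in row-major order. */
void matmul_ref(size_t n, const double *a, const double *b, double *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (size_t k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Hypothetical driver for a series of multiplications: the m-th product
 * uses the m-th n x n block of each array (assumed contiguous layout). */
void matmul_series(size_t count, size_t n,
                   const double *a, const double *b, double *c)
{
    const size_t stride = n * n; /* elements per matrix */
    for (size_t m = 0; m < count; ++m)
        matmul_ref(n, a + m * stride, b + m * stride, c + m * stride);
}
```

Calling `matmul_series(count, n, a, b, c)` with three arrays of `count * n * n` doubles fills `c` with the `count` independent products. The output array must not overlap the inputs.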
The benchmark code was developed under the Cyprus Research Promotion Foundation (RPF) project "GPU Clusterware", Grant Number: ΤΠΕ/ΠΛΗΡΟ/0311(ΒΙΕ)/09