CUDA micro-benchmarking tool (CUMB)
Antti-Pekka Hynninen (2016)

Measure:
- Cache line size
- Memory latency, including base latency and departure delay