Experiences on using llvm exegesis

Naoki Shibata edited this page Jul 9, 2019 · 4 revisions

Foreword

On this page, I describe my brief experience with llvm-exegesis.

llvm-exegesis is a small tool that ships with LLVM. According to its description page, it is a benchmarking tool that measures instruction characteristics on the host machine, such as latency, throughput, or port decomposition. The project is hosted on GitHub, and there is a set of slides that looks like the best available explanation of the tool.

I am sharing my experience because I could not find any other report on using this tool.

Building the tool

I mainly use Ubuntu for code development, and I have computers with Ubuntu 18.04 installed. llvm-exegesis is installed automatically when LLVM is installed via apt-get. However, that binary refuses to run, saying "LLVM ERROR: cannot initialize libpfm".

So I decided to build llvm-8.0.0 myself. After I built llvm-8.0.0 and installed it on the local hard drive (under /opt), the tool started working. I run it with root privileges, as the # prompts below indicate.

# /opt/bin/llvm-exegesis -mode=latency -opcode-name=ADD64rr
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-25e0de.o
---
mode:            latency
key:
  instructions:
    - 'ADD64rr RDX RDX R12'
  config:          ''
  register_initial_values:
    - 'RDX=0x0'
    - 'R12=0x0'
cpu_name:        skylake-avx512
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: latency, value: 1.0052, per_snippet_value: 1.0052 }
error:           ''
info:            Repeating a single implicitly serial instruction
assembled_snippet: 415448BA000000000000000049BC00000000000000004C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E24C01E2415CC3
...
# echo "vzeroupper" | /opt/bin/llvm-exegesis -mode=uops -snippets-file=-
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-433f2e.o
---
mode:            uops
key:
  instructions:
    - 'VZEROUPPER'
  config:          ''
  register_initial_values: []
cpu_name:        skylake-avx512
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: SKXPort0, value: 0.0005, per_snippet_value: 0.0005 }
  - { key: SKXPort1, value: 0.0017, per_snippet_value: 0.0017 }
  - { key: SKXPort2, value: 0.001, per_snippet_value: 0.001 }
  - { key: SKXPort3, value: 0.0016, per_snippet_value: 0.0016 }
  - { key: SKXPort4, value: 0.0013, per_snippet_value: 0.0013 }
  - { key: SKXPort5, value: 0.001, per_snippet_value: 0.001 }
  - { key: SKXPort6, value: 0.0026, per_snippet_value: 0.0026 }
  - { key: SKXPort7, value: 0.0009, per_snippet_value: 0.0009 }
  - { key: NumMicroOps, value: 4.0084, per_snippet_value: 4.0084 }
error:           ''
info:            ''
assembled_snippet: C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C5F877C3
...
# cat exegesistest1.txt
# LLVM-EXEGESIS-LIVEIN RDI
# LLVM-EXEGESIS-DEFREG XMM1 42
vmulps        (%rdi), %xmm1, %xmm2
vhaddps       %xmm2, %xmm2, %xmm3
addq $0x10, %rdi
# /opt/bin/llvm-exegesis -mode=uops -snippets-file=./exegesistest1.txt
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-6f88de.o
---
mode:            uops
key:
  instructions:
    - 'VMULPSrm XMM2 XMM1 RDI i_0x1  i_0x0 '
    - 'VHADDPSrr XMM3 XMM2 XMM2'
    - 'ADD64ri8 RDI RDI i_0x10'
  config:          ''
  register_initial_values:
    - 'XMM1=0x42'
cpu_name:        skylake-avx512
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: SKXPort0, value: 0.4612, per_snippet_value: 1.3836 }
  - { key: SKXPort1, value: 0.4224, per_snippet_value: 1.2672 }
  - { key: SKXPort2, value: 0.169, per_snippet_value: 0.507 }
  - { key: SKXPort3, value: 0.1682, per_snippet_value: 0.5046 }
  - { key: SKXPort4, value: 0.0017, per_snippet_value: 0.0051 }
  - { key: SKXPort5, value: 0.6674, per_snippet_value: 2.0022 }
  - { key: SKXPort6, value: 0.3366, per_snippet_value: 1.0098 }
  - { key: SKXPort7, value: 0.001, per_snippet_value: 0.003 }
  - { key: NumMicroOps, value: 1.6761, per_snippet_value: 5.0283 }
error:           ''
info:            ''
assembled_snippet: 4883EC10C7042442000000C744240400000000C744240800000000C744240C0000000062F17E086F0C244883C410C5F05917C5EB7CDA4883C710C5F05917C5EB7CDA4883C710C5F05917C5EB7CDA4883C710C5F05917C5EB7CDA4883C710C5F05917C5EB7CDA4883C710C5F05917C3
...
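
In the last run, each per_snippet_value is the corresponding value multiplied by 3, the number of instructions in the snippet. This is only an observation from the numbers printed above, not a statement taken from the tool's documentation. The following Python sketch checks that relation on the measurement lines copied from that run. The names REPORT, parse_measurements and per_snippet are hypothetical helpers introduced for illustration; they are not part of llvm-exegesis.

```python
import re

# Measurement lines copied verbatim from the three-instruction uops run above.
REPORT = """\
  - { key: SKXPort0, value: 0.4612, per_snippet_value: 1.3836 }
  - { key: SKXPort1, value: 0.4224, per_snippet_value: 1.2672 }
  - { key: SKXPort2, value: 0.169, per_snippet_value: 0.507 }
  - { key: SKXPort3, value: 0.1682, per_snippet_value: 0.5046 }
  - { key: SKXPort4, value: 0.0017, per_snippet_value: 0.0051 }
  - { key: SKXPort5, value: 0.6674, per_snippet_value: 2.0022 }
  - { key: SKXPort6, value: 0.3366, per_snippet_value: 1.0098 }
  - { key: SKXPort7, value: 0.001, per_snippet_value: 0.003 }
  - { key: NumMicroOps, value: 1.6761, per_snippet_value: 5.0283 }
"""

# The snippet holds three instructions: vmulps, vhaddps and addq.
NUM_INSTRUCTIONS = 3

# Matches one measurement line and captures key, value, per_snippet_value.
LINE = re.compile(
    r"key:\s*(\w+),\s*value:\s*([\d.]+),\s*per_snippet_value:\s*([\d.]+)"
)


def parse_measurements(text):
    """Return {key: (value, per_snippet_value)} for each measurement line."""
    return {
        m.group(1): (float(m.group(2)), float(m.group(3)))
        for m in LINE.finditer(text)
    }


def per_snippet(value, num_instructions):
    """Scale a per-instruction value to one pass over the whole snippet."""
    return value * num_instructions


measurements = parse_measurements(REPORT)
for key, (value, reported) in measurements.items():
    computed = per_snippet(value, NUM_INSTRUCTIONS)
    print(f"{key:12s} {value:.4f} x {NUM_INSTRUCTIONS} = {computed:.4f} "
          f"(reported {reported})")
```

The regular expression is used instead of a YAML parser only to keep the sketch free of third-party dependencies. For the vzeroupper run, which contains a single instruction, the two columns are identical, which fits the same relation.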

First analysis

So far, so good. Next, I want to try analyzing more practical code.