NOTE: These results are still very much in flux as CXXRTL is under heavy development!
Check out the "CXXRTL, A Yosys Simulation Backend" article on my blog!
This project compares the simulation speed of the following open source simulators:
- Icarus Verilog (11.0)
- Verilator (rev 4.033)
- Yosys CXXRTL (version listed with the results)
The test design is a VexRiscv CPU with some RAM and some LEDs that are toggling.
Each simulation runs for 1M clock cycles, except on Icarus Verilog, where I only run 100K cycles because it is just too slow.
(The next build steps are optional: the generated Verilog and .bin files are already part of the repo.)
cd sw
make
cd ../spinal
make sim
cd tb
make tb
time ./tb
Result (for 100K clock cycles):
...
real 0m26.389s
user 0m26.313s
sys 0m0.061s
Verilator - No Waves
Verilator 4.033 devel rev v4.032-73-gdef40fa
real 0m0.456s
user 0m0.452s
sys 0m0.004s
real 0m0.456s
user 0m0.456s
sys 0m0.000s
real 0m0.456s
user 0m0.456s
sys 0m0.000s
Verilator - VCD
Verilator 4.033 devel rev v4.032-73-gdef40fa
real 0m9.381s
user 0m3.371s
sys 0m2.406s
real 0m7.503s
user 0m3.484s
sys 0m2.447s
real 0m7.078s
user 0m3.421s
sys 0m2.521s
CXXRTL - Max Opt - No Waves
Yosys 0.9+2406 (git sha1 334ec5fa, clang 6.0.0-1ubuntu2 -fPIC -Os)
real 0m1.473s
user 0m1.472s
sys 0m0.000s
real 0m1.470s
user 0m1.469s
sys 0m0.000s
real 0m1.472s
user 0m1.467s
sys 0m0.004s
CXXRTL - Max Opt - VCD full (incl Mem)
Yosys 0.9+2406 (git sha1 334ec5fa, clang 6.0.0-1ubuntu2 -fPIC -Os)
real 1m34.634s
user 1m32.743s
sys 0m1.759s
CXXRTL - Max Opt - VCD full (no Mem)
Yosys 0.9+2406 (git sha1 334ec5fa, clang 6.0.0-1ubuntu2 -fPIC -Os)
real 0m9.158s
user 0m7.337s
sys 0m1.170s
CXXRTL - Max Opt - VCD regs only
Yosys 0.9+2406 (git sha1 334ec5fa, clang 6.0.0-1ubuntu2 -fPIC -Os)
real 0m8.517s
user 0m6.740s
sys 0m1.146s
CXXRTL - Max Debug - No Waves
Yosys 0.9+2406 (git sha1 334ec5fa, clang 6.0.0-1ubuntu2 -fPIC -Os)
real 0m2.474s
user 0m2.384s
sys 0m0.008s
real 0m2.382s
user 0m2.381s
sys 0m0.000s
real 0m2.373s
user 0m2.371s
sys 0m0.000s
CXXRTL - Max Debug - VCD full (incl Mem)
Yosys 0.9+2406 (git sha1 334ec5fa, clang 6.0.0-1ubuntu2 -fPIC -Os)
real 2m3.533s
user 1m58.238s
sys 0m4.685s
CXXRTL - Max Debug - VCD full (no Mem)
Yosys 0.9+2406 (git sha1 334ec5fa, clang 6.0.0-1ubuntu2 -fPIC -Os)
real 0m39.661s
user 0m33.152s
sys 0m5.129s
CXXRTL - Max Debug - VCD regs only
Yosys 0.9+2406 (git sha1 334ec5fa, clang 6.0.0-1ubuntu2 -fPIC -Os)
real 0m10.470s
user 0m7.970s
sys 0m1.659s
CXXRTL - Max Opt - clang9
Yosys 0.9+2406 (git sha1 334ec5fa, clang 6.0.0-1ubuntu2 -fPIC -Os)
real 0m1.488s
user 0m1.474s
sys 0m0.012s
real 0m1.473s
user 0m1.472s
sys 0m0.000s
real 0m1.461s
user 0m1.457s
sys 0m0.004s
CXXRTL - Max Opt - clang6
Yosys 0.9+2406 (git sha1 334ec5fa, clang 6.0.0-1ubuntu2 -fPIC -Os)
real 0m1.455s
user 0m1.444s
sys 0m0.004s
real 0m1.450s
user 0m1.445s
sys 0m0.004s
real 0m1.447s
user 0m1.446s
sys 0m0.000s
CXXRTL - Max Opt - gcc10.1
Yosys 0.9+2406 (git sha1 334ec5fa, clang 6.0.0-1ubuntu2 -fPIC -Os)
real 0m1.736s
user 0m1.729s
sys 0m0.000s
real 0m1.726s
user 0m1.717s
sys 0m0.008s
real 0m1.727s
user 0m1.726s
sys 0m0.000s
CXXRTL - Max Opt - gcc7.5
Yosys 0.9+2406 (git sha1 334ec5fa, clang 6.0.0-1ubuntu2 -fPIC -Os)
real 0m1.688s
user 0m1.678s
sys 0m0.004s
real 0m1.678s
user 0m1.677s
sys 0m0.000s
real 0m1.674s
user 0m1.673s
sys 0m0.000s
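Because the 100K-cycle run and the 1M-cycle runs cannot be compared directly on wall-clock time, it helps to normalize them to simulated clock cycles per second. The following is a minimal sketch of that arithmetic and is not part of the repo. The helper names `parse_time` and `cycles_per_second` are made up for illustration. It assumes the `real` values use the `XmY.YYYs` format shown above, that the 100K-cycle result is the Icarus Verilog run, and that every other run covered 1M cycles.

```python
import re


def parse_time(text):
    """Convert a `time`-style duration such as '1m34.634s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def cycles_per_second(cycles, real_time):
    """Simulated clock cycles per wall-clock second."""
    return cycles / parse_time(real_time)


# First 'real' value of each run, copied from the results above.
# The cycle counts are assumptions taken from the description of the test.
runs = [
    ("Icarus Verilog", 100_000, "0m26.389s"),
    ("Verilator - No Waves", 1_000_000, "0m0.456s"),
    ("CXXRTL - Max Opt - No Waves", 1_000_000, "0m1.473s"),
    ("CXXRTL - Max Debug - No Waves", 1_000_000, "0m2.474s"),
]

for name, cycles, real_time in runs:
    rate_khz = cycles_per_second(cycles, real_time) / 1e3
    print(f"{name:32s} {rate_khz:8.1f} kHz")
```

On these numbers, Verilator without waves simulates at roughly 2.2 MHz, CXXRTL with maximum optimization at roughly 0.68 MHz, and the 100K-cycle run at roughly 3.8 kHz.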
At the time of writing, the CXXRTL optimization recipe was as follows:
read_verilog ../spinal/ExampleTop.sim.v
hierarchy -check -top ExampleTop
write_ilang ExampleTop.sim.ilang
write_cxxrtl ExampleTop.sim.cpp
real 0m3.671s
user 0m3.221s
sys 0m0.138s
clang9 not only gives some of the best simulation results (essentially tied with clang6), it also compiles much faster than anything else.
Compile time example_default_clang9
real 0m7.321s
user 0m6.976s
sys 0m0.183s
Compile time example_Og_clang9
real 0m9.195s
user 0m9.020s
sys 0m0.142s
Compile time example_default_clang6
real 0m17.038s
user 0m16.562s
sys 0m0.181s
Compile time example_default_gcc10
real 0m32.420s
user 0m31.701s
sys 0m0.497s
Compile time example_default_gcc7
real 0m19.918s
user 0m19.277s
sys 0m0.425s
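To put the compile times in proportion, each one can be expressed as a multiple of the fastest build. This is only a small sketch of that arithmetic: the function name `relative_to_fastest` is hypothetical, and the numbers are the `real` values from the listing above, converted to seconds.

```python
# 'real' compile times in seconds, copied from the listing above.
compile_times = {
    "example_default_clang9": 7.321,
    "example_Og_clang9": 9.195,
    "example_default_clang6": 17.038,
    "example_default_gcc10": 32.420,
    "example_default_gcc7": 19.918,
}


def relative_to_fastest(times):
    """Return each time divided by the smallest time in the mapping."""
    fastest = min(times.values())
    return {name: t / fastest for name, t in times.items()}


for name, ratio in relative_to_fastest(compile_times).items():
    print(f"{name:26s} {ratio:4.2f}x")
```

With these values, the gcc10 build takes roughly 4.4 times as long to compile as the default clang9 build, and clang6 roughly 2.3 times as long.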