numbers

2020-11-28

numbers is a quick-and-dirty program that prints some of the "latency numbers every programmer should know" [1, 2, 3, 4].

N.B. numbers runs only on Linux on x86-64 and needs a C++17 compiler (GCC 9, 10; Clang 10, 11).

$ git clone https://github.com/jaeheum/numbers.git
$ cd numbers
# a quick-and-dirty run:
$ make all && ./build/numbers
g++ -Wall -Wextra -Wpedantic -Ofast --std=c++17  -Iinclude -c src/nanobench.cc -o build/nanobench.o
g++ -Wall -Wextra -Wpedantic -Ofast --std=c++17  -Iinclude -c src/numbers.cc -o build/numbers.o
g++  -o build/numbers build/*.o -lpthread
Warning, results might be unstable:
* CPU governor is 'schedutil' but should be 'performance'

Recommendations
* Use 'pyperf system tune' before benchmarking. See https://github.com/psf/pyperf

|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|               16.50 |       60,604,416.47 |    0.0% |           63.01 |           56.08 |  1.124 |          18.00 |    0.0% |      0.00 | `mutex_access`
|            5,217.33 |          191,668.80 |    0.6% |       12,301.00 |       17,736.67 |  0.694 |       4,097.00 |    0.0% |      0.00 | `L1_random_access`
|          250,328.00 |            3,994.76 |    0.2% |       98,329.00 |      850,748.00 |  0.116 |      32,769.00 |    0.0% |      0.00 | `L2_random_access`
|       52,028,319.00 |               19.22 |    1.5% |    5,898,281.00 |  176,516,236.00 |  0.033 |   1,966,097.00 |    0.0% |      0.60 | `L3_random_access`
|    2,796,066,612.00 |                0.36 |    0.0% |  100,664,178.00 |9,487,319,166.00 |  0.011 |  33,555,290.00 |    0.0% |      2.80 | `memory_random_access`
|        8,813,073.00 |              113.47 |    1.7% |  110,100,499.00 |   29,845,812.00 |  3.689 |  31,457,284.00 |    0.0% |      0.10 | `sorted_memory_branch_mispredictions`
|       57,787,781.00 |               17.30 |    0.1% |  110,100,513.00 |  196,153,140.00 |  0.561 |  31,457,298.00 |   25.0% |      0.64 | `unsorted_memory_branch_mispredictions`
|       28,025,098.00 |               35.68 |    0.1% |       23,985.00 |   94,954,316.00 |  0.000 |       4,964.00 |    0.3% |      0.31 | `memory_copy_1MiB`
|      156,300,394.00 |                6.40 |    0.0% |        4,350.00 |       53,176.00 |  0.082 |         796.00 |   21.6% |      0.16 | `fwrite_1MiB_to_disk`
|          388,215.00 |            2,575.89 |    0.0% |       81,101.00 |       77,384.00 |  1.048 |      18,564.00 |    0.6% |      0.00 | `fseek_from_disk`
|       34,513,692.00 |               28.97 |    0.0% |       67,884.00 |      396,406.00 |  0.171 |      15,174.00 |    2.5% |      0.03 | `fread_1MiB_from_disk`

L1_random_access                      1.3 ns        4.3 cycles
L2_random_access                      7.6 ns       26.0 cycles
L3_random_access                     26.5 ns       89.8 cycles
memory_random_access                 83.3 ns      282.7 cycles
branch_miss_penalty                   6.2 ns       21.2 cycles
mutex_access                         16.5 ns       56.1 cycles
fseek_from_disk                    1516.5 ns      302.3 cycles
memory_copy_1MiB                 109473.0 ns   370915.3 cycles
fread_1MiB_from_disk             134819.1 ns     1548.5 cycles
fwrite_1MiB_to_disk              610548.4 ns      207.7 cycles

quick and dirty

numbers's output comes from simplistic, best-effort, low-cost, fast measurements.

For a more careful measurement, run the following commands:

$ sudo pip install pyperf # so that 'sudo pyperf system tune' can run
## On Arch Linux, pyperf is available as a package
$ ./print-numbers

N.B. See notes.md for benchmarking tips and more sophisticated tools. It also describes the internals of numbers.

There are also known issues.

no cycles column in the output

Depending on the Linux configuration, numbers may not be able to access hardware performance counters. In that case, it omits the branch_miss_penalty row and the cycles column from the output.

License

MIT License

Acknowledgement

numbers benefits from
