From 0bc852e812a91c74a2edbc92ffae44960c94ffe9 Mon Sep 17 00:00:00 2001
From: Solal Pirelli <solal.pirelli@gmail.com>
Date: Wed, 25 Jan 2023 16:23:36 +0100
Subject: [PATCH] Improve documentation

---
 Dockerfile            |  5 +----
 ReadMe.md             | 37 ++++++++++++++++++++++++++++---------
 ada/ReadMe.md         |  7 ++++++-
 c/ixgbe/ReadMe.md     |  2 +-
 csharp/ReadMe.md      |  1 +
 experiments/ReadMe.md | 32 +++++++++++++++++---------------
 rust/ReadMe.md        |  2 --
 7 files changed, 54 insertions(+), 32 deletions(-)

diff --git a/Dockerfile b/Dockerfile
index 573ece9..45629d4 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -22,9 +22,6 @@ RUN apt-get update && \
     apt-get purge -y curl `# Cleanup...` && \
     rm -rf '/var/lib/apt/lists/*'
 
-COPY c /c
-COPY csharp /csharp
-COPY ada /ada
-COPY rust /rust
+COPY . /.
 
 CMD ["bash"]
diff --git a/ReadMe.md b/ReadMe.md
index 6671a95..aaafcc2 100644
--- a/ReadMe.md
+++ b/ReadMe.md
@@ -1,11 +1,12 @@
--- WEIRD: This MUST be of size 64, otherwise the card locks up quickly (even the heatup in the benchmarks doesn't finish)
+# TinyNF
 
-noinline run / inlined agent/queues run
+This repository contains the "TinyNF" driver codebase.
 
-# TinyNF
+It was originally associated with the paper ["A Simpler and Faster NIC Driver Model for Network Functions"](https://www.usenix.org/conference/osdi20/presentation/pirelli) (OSDI 2020).
+You can still find that version of the code, and the experimental scripts to reproduce that paper, in the `osdi20` branch.
 
-This repository contains code originally associated with the paper ["A Simpler and Faster NIC Driver Model for Network Functions"](https://www.usenix.org/conference/osdi20/presentation/pirelli) (OSDI 2020).
-It has been extended with more programming languages, a second driver model, and different scripts for experiments.
+It was extended for the paper ["Safe low-level code without overhead is practical"](https://conf.researchr.org/details/icse-2023/icse-2023-technical-track/18/Safe-low-level-code-without-overhead-is-practical) (ICSE 2023).
+The repo contains more programming languages, a second driver model, and scripts to reproduce the new paper's experiments.
 
 
 ## Code
@@ -13,10 +14,9 @@ It has been extended with more programming languages, a second driver model, and
 The code of the drivers is in the `ada`, `c`, `csharp`, and `rust` folders.
 We refer to "agents" in the code for the restricted TinyNF model and "queues" for the flexible DPDK model.
 
-All languages use a `Makefile.benchmarking` file to compile, which itself delegates to the language's native compiler / build system.
+All languages provide a `Makefile` to `build` and `format` the code, wrapping other build systems when needed.
 The following parameters are available:
 
-- `TN_CC` (`c` only): The compiler, tested with GCC and Clang
 - `TN_MODE`: The kind of driver:
   - `restricted` (default): The "restricted" model, which is the original TinyNF one
   - `const` (`ada`, `c`, and `rust` only): The restricted model with a constant number of devices, instead of detecting them at run-time
@@ -27,10 +27,29 @@ The following parameters are available:
 Note that despite needing extensions, the Rust driver does not support `TN_MODE=safe` because, due to Rust's ownership model,
 unsafe code _must_ be used in the hot loop for volatile reads and writes, whereas C# allows these reads and writes in safe code.
 
+All languages force inlining of the driver methods into a forced-not-inlined "run" method to ensure a fair comparison and to make extracting the hot loop code easy.
+
+All languages use a 64-bit integer to represent packet lengths, even though the card only supports 16 bits, because not doing so often causes the card to lock up in non-C implementations.
+(Yes, this sounds extremely weird, but that's what empirical evidence says...)
+
+
+## Dependencies
+
+To compile each language, you will need `make`, as well as a compiler:
+- `ada`: `gnat`, though any other compiler might work
+- `c`: `gcc` or `clang`, though any other C11 compiler should work
+- `csharp`: `dotnet`, version 7 or above
+- `rust`: `rustc`, a version that supports Rust 2021
+
+If you don't want to install those on your machine, we provide a `Dockerfile`: run `docker build -t tinynf . ; docker run -it tinynf` (you might need `sudo` for Docker).
+This file is also useful if you want to know how to install the dependencies on any Ubuntu machine.
+(The Dockerfile does not include `clang-format`, which is used for `make format` in `c`; you'll have to install it manually if you want to auto-format the C code.)
+
 
 ## Experiments
 
-The benchmarking scripts for NFs, which are independent of the rest, are in `benchmarking/`.
+The benchmarking scripts for network functions, which are independent of the rest, are in `benchmarking/`.
 
 The experiments presented in the paper, including replication instructions, are in `experiments/`.
-We've also provided the actual data collected on our hardware in `experiments/results_example`; you can rename the folder to `results` and run the scripts to plot it as per the instructions.
+We've also provided the actual data collected on our hardware in `experiments/results_example`;
+you can rename the folder to `results` and run the scripts to plot it as per the instructions.
diff --git a/ada/ReadMe.md b/ada/ReadMe.md
index db9ad05..08a8532 100644
--- a/ada/ReadMe.md
+++ b/ada/ReadMe.md
@@ -1 +1,6 @@
-This is the Ada version. Note that the queues are split in more files because the RX and TX functions need arrays of a generic argument, that requires a generic package, and GNAT requires 1 package per file.
+This is the Ada version.
+
+Note that the queues are split into more files because the RX and TX functions need arrays of a generic argument, which requires a generic package, and GNAT requires one package per file.
+
+Overall, this is a best-effort implementation in terms of code cleanliness; it was not written by Ada experts, far from it.
+But it works!
diff --git a/c/ixgbe/ReadMe.md b/c/ixgbe/ReadMe.md
index 293ea7b..ca103ed 100644
--- a/c/ixgbe/ReadMe.md
+++ b/c/ixgbe/ReadMe.md
@@ -1,5 +1,5 @@
 All references in the code are to the Intel 82599 Data Sheet unless otherwise noted.
-It used to be publicly available but the link is dead, though you may have luck googling it.
+It used to be publicly available but the link is dead, though you may have luck looking it up.
 
 ### Interpretations
 
diff --git a/csharp/ReadMe.md b/csharp/ReadMe.md
index d2131d5..1930474 100644
--- a/csharp/ReadMe.md
+++ b/csharp/ReadMe.md
@@ -1,2 +1,3 @@
 This is the C# version, including the C# extensions in `TinyNF.Unsafe`.
+
 The code is split into three projects so that `TinyNF` can be compiled without passing the `/unsafe` switch to the compiler, ensuring it contains no unsafe code.
diff --git a/experiments/ReadMe.md b/experiments/ReadMe.md
index 0a22064..cacf806 100644
--- a/experiments/ReadMe.md
+++ b/experiments/ReadMe.md
@@ -1,21 +1,27 @@
 # Experiments
 
-This folder contains experiments from the paper.
+The ICSE 2023 paper's central claim is that it is possible to write safe code with no run-time overhead compared to C.
+This is implemented as two driver models in C, Ada, C#, and Rust.
 
-**Table 2**:
-`cd code-metrics ; ./tabulate-metrics.sh`
-(this may slightly differ from the paper depending on your exact compiler versions)
+_You are unlikely to get the exact same results as the paper, since they depend on your exact compiler versions and your CPU.
+However, the relative claims should hold._
 
-**Figures**:
-Check out the prerequisites below, `cd perf`, then `./bench.sh`, which will take a few hours.
-Then `./plot.sh` assuming you have Python with Matplotlib; run `. setup-virtualenv-graphing.sh` on Ubuntu to create a virtualenv with it if needed.
 
+## Theory (minutes)
 
-## Prerequisites
+You will need `cloc` in addition to the compilers mentioned in the top-level readme, or you can use the provided Dockerfile.
 
-Most of the experiments are about performance, measured with benchmarks.
+Run `cd code-metrics ; ./tabulate-metrics.sh`, which will generate an `assembly/` directory containing the assembly code for all hot loops in all drivers.
+You can manually inspect this assembly to confirm the absence of compiler-inserted bounds checks.
+That script also outputs **Table 2** from the paper, and takes less than a minute.
 
-To run these benchmarks, you need two machines running Linux:
+## Practice (hours)
+
+For the **figures**, except Figure 2, which is a simplified form of `rust/src/lifed.rs`, you need a proper testbed.
+Check out the prerequisites below, `cd perf`, then `./bench.sh`, which will take a few hours.
+Then `./plot.sh` assuming you have Python with Matplotlib; run `. setup-virtualenv-graphing.sh` on Ubuntu to create a virtualenv with it if needed.
+
+To run the benchmarks, you need two machines running Linux:
 - A "device under test" machine with two Intel 82599ES NICs on the same NUMA node, from which you will run the experiment scripts
 - A "tester" machine connected to the other one by two 10G Ethernet cables
 
@@ -33,12 +39,8 @@ Assuming a 2-CPU machine whose second CPU has cores 8 to 15, we recommend the fo
 - `idle=poll cpuidle.off=1`: Force the CPU to spin instead of using waits for idling
 - `intel_pstate=disable`: Allow Linux to set the CPU frequency via `cpupower` instead of letting the Intel driver choose
 
-You will also need the following software, in addition to the compilers for each language:
-- the NativeAOT dependencies: https://learn.microsoft.com/en-us/dotnet/core/deploying/native-aot/
-
+You will also need the following software on the device under test machine, in addition to the dependencies mentioned in the top-level readme:
 - The library `libtbb2`, available under that name in most package repositories
-- The build tool `make`, available under that name in most package repositories
-- The shell utility `cloc`, available under that name in most package repositories
 - The shell utility `cpupower`, available under names such as `linux-tools-generic` (Ubuntu) in package repositories
 
 Due to how long some of these scripts take, if you are running them via SSH, you may want to use an utility such as `byobu`, `tmux`, or `screen`,
diff --git a/rust/ReadMe.md b/rust/ReadMe.md
index 5edeabf..ccbeac0 100644
--- a/rust/ReadMe.md
+++ b/rust/ReadMe.md
@@ -1,3 +1 @@
 This is the Rust version. The extensions are in `src/lifed.rs`.
-
-Useful note: to auto-format, run `shopt -s globstar; rustfmt src/**/*.rs`