Pete Heist edited this page Nov 29, 2024 · 8 revisions

Antler is a tool for network and congestion control testing. The name stands for Active Network Tester of Load & Response, where '&' == Et. :)

Antler can be used to set up and tear down test environments, coordinate traffic flows across multiple nodes, gather data using external tools like tcpdump, and generate reports and plots from the results. It grew out of testing needs for SCE and related congestion control projects in the IETF.

Why Antler?

Running tests with existing tools can be time-consuming and error-prone, as it involves more than just generating traffic and emitting stats. It includes:

  • setting up and tearing down test environments
  • orchestrating actions across multiple nodes
  • running multiple tests with varied parameter combinations
  • re-running only some tests while retaining prior results
  • running external tools to gather pcaps or other data
  • gathering results from multiple nodes into a single source of truth
  • emitting results in different formats for consumption
  • saving results non-destructively so prior work isn't lost
  • making results available on the web
  • configuring all of the above in a common way, to avoid mistakes

Antler is an attempt to address the above. The test environment is set up before each test and torn down after it, preventing configuration mistakes and "config bleed" from run to run. The test nodes are auto-installed before each test and uninstalled after it, preventing version mismatch and dependency problems. Tests are orchestrated using a hierarchy of serial and parallel actions that can be coordinated over the control connections to each node. Results, logs and data from all the nodes are gathered into a single data stream, saved non-destructively, and processed in a report pipeline to produce the output. Partial test runs allow re-running only some tests, while hard linking results from prior runs so a complete result tree is always available. Results may be published using an internal, embedded web server. Finally, all of the configuration is done using CUE, a data language that helps avoid config mistakes and duplication.

Features

Tests

  • Auto-installed test nodes that run:
    • Locally or via ssh
    • Optionally in Linux network namespaces
  • Builtin traffic generator in Go:
    • Support for tests using stream-oriented and packet-oriented protocols (for now, TCP and UDP)
    • Configurable UDP packet release times and lengths, supporting anything from isochronous, to VBR or bursty traffic, or combinations in one flow
    • Support for setting arbitrary sockopts, including CCA and the DS field
    • HMAC signing to protect servers against unauthorized use
  • Sampling of Linux socket stats via netlink sock_diag subsystem (e.g. delivery rate, TCP RTT, etc.)
  • Configuration using CUE, to support test parameter combinations, schema definition, data validation and config reuse
  • Configurable hierarchy of "runners", that may execute in serial or parallel across nodes, and with arbitrary scheduled timing (e.g. TCP flow introductions on an exponential distribution with lognormal lengths)
  • Incremental test runs to run only selected tests, and hard link the rest from prior results
  • System runner for system commands, e.g. for setup, teardown, data collection such as pcaps, and mid-test config changes
  • System information gathering from commands, files, environment variables and sysctls

Results/Reports

  • Time series and FCT plots using Google Charts
  • Tables of flow metrics:
    • For streams: completion time, length and goodput
    • For packet flows: OWD, RTT, lost, late, early and duplicates
  • Generation of index.html pages of tests, with links to plots, log files, pcaps, system information, etc.
  • Embedded web server to serve results
  • Plots/reports implemented with Go templates, which may eventually target any plotting package
  • Optional result streaming during test (may be configured to deliver only some results, e.g. logs, but not pcaps)

Caveats

There are several caveats which are worth considering before working with Antler.

CUE Syntax

CUE has helped with Antler's configuration, but developing tests may take longer than expected. CUE is a new language, still at a pre-1.0 version, that isn't always intuitive at first. Some aspects of the configuration require cumbersome or confusing syntax.

CUE Evaluator

CUE's evaluator is also currently being rewritten, and it has a reported problem with memory consumption that can make it impractical for large test packages. For example, sce-tests contains 376 tests and takes 10 GB of resident memory to parse the config. Once the config is parsed, running the tests takes only a small fraction of that. Antler is currently pinned to CUE v0.5 due to this problem. To accommodate larger packages, either increase the amount of physical memory available, or split large test packages into multiple packages when necessary.

Visualization

Antler's plots could be more flexible than they are. There are time series and FCT plots that use Google Charts, and one may set arbitrary config using the Options field from CUE, but several important things are not possible:

  • Visualizing data across multiple test results, e.g. candlestick plots
  • Adding text to provide context for the results
  • Allowing the user to choose what's plotted
  • Targeting plotting packages other than Google Charts

More work is needed on data visualization with Antler.

UDP Latency Accuracy Limits

The node and its builtin traffic generators are written in Go. This comes with some system call overhead and scheduling jitter, which somewhat reduces the accuracy of the UDP latency results relative to a C/C++ implementation or, better yet, timings obtained from the kernel or network. The following comparison between ping and irtt gives some idea (note the log scale on the vertical axis):

[Figure: Ping vs IRTT]

While the UDP results are still useful for tests at Internet RTTs, if microsecond level accuracy is required, external tools should be invoked using the System runner, or the times may be interpreted from pcaps instead. In the future, either the traffic generation, or the entire node itself, may be rewritten in another language.