\input{wg21common}
\begin{document}
\title{Functions having a narrow contract should not be \tcode{noexcept}}
\author{ Timur Doumler \small(\href{mailto:[email protected]}{[email protected]}) \\
Ed Catmur \small(\href{mailto:[email protected]}{[email protected]}) }
\date{}
\maketitle
\begin{tabular}{ll}
Document \#: & P2831R0 \\
Date: & 2023-05-15 \\
Project: & Programming Language C++ \\
Audience: & Library Evolution Working Group
\end{tabular}
\begin{abstract}
The Lakos Rule is a long-standing design principle in the C++ Standard Library. It stipulates that a function having a narrow contract should not be declared \tcode{noexcept}, even if it is known to not throw when called with valid input. In this paper, we demonstrate why the Lakos Rule is still useful and important today and should not be removed.
\end{abstract}
\section{Introduction}
\label{sec:intro}
C++ functions --- in the C++ Standard Library or in other places --- can have \emph{preconditions}, which are a form of \emph{contract}. A function that has no preconditions on its input (parameter) values or on the state (object state or global state) accessible from it --- i.e., a function that has defined behaviour for any combination of input values and accessible state --- is said to have a \emph{wide contract}. Examples of such functions in the C++ Standard Library are \tcode{std::vector::at} and \tcode{std::vector::size}.
If such a function is required to never throw an exception (or if it is somehow known that it will never throw an exception), it may be declared \tcode{noexcept} (conditionally or unconditionally). This is the case for \tcode{std::vector::size}.
By contrast, a function that has preconditions --- i.e., a function whose behaviour is undefined\footnote{It is sometimes useful to distinguish between \emph{library undefined behaviour} or \emph{soft UB} (violating the preconditions of a function), which might be recoverable if the violation is detected at the time of the call, and \emph{language undefined behaviour} or \emph{hard UB} (hitting core undefined behaviour --- see \cite{P1705R1} --- inside the implementation of the function), which is unrecoverable, although the C++ Standard itself does not make such a distinction.} for some combination of input values and accessible state, which we can call \emph{invalid} --- is said to have a \emph{narrow contract}. Examples of such functions in the C++ Standard Library are \tcode{std::vector::operator[]} and \tcode{std::vector::front}. Invoking the former with an out-of-bounds index or invoking either function on an empty vector will result in undefined behaviour.
A long-standing design principle in the C++ Standard Library has been that a function having a narrow contract should not be declared \tcode{noexcept}, even if it is known to never throw an exception for a \emph{valid} combination of input values and accessible state. When a function having a narrow contract is obliged to not throw, the function should nevertheless \emph{not} be declared \tcode{noexcept} but merely specified as \emph{Throws: nothing}. This design principle allows for highly effective testing strategies that involve throwing exceptions as a way of diagnosing \emph{contract violations} --- i.e., bugs introduced by calling the function with an invalid combination of input values and accessible state (calling the function \emph{out of contract}). This design principle is also known as the \emph{Lakos Rule}.
The Lakos Rule was first proposed in \cite{N3248} and adopted with \cite{N3279}. An updated version of the rule was codified into policy in \cite{P0884R0}. See \cite{O'Dwyer2018} for a more detailed summary.
More recently, \cite{P1656R2} argued that the Lakos Rule should be abandoned as a design principle. According to this paper, functions that are known to never throw an exception for a valid combination of input values and accessible state should always be declared \tcode{noexcept}, regardless of whether they have a wide or a narrow contract. Further, \cite{P2148R0} proposed adopting a new standing document with design guidelines for the evolution of the C++ Standard Library that move away from the Lakos Rule.
This paper makes the case that the Lakos Rule is still useful and important today and must be retained as a design principle for the C++ Standard Library. In section \ref{sec:negativetest}, we compare the various known techniques for negative testing, demonstrating that the Lakos Rule is essential for implementing negative testing effectively. In section \ref{sec:casestudies}, we present case studies from real-world codebases where the Lakos Rule is central to maintaining an effective testing strategy. In section \ref{sec:stdlib}, we argue why the Lakos Rule is not only important in such third-party codebases, but also for the C++ Standard Library itself. In section \ref{sec:noexcept}, we discuss why the urge to excessively use \tcode{noexcept} --- often the reason why C++ developers do not follow the Lakos Rule --- is misguided and what the actual use case for \tcode{noexcept} is. Finally, in section \ref{sec:contracts}, we consider recent developments for standardising a C++ Contracts facility and discuss why the Lakos Rule is still needed if we have such a facility.
\section{Negative testing}
\label{sec:negativetest}
Unit tests are an established engineering practice to ensure software quality and a crucial part of the software test pyramid. Let us consider how we would unit test a function having a narrow contract, such as \tcode{std::vector::front}.
Writing unit tests for cases in which \tcode{front} is being called in contract and therefore has defined behaviour is straightforward. We establish valid combinations of input values and accessible state and test whether the function gives the expected output in each case:
\begin{codeblock}
std::vector<int> v = {1};
REQUIRE(v.front() == 1);
// etc.
\end{codeblock}
Here, \tcode{REQUIRE} is some macro provided by the unit test framework to verify that the given predicate evaluates to \tcode{true}, report success or failure, and continue the execution of the test suite.
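A minimal stand-in for such a macro might look as follows (real frameworks additionally record source locations, count failures, and support custom reporters; this sketch is ours, not taken from any particular framework):
\begin{codeblock}
#include <cstdio>

// Evaluate the predicate, report the result, and keep the suite running.
#define REQUIRE(expr)                                                   \
  do {                                                                  \
    if (expr)                                                           \
      std::printf("PASS: %s\n", #expr);                                 \
    else                                                                \
      std::printf("FAIL: %s (%s:%d)\n", #expr, __FILE__, __LINE__);     \
  } while (false)

int main() {
  REQUIRE(1 + 1 == 2);  // prints "PASS: 1 + 1 == 2"
}
\end{codeblock}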
Now, what happens if we call \tcode{front} out of contract, i.e., on an empty vector? In this case, the behaviour is undefined. Calling \tcode{front} on an empty vector is therefore unconditionally a bug. This specification is necessary to achieve maximum performance, e.g., in a release build, where we cannot afford to check the precondition at run time. In a debug build, however, such a precondition check is possible and is in fact critically important to prevent the introduction of such bugs.
Since C++23 lacks a language-level Contracts facility (see section \ref{sec:contracts}), we need to use a library-based solution to write the precondition check. Typically, this check will be implemented with some kind of assertion macro at the beginning of the function body:
\begin{codeblock}
T& front() {
  ASSERT(!empty());
  // implementation
}
\end{codeblock}
Precondition checks are code and, just like any other code, ought to be tested. We therefore need to write a unit test to ensure that the precondition check has in fact been added. This kind of testing is sometimes called \emph{negative testing}:
\begin{codeblock}
std::vector<int> v; // empty
REQUIRE_ASSERT_FAIL(v.front());
\end{codeblock}
Negative testing is critically important: without a negative test, we cannot be sure that the developer of the \tcode{front} function considered this case and added a check that will alert users of \tcode{front} about out-of-contract calls and prevent them from introducing bugs.
But how do we write such a negative test? How do we implement \tcode{REQUIRE_ASSERT_FAIL} in our testing framework?
Once we hit the \tcode{ASSERT} macro and the contract check fails, continuing to execute the body of the function is no longer meaningful; the code will either crash or exhibit some other form of undefined and potentially harmful behaviour. To continue running our unit test suite, we therefore need a way to exit the function --- other than by returning a value --- at the point where the contract violation occurred and communicate detailed information about the contract violation back to the testing framework. Below we discuss the known strategies to achieve such a controlled function exit.
\subsection{Exception based}
The most natural, portable, and effective way to exit the function without continuing to execute the function body (which would invoke undefined behaviour) is to throw an exception at the point where the contract violation occurred. We can define our \tcode{ASSERT} macro as follows\footnote{At Cradle, we have a slightly more sophisticated definition: when debugging locally, i.e., if a debugger is attached, the \tcode{ASSERT} macro will trigger a breakpoint on contract violation, using utilities like the ones proposed in \cite{P2514R0}; otherwise (that is, when running the test suite on CI or locally but without a debugger attached), it will throw an \tcode{AssertFail} exception as shown here.}:
\begin{codeblock}
#if TEST_ASSERTIONS
#define ASSERT(expr) do { if (!(expr)) throw AssertFail(); } while (false)
#else
#define ASSERT(expr) ((void)0)
// other possible actions: ignore, assume, log and continue, log and terminate
#endif
\end{codeblock}
Then, in \tcode{TEST_ASSERTIONS} mode (which will often, but not always, correspond to debug mode), we can define our \tcode{REQUIRE_ASSERT_FAIL} to verify that an exception of type \tcode{AssertFail} has been thrown, report success or failure, and continue the execution of the test suite. This is efficient, portable, and straightforward: every modern C++ testing framework provides a way to check for a thrown exception of a particular type, and if necessary it is easy to write such a check by hand.
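As an illustration, a hand-rolled check might look like the sketch below (the reporting and the \tcode{AssertFail} type are placeholders; the \tcode{ASSERT} definition is repeated so that the example is self-contained):
\begin{codeblock}
#include <cstdio>
#include <vector>

struct AssertFail {};  // thrown by ASSERT on contract violation

#define ASSERT(expr) do { if (!(expr)) throw AssertFail(); } while (false)

// Verify that evaluating expr throws AssertFail; report and continue.
#define REQUIRE_ASSERT_FAIL(expr)                                       \
  do {                                                                  \
    try {                                                               \
      (void)(expr);                                                     \
      std::printf("FAIL: no assertion fired: %s\n", #expr);             \
    } catch (const AssertFail&) {                                       \
      std::printf("PASS: %s\n", #expr);                                 \
    }                                                                   \
  } while (false)

int front(const std::vector<int>& v) {
  ASSERT(!v.empty());  // precondition: v must not be empty
  return v[0];
}

int main() {
  std::vector<int> empty;
  REQUIRE_ASSERT_FAIL(front(empty));  // prints "PASS: front(empty)"
}
\end{codeblock}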
Another important advantage of exception-based negative testing is that we can communicate an arbitrary amount of information about the contract violation back to the testing framework via the thrown exception object. \cite{P1656R2} repeats the canard that stack unwinding destroys information. This claim might be true in the na\"ive case, but any sophisticated implementation will collect the relevant information before stack unwinding, either immediately before throwing the exception or (for more general benefit) at the end of the search phase, under the control of the catch block but before stack unwinding begins.
The only issue with exception-based negative testing as described above is that it no longer works if the function under test is declared \tcode{noexcept}. Throwing an \tcode{AssertFail} out of a \tcode{noexcept} function would immediately result in \tcode{std::terminate}, bringing down the whole test suite.
\subsubsection{Following the Lakos Rule}
The obvious way to solve the \tcode{noexcept} problem is to not declare a function with a narrow contract \tcode{noexcept}, even if we know that the function will never throw when called in contract. In other words, exception-based negative testing is straightforward if we just follow the Lakos Rule.
If the Standards committee abandons the Lakos Rule as a design principle (as proposed in \cite{P1656R2} and \cite{P2148R0}), functions such as \mbox{\tcode{std::vector::front}} might be specified as \tcode{noexcept} in a future standard. This new direction would make writing negative tests (and, therefore, preventing bugs from being introduced because of out-of-contract calls and missing contract checks) much harder. In the remainder of this section, we discuss various workarounds and their shortcomings compared to the straightforward exception-based technique that the Lakos Rule enables.
\subsubsection{Conditional \tcode{noexcept} macro}
\label{subsubsec:conditional}
A workaround used by some libraries is to introduce a macro along the lines of
\begin{codeblock}
#if TEST_ASSERTIONS
#define MY_NOEXCEPT
#else
#define MY_NOEXCEPT noexcept
#endif
\end{codeblock}
Then, we can annotate all functions having a narrow contract with \tcode{MY_NOEXCEPT} instead of \tcode{noexcept} proper. Thus functions having a narrow contract can be \tcode{noexcept} in production, and at the same time, we can use exception-based negative testing on them when compiled in \mbox{\tcode{TEST_ASSERTIONS}} mode.
This option, however, is unsatisfactory because we effectively end up unit testing not our actual code but code compiled with a different specification, which may result in different behaviour: switching the \tcode{noexcept} specification of a function depending on the build mode can trigger different code paths being taken. This is observable by users (for example, turning moves into copies) and causes confusion. Software engineering best practice rightly demands that we test the actual code that is built for production, which is not possible with this technique. That is why libc++ ultimately decided against this approach after having introduced it; see section \ref{subsec:major}. See also \cite{P2834R0}, which explains from first principles why this approach is such a bad idea.
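The observable difference is easy to demonstrate. \tcode{std::vector} reallocation uses \tcode{std::move_if_noexcept}, which moves elements only when their move constructor is \tcode{noexcept} and copies them otherwise, so with a conditional macro the same element type is moved in one build mode and copied in the other (\tcode{MY_NOEXCEPT} and \tcode{Widget} are illustrative names):
\begin{codeblock}
#include <type_traits>
#include <utility>

#if TEST_ASSERTIONS
  #define MY_NOEXCEPT
#else
  #define MY_NOEXCEPT noexcept
#endif

struct Widget {
  Widget() = default;
  Widget(Widget&&) MY_NOEXCEPT {}
  Widget(const Widget&) {}
};

int main() {
  Widget w;
#if TEST_ASSERTIONS
  // Under test, move_if_noexcept falls back to the copy constructor:
  // vector growth copies elements.
  static_assert(std::is_same_v<decltype(std::move_if_noexcept(w)),
                               const Widget&>);
#else
  // In production, vector growth moves elements.
  static_assert(std::is_same_v<decltype(std::move_if_noexcept(w)),
                               Widget&&>);
#endif
}
\end{codeblock}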
\subsection{\tcode{setjmp} and \tcode{longjmp}}
Another way to exit the function from our \tcode{ASSERT} macro on contract violation is to use \tcode{setjmp} and \tcode{longjmp}. However, this technique does not work for negative testing. With most compilers,\footnote{Notably, Microsoft's implementation of \tcode{setjmp} and \tcode{longjmp} \emph{does} perform stack unwinding with local object destruction, as is done for \tcode{throw} and \tcode{catch} (see \cite{MSVCDocLongjmp}), while GCC and Clang do not.} when using \tcode{setjmp} and \tcode{longjmp} instead of \tcode{throw} and \tcode{catch}, the stack is not unwound and destructors of objects on the stack are not called. The C++ Standard specifies in [csetjmp.syn]:
\begin{adjustwidth}{0.5cm}{0.5cm}
The contents of the header \tcode{<csetjmp>} are the same as the C standard library header \tcode{<setjmp.h>}.
The function signature \tcode{longjmp(jmp_buf jbuf, int val)} has more restricted behavior in this document. A \tcode{setjmp}/\tcode{longjmp} call pair has undefined behavior if replacing the \tcode{setjmp} and \tcode{longjmp} by \tcode{catch} and \tcode{throw} would invoke any nontrivial destructors for any objects with automatic storage duration.
\end{adjustwidth}
The specification above means that in practice, we will immediately run into undefined behaviour when performing negative testing of any C++ code involving objects having nontrivial destructors. Most real-world C++ code calls such destructors. But even if the behaviour were defined, if we run thousands of unit tests involving data structures that allocate significant amounts of memory on the heap, we end up with an unacceptable number of memory leaks (and memory usage is often an integral part of thorough unit testing). We also break the program logic in the presence of other resources that rely on RAII, such as \tcode{std::lock_guard}. For all these reasons, this approach is not viable.
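The problem can be seen in a few lines. This sketch deliberately exhibits the hazard being described: on typical implementations the \tcode{std::string} below is leaked, and per the wording above the behaviour is formally undefined.
\begin{codeblock}
#include <csetjmp>
#include <cstdio>
#include <string>

static std::jmp_buf jumpBack;

void underTest() {
  std::string s(1000, 'x');   // object with a nontrivial destructor
  std::longjmp(jumpBack, 1);  // simulate ASSERT jumping out on violation
  // s's destructor never runs: the allocation leaks, and because a
  // throw/catch pair here would invoke it, this is undefined behaviour.
}

int main() {
  if (setjmp(jumpBack) == 0) {
    underTest();
    std::puts("unreachable");  // never reached
  } else {
    std::puts("back in the test driver; destructors were skipped");
  }
}
\end{codeblock}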
\subsection{Child threads}
Another strategy for negative tests that does not involve throwing exceptions or performing death tests is to invoke the function under test in a child thread. On contract violation, the \tcode{ASSERT} macro can save some information about the violation and then block the thread (by putting it to sleep indefinitely, or perhaps spinning in an infinite loop). \tcode{REQUIRE_ASSERT_FAIL} can then verify that this has happened.
This approach is slightly more comprehensible than \tcode{setjmp} and \tcode{longjmp} and does not suffer from the undefined behaviour issue but still has all the other drawbacks of \tcode{setjmp} and \tcode{longjmp}, such as leaking memory (invalidating unit tests that track such leakage) and breaking any program logic relying on RAII. This approach also leaks one thread for every test case.
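A sketch of this technique follows (names are ours; a real framework would also record details of the violation and time out when no violation occurs):
\begin{codeblock}
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>

std::atomic<bool> violationSeen{false};

// On contract violation, record it and park the thread forever.
#define ASSERT(expr)                                                  \
  do {                                                                \
    if (!(expr)) {                                                    \
      violationSeen = true;                                           \
      for (;;) std::this_thread::sleep_for(std::chrono::hours(24));   \
    }                                                                 \
  } while (false)

void underTest(int x) { ASSERT(x >= 0); }

int main() {
  std::thread t([] { underTest(-1); });
  t.detach();               // the thread can never be joined: it is leaked
  while (!violationSeen)    // REQUIRE_ASSERT_FAIL: wait for the record
    std::this_thread::yield();
  std::puts("violation detected; one thread leaked");
}
\end{codeblock}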
\subsection{Stackful coroutines}
Ville Voutilainen recently suggested yet another approach for negative testing without throwing exceptions or performing death tests. While this approach is in many ways a thought experiment, and we are not aware of any testing framework or codebase successfully using this approach, it has been successfully prototyped\footnote{For a prototype implementation in standard C++ that can be experimented with on Compiler Explorer, see \url{https://godbolt.org/z/obsfvzrqh}.}.
The idea is that on contract violation, the \tcode{ASSERT} macro would yield to a cooperative scheduler. The tests would have to be written in such a way that the run of a subsequent test would be triggered by the event loop in conjunction with triggering the previous test's result verification. A failed test will run a nested event loop (which is the scheduler yield) and get stuck there, without proceeding to run the body of the function called out of contract. A successful test will just run its code and then call the nested event loop. In both the failed and the successful cases, the event loop will subsequently run the test result verification of the test, and then the event loop will run the next test. The sequence of tests thus becomes a sequence of recursive calls, and each failed test behaves effectively like a suspended stackful coroutine.
Just like the \tcode{setjmp}/\tcode{longjmp} and child thread approaches, this approach would never call destructors of any parameters or objects created by the test call, therefore leaking memory (invalidating unit tests that track such leakage) and breaking any program logic relying on RAII. In addition, the call stack would keep growing with every test call, consuming a large and unbounded amount of stack space. Finally, the whole test suite would have to be arranged in a very particular way in order to make this technique work, and would require a testing framework that offers the required event loop machinery.
\subsection{Signals}
Signals have been suggested as another way to exit the function from our \tcode{ASSERT} macro. However, signals do not help us here either. First of all, although synchronous signals are available on POSIX platforms, they are not available on Windows and are therefore not viable for cross-platform development. More importantly, if, on contract violation, we raise a signal in \tcode{ASSERT} and then install a custom signal handler to handle it, we can do only one of two things at the end of such a signal handler: either return control back to the function that raised the signal, or terminate the program. Using signals is therefore no different from using any other callback-based approach (see above).
\subsection{Death tests}
\label{subsec:deathtests}
If we cannot continue executing the body of the function under test but have no practical way to exit the function other than by terminating the entire process, the only remaining option for negative testing is to implement it as a \emph{death test}. In a death test, the code under test is run in a separate process. A contract violation in the \tcode{ASSERT} macro leads to termination of this process with some error message. \tcode{REQUIRE_ASSERT_FAIL} verifies that the process has been terminated and that an error message has been triggered. In principle, this approach works, but several drawbacks make it a nonviable solution for many codebases.
We are aware of three ways to implement death tests: fork based, clone based, and spawn based.
\subsubsection{Fork based}
In a fork-based death test, each negative test is run in a forked process. This kind of test works reasonably well on platforms having a fast, reliable \tcode{fork()}. In practice, use of this fork-based approach limits us to UNIX-like platforms, such as Linux and macOS. Fork-based death tests can therefore be a viable strategy if your C++ library targets only these platforms.
When targeting Windows, embedded platforms, or the browser, this approach either does not scale due to a much higher runtime overhead or is outright impossible due to lack of multiprocess support: this is a major reason why most C++ unit test frameworks do not support death tests. Of the five most popular C++ unit test frameworks, only GoogleTest supports death tests, while Catch2, Boost.Test, CppTest, and DocTest do not.
Another drawback is that even on platforms where death tests can be implemented efficiently, they can carry only a small amount of information about the contract violation; by using \tcode{std::_Exit} instead of \tcode{std::abort}, one can communicate up to 8 bits of information. This amount of diagnostic information is very meagre compared to the unlimited amount of information (such as the source location and, in advanced usage, the values of operands) available to be carried on an exception from a failed assert handler. Some more information can be carried through standard streams, but this approach is fragile and requires the rigmarole of serialisation and deserialisation.
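For illustration, a minimal fork-based death test on POSIX might look as follows; note that the only information surviving the child is its 8-bit exit status (all names here are illustrative):
\begin{codeblock}
#include <cassert>
#include <cstdlib>
#include <sys/wait.h>
#include <unistd.h>

// On contract violation, terminate the child with a fixed exit code.
// std::_Exit skips cleanup that could hang or distort a forked child.
#define ASSERT(expr) do { if (!(expr)) std::_Exit(42); } while (false)

void underTest(int x) { ASSERT(x >= 0); }

// Run fn(arg) in a forked child; report whether it died with code 42.
bool diesOnContractViolation(void (*fn)(int), int arg) {
  pid_t pid = fork();
  if (pid == 0) {            // child: run the test, then exit cleanly
    fn(arg);
    std::_Exit(0);           // no contract violation detected
  }
  int status = 0;
  waitpid(pid, &status, 0);  // parent: inspect the 8-bit exit status
  return WIFEXITED(status) && WEXITSTATUS(status) == 42;
}

int main() {
  assert(diesOnContractViolation(underTest, -1));  // violation detected
  assert(!diesOnContractViolation(underTest, 1));  // in-contract call survives
}
\end{codeblock}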
\subsubsection{Clone based}
On Linux, \tcode{clone()} can be used instead of \tcode{fork()}. This approach has the advantage that \tcode{clone()} is less likely than \tcode{fork} to cause the child to hang when the parent process has multiple threads (see \cite{GTestDocDeathTests}). However, \tcode{clone()} is even less portable than a fork-based death test, since it works only on Linux.
\subsubsection{Spawn based}
A different flavour of death test that does not depend on \tcode{fork()} or \tcode{clone()} is a spawn-based death test, where the testing framework spawns a new process for each negative test. But spawn-based death tests have several drawbacks compared to fork-based and clone-based death tests: typically, they require adoption of an external, usually non-C++, testing framework (DejaGnu, lit, CTest, make); they require moving test code into other source files, making it more difficult to track; and they require building the state for each test from scratch. On the other hand, fork-based death tests (and exception-based negative tests) can build up and reuse state. All this makes spawn-based negative tests orders of magnitude more cumbersome to write and the adoption of such tests much less likely, leading to worse software quality.
Like fork-based death tests on non-UNIX-like platforms, spawn-based death tests also suffer from a very high performance overhead. A mid-sized test suite may have several thousand negative tests. The overhead of spawning that many processes, even on platforms where that is relatively fast, is enough to turn a test suite that runs in under a second into one that takes minutes. That performance degradation alone precludes test-on-save, red-green-refactor, and other modern development processes.
\section{Case studies}
\label{sec:casestudies}
Well-known codebases that use exception-based negative testing, which in turn relies on the Lakos Rule as a design principle, are Bloomberg's BDE libraries. However, Bloomberg is by no means the only company relying on this strategy. In fact, both authors of this paper work at companies that are entirely unrelated to Bloomberg and whose codebases make extensive use of exception-based negative testing, rely on the Lakos Rule, and would be unable to effectively test their code without it. In this section, we discuss our own experience with using the Lakos Rule in practice.
\subsection{Timur Doumler: \emph{Cradle}}
In 2018, I cofounded the music technology company Cradle (\href{https://cradle.app}{\tcode{https://cradle.app}}) and became its CTO. I was in the enviable position of being able to start a brand new codebase from scratch, following the latest and best engineering practices and hiring a new team of developers that shared our vision.
From the start, the core guiding principle for building Cradle's software stack and engineering culture was a strong focus on code quality. One of the principles we introduced to achieve this goal was to aim for very good unit test coverage. For whatever reason, focusing on automated testing in general and unit testing in particular tends to be less common in music production software than in other industries. We learned in practice that, by having a strong culture of unit testing and test-driven development (TDD), we were able to deliver software at a higher quality standard, with far fewer bugs and crashes reported by users.
The parts of our codebase where TDD proved to be particularly effective were the foundational, generic C++ libraries that the rest of the codebase relied upon. In particular, testing our code for contract violations (i.e., negative testing) has proven to be an important part of keeping our code quality high and reducing the number of newly introduced bugs.
As we started practicing negative testing, however, we immediately ran into the problems discussed in section \ref{sec:negativetest} above. We experimented with death tests (which our chosen unit testing framework didn't offer), POSIX signals, \tcode{setjmp} and \tcode{longjmp}, and making \tcode{noexcept} conditional on whether we are in unit test mode. We found that exception-based negative testing, when combined with the Lakos Rule as a library design principle, is the most straightforward and effective method for our use case (i.e., C++ libraries for cross-platform audio software that should run --- and therefore be tested on --- macOS, Linux, and Windows). All alternative approaches we explored had worse tradeoffs and were ultimately nonviable for our use case.
While researching this topic, I asked C++ developers from other companies, including the maintainer of the unit testing framework we were using at the time, about negative testing. According to many of them, negative testing was ``not a thing'', ``outside of the realm of unit testing'', and so on. I found this attitude surprising, as I had proof from my own experience that negative testing can be very effective at preventing real bugs. The only explanation I can think of is that, with the Lakos Rule not being as widely used outside of the C++ Standard Library, many C++ developers have been taught to sprinkle \tcode{noexcept} all over their codebase (see also section \ref{sec:noexcept}), which makes negative testing very difficult, slow, and cumbersome. This abuse of the \tcode{noexcept} specifier, in turn, means that many developers never get to discover the benefits of practicing negative testing and thus are unaware of them. Consider also that many C++ developers work in smaller companies or startups that do not have the resources to develop their own unit testing frameworks (and ideally should not have to).
\subsection{Ed Catmur: \emph{Maven}}
At Maven Securities (\href{https://www.mavensecurities.com/}{\tcode{https://www.mavensecurities.com/}}), we use C++ to develop in-house software for trading on financial markets. The codebase has always been written to a high level of quality, but as the company has grown and broadened geographically, testing has become ever more crucial to maintaining a low defect rate while enabling programmers from a wide diversity of backgrounds to contribute to shared libraries in a spirit of open collaboration.
The unique requirements of the finance industry often require us to write specialised versions of Standard components (such as containers, having fine-tuned performance, latency, or memory characteristics) yet retain API compatibility (as closely as possible) with Standard and open-source libraries. This approach allows us to perform drop-in replacement of our code such that it stays readily comprehensible to coworkers, other teams, and new hires.
While negative testing is particularly prevalent in our foundational libraries, we also find it useful in higher-level components. In our line of business, ensuring that bugs in market-facing code can be quickly detected and addressed during development is essential. Should defects reach production, we must also guarantee that the behaviour in the presence of defects is predictable and fail-safe and that diagnostics resulting from failure are genuinely useful to front-line support.
In particular, we approach \tcode{noexcept} in a spirit of wariness; while it has some algorithmic performance benefits in theory, the most performance-sensitive code is inlined and allocation-free and thus is highly unlikely to benefit in practice (see also section \ref{sec:noexcept}). On the other hand, the potential for \tcode{noexcept} to convert an exception to program termination makes widespread use of \tcode{noexcept} highly unsafe; a trading program that encounters a fault, throws an exception out to the main I/O loop, and shuts down safely is much preferred to a process that terminates immediately. Such abrupt termination potentially leaves connections in an open state and open orders on the exchange, along with exposure to financial hazard and regulatory penalties. In this context, the Lakos Rule feels entirely natural: functions having narrow contracts are either inlined, in which case \tcode{noexcept} is largely irrelevant, or they are not, in which case --- even if exception-free initially --- the code is unlikely to stay that way through development.
Although we use death tests where unavoidable, we find the overhead (roughly 1000-fold for fork-wait on Linux compared to throw-catch) to be a considerable impediment to achieving the code coverage and the rapid test-develop cycle to which we aspire. Additionally, the impracticality of fork-wait on Windows means that code using such tests lacks full coverage across the compilers and platforms we target.
In my experience, developers arriving at Maven and encountering our codebase for the first time are at least appreciative of the low defect rate that negative testing allows us to achieve and usually, whether from a background in the finance industry or from outside, keenly adopt the tooling that our framework provides for negative testing. To me, this anecdotal evidence indicates that the Lakos Rule is readily comprehensible, at least to developers who have seen the benefits it enables.
\section{Why we need the Lakos Rule in the C++ Standard Library}
\label{sec:stdlib}
Despite the usefulness of the Lakos Rule in real-world codebases, \cite{P1656R2} argues that it should no longer be applied to the specification of the C++ Standard Library itself because existing major implementations of the C++ Standard Library do not actually use exception-based negative testing. This is an unreasonable argument, as we will demonstrate in this section.
\subsection{Major C++ Standard Library implementations}
\label{subsec:major}
Let us consider the three major implementations of the C++ Standard Library: libstdc++, libc++, and the Microsoft STL. libstdc++ and libc++ both use death tests for negative testing, while the Microsoft STL does not appear to negative test narrow contract preconditions at all.
libstdc++ chose to use death tests based on the DejaGnu framework. Its maintainers considered exception-based tests but found that such tests would break backward compatibility.
libc++ initially chose to use exception-based tests for ease of testing and other reasons but ran into the familiar issue that they could not apply this technique to functions declared \tcode{noexcept}. Since they could not remove \tcode{noexcept} due to backward compatibility, they introduced the conditional \tcode{noexcept} macro \tcode{_NOEXCEPT_DEBUG}, as described in section \ref{subsubsec:conditional}. They later found that \tcode{_NOEXCEPT_DEBUG} was a ``horrible decision'' (see \cite{LLVMReviewD59166}) because it was observable to the user and changed the behaviour of the program. Left with no other option, they switched to fork-based death tests, which are much slower and run only on UNIX-like platforms.
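The conditional-\tcode{noexcept} approach, and the problem that made it untenable, can be sketched as follows. The names here are illustrative only; this is not libc++'s actual configuration machinery:

```cpp
// In a debug/testing build, the macro expands to nothing so that a
// throwing assertion can escape the function; in a release build, it
// expands to noexcept. (LIBRARY_DEBUG_MODE is a hypothetical config flag.)
#if defined(LIBRARY_DEBUG_MODE)
#  define CONDITIONAL_NOEXCEPT
#else
#  define CONDITIONAL_NOEXCEPT noexcept
#endif

// Narrow contract: requires size > 0.
int front(const int* data, int size) CONDITIONAL_NOEXCEPT {
    return data[0];
}

// The observable difference that made this approach a "horrible decision":
// the result of the noexcept operator -- and hence program behaviour --
// differs between the two build modes.
constexpr bool is_noexcept = noexcept(front(nullptr, 0));
```

Any caller that branches on \tcode{noexcept(front(...))} can thus take a different algorithmic path in the testing build than in production, which defeats the purpose of testing.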
This anecdote does not demonstrate that exception tests are a bad thing but rather that if they are to be used, the library should be designed for their use from the start. The corollary is that if library implementors (especially any other than the three major ones) are restricted to using death tests, as would be the result of \cite{P1656R2}, they would be able to fully test only on UNIX-like platforms (no Windows, no bare metal, no browser). Adopting \cite{P1656R2} would do irreversible damage: if we were to reverse such a decision in the future, drawing any benefit would be difficult since users would have already come to depend on functions having narrow contracts being declared \tcode{noexcept}.
\subsection{Nonmajor and non-Standard implementations}
In addition to the three major implementations, a number of nonmajor implementations as well as quasi-implementations are available: libraries that do not implement the C++ Standard Library in its entirety, but a subset of it, or a superset of a subset. Libraries like Bloomberg's BSL, Electronic Arts' EASTL, NVIDIA's C++ Standard Library, and others fall into this category.
Beyond that, many more C++ libraries do not claim to be ``standard'' libraries but implement drop-in replacements for certain parts of the C++ Standard Library. Often, they differ in implementation to account for industry-specific requirements but follow the Standard API as closely as possible for compatibility. We have such libraries at Cradle, providing alternative implementations of containers, algorithms, allocators, and more; many companies relying on C++ have similar libraries.
Many of these libraries use exception-based testing and rely on the Lakos Rule. If the C++ Standard Library changes its design guideline in this regard, those libraries will have to choose between an API that no longer follows the design of the Standard and moving away from exception-based negative testing. In practice, the latter means either switching to death tests (which, as discussed, introduces a lot more complexity and overhead and, in many cases, is outright impossible) or giving up on negative testing entirely (which significantly reduces test coverage and compromises code quality).
\subsection{\emph{Throws: nothing} vs. \tcode{noexcept} as a design guideline}
Note that the C++ Standard allows implementations to unilaterally tighten \emph{Throws: nothing} to \tcode{noexcept} if they so choose --- and some do so --- and still be conforming. Therefore, abolishing the Lakos Rule in the C++ Standard Library specification would do all the aforementioned damage to users relying on it, while not actually benefitting anyone. If declaring functions having narrow contracts \tcode{noexcept} provides a positive tradeoff for a particular implementation of the C++ Standard Library, it can continue to do so without changing the status quo.
\cite{P1656R2} claims that the difference between specifying \emph{Throws: nothing} in the C++ Standard and specifying \tcode{noexcept} in a particular implementation that chooses to tighten the specification is surprising to users and somehow compromises the design of the C++ Standard. This claim is unfounded. If the difference causes confusion, clarity can and should be provided through consistency, QoI, documentation, and education. The Lakos Rule is well motivated and straightforward to explain and understand. One of several ways to motivate it, ``so we can throw exceptions to test debug-mode asserts'', contains just ten words. We should not compromise the ability to test implementations on diverse platforms --- a real benefit that prevents bugs in production software --- for a perceived cleanliness of design.
Other ways to motivate the Lakos Rule are completely unrelated to negative testing and contract checking. A function having a wide contract that is known to never throw (given that it is implemented correctly) can be declared \tcode{noexcept}. In contrast, the behaviour of a function having a narrow contract is undefined when called out of contract; the C++ Standard does not place any restrictions on the behaviour in this case, including any restriction to not throw. We can therefore conclude that it is not logically sound to declare such a function \tcode{noexcept}: Doing so would implicitly define behaviour beyond the current domain of the function and, hence, that extended behaviour would no longer be undefined (according to the letter of the Standard). The concept of a narrow contract and that of a \tcode{noexcept} function therefore contradict each other (see \cite{P2861R0} for a detailed discussion).
This has direct consequences for software design. It follows that once a function having a narrow contract (for example, \tcode{std::span::operator[]}, which has undefined behaviour when called out of bounds) is declared \tcode{noexcept}, it can never be backward-compatibly extended to having a wide contract (for example, be made bounds-safe in a future version by throwing when called out of bounds) due to the implicitly defined behaviour of the \tcode{noexcept} specifier.
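A minimal illustration of this design consequence, using a hypothetical span-like type rather than \tcode{std::span} itself:

```cpp
#include <cstddef>
#include <stdexcept>

// Version 1: narrow contract, declared noexcept.
// Out-of-range access is undefined behaviour.
struct span_v1 {
    const int* data;
    std::size_t size;
    int operator[](std::size_t i) const noexcept { return data[i]; }
};

// A hypothetical future version that widens the contract to throw on
// out-of-range access. This widening is only backward-compatible because
// operator[] was never noexcept: had it been, the throw would hit the
// noexcept barrier and call std::terminate instead of reaching the caller.
struct span_v2 {
    const int* data;
    std::size_t size;
    int operator[](std::size_t i) const {   // no noexcept: contract widened
        if (i >= size) throw std::out_of_range("span index");
        return data[i];
    }
};

bool throws_on_out_of_range(const span_v2& s, std::size_t i) {
    try { (void)s[i]; } catch (const std::out_of_range&) { return true; }
    return false;
}
```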
The C++ Standard Library should be an example of sound C++ library design. Abandoning the Lakos Rule would go directly against this goal.
\section{When should we use \tcode{noexcept}?}
\label{sec:noexcept}
\subsection{Code size and performance}
The Lakos Rule stipulates that functions having a narrow contract should not be declared \tcode{noexcept}, even if they are known to never throw an exception when called in contract. Part of the resistance to this rule is a widespread practice of declaring as many functions as possible \tcode{noexcept}, often for no good reason.
In some cases, \tcode{noexcept} can measurably reduce the size of the generated binary code. Such a reduction might occur when the compiler cannot otherwise reason about the function not throwing (for example, because its definition is in another translation unit). In particular, when calling a non-\tcode{noexcept} function \tcode{f} from a \tcode{noexcept} function, the compiler has to ensure that \tcode{std::terminate} gets called when an exception gets thrown (and escapes the calling function). In general, that means that instead of \tcode{f()}, the compiler generates
\begin{codeblock}
try { f(); } catch (...) { std::terminate(); }
\end{codeblock}
On the other hand, when calling a \tcode{noexcept} function from another \tcode{noexcept} function, the compiler can emit just the function call (if we ignore inlining). In addition, for a \tcode{noexcept} function, the compiler does not have to generate unwind information because such a function never participates in unwinding.
Older platforms do exist, notably including 32-bit Windows, for which generating unwind information has a runtime cost (see \cite{TR18015} section 5.4, where such platforms are said to use the ``code'' approach). On most platforms however, generating unwind information happens at compile time (the ``table'' approach). This is also known as the \emph{zero-overhead exception model} and has become the de facto standard for essentially all modern 64-bit architectures (see \cite{Mortoray2013}).
On such platforms, the differences in codegen between \tcode{noexcept} and non-\tcode{noexcept} typically lead to no measurable (let alone significant) difference in runtime performance (for a detailed discussion, see ``Unrealizable runtime performance benefits'' within the ``\tcode{noexcept} Specifier'' section of \cite{EMC++S}). In fact, we are unaware of any study showing a measurable speedup in real-world code on any modern platform due to \tcode{noexcept}. Similarly, we are unaware of any study showing that exception-handling codegen imposes any penalty on compiler optimisations (as is sometimes claimed). \cite{Mahaffey2017} and \cite{Dekker2019} even found that \tcode{noexcept} can cause a net performance loss in certain cases. This loss is typically due to code motion across cache lines that can produce noise in either direction; this noise usually far outweighs any other impact of \tcode{noexcept} on performance.
More compact codegen can, of course, be a benefit in itself, even if there is no speedup whatsoever, particularly on embedded platforms where small binary size is an important concern. But on such platforms, C++ is typically compiled with exceptions disabled anyway, which removes any potential benefit from adding \tcode{noexcept} to function declarations.
Note also that in performance-critical code the affected functions in the hot path will typically be inlined. Even if exceptions are enabled, and if there are optimisations that the compiler can perform based on the function not throwing, it will be able to perform these optimisations anyway if the function is inlined, even if it is not declared \tcode{noexcept}.
\subsection{The actual use case for \tcode{noexcept}}
There is one genuine reason to declare a function \tcode{noexcept}: when a C++ program programmatically queries whether a function can throw, using the \tcode{noexcept} operator, and then chooses a different algorithm depending on the return value of that operator.
An example of an algorithm where such a query occurs is \tcode{std::vector::push_back}. Typically, in the presence of \tcode{noexcept}, copies will be turned into more efficient moves, which is both an observable change in behaviour and a measurable difference in performance. This is the original motivation for introducing \tcode{noexcept} in C++11 (see \cite{N2855} and \cite{N3050}), and its introduction is tightly linked to the introduction of move semantics. The functions being queried with the \tcode{noexcept} operator are nearly always copy, move, or swap operations. Hence, we would expect use of \tcode{noexcept} to be limited to copy and move constructors, copy and move assignment operators, and implementations of \tcode{swap}. Of these, only \tcode{swap} has a narrow contract (it requires equal allocators).
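This query mechanism can be demonstrated with \tcode{std::move_if_noexcept}, the standard facility that \tcode{std::vector} implementations typically use during reallocation:

```cpp
#include <type_traits>
#include <utility>

// Two types differing only in whether the move constructor is noexcept.
struct NothrowMovable {
    NothrowMovable() = default;
    NothrowMovable(const NothrowMovable&) = default;
    NothrowMovable(NothrowMovable&&) noexcept = default;
};

struct ThrowingMovable {
    ThrowingMovable() = default;
    ThrowingMovable(const ThrowingMovable&) = default;
    ThrowingMovable(ThrowingMovable&&) {}   // potentially throwing
};

// std::move_if_noexcept yields an rvalue reference (a move) only when
// moving cannot throw (or the type is move-only); otherwise it yields a
// const lvalue reference, falling back to a copy so that vector
// reallocation retains the strong exception guarantee.
static_assert(std::is_same_v<
    decltype(std::move_if_noexcept(std::declval<NothrowMovable&>())),
    NothrowMovable&&>);
static_assert(std::is_same_v<
    decltype(std::move_if_noexcept(std::declval<ThrowingMovable&>())),
    const ThrowingMovable&>);
```

Adding \tcode{noexcept} to \tcode{ThrowingMovable}'s move constructor would flip the second \tcode{static_assert}: an observable, measurable algorithmic change, which is exactly the effect \tcode{noexcept} was designed to enable.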
It follows that \tcode{swap} is the only bona fide exception to the Lakos Rule. We do not see a good reason to deviate from the Lakos Rule in any other cases, even in performance-sensitive code, unless a measurement can prove otherwise. This way, we can continue to enable usage of the effective exception-based negative testing strategy for the vast majority of functions having narrow contracts and require a fallback to death tests or other alternatives \emph{only} in those vanishingly few cases (e.g., copy, move, and swap) where there is a sound engineering reason to declare the function under test \tcode{noexcept}.
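The \tcode{swap} case can be illustrated with the standard \tcode{std::is_nothrow_swappable} trait, which generic code uses (directly or via the \tcode{noexcept} operator) to select an algorithm:

```cpp
#include <type_traits>

struct NothrowSwappable {
    friend void swap(NothrowSwappable&, NothrowSwappable&) noexcept {}
};

struct ThrowingSwappable {
    // Never actually throws, but is potentially throwing as far as the
    // type system is concerned, because it is not declared noexcept.
    friend void swap(ThrowingSwappable&, ThrowingSwappable&) {}
};

// Generic code branching on this trait will take a slower, copy-based
// or rollback-capable path for ThrowingSwappable.
static_assert(std::is_nothrow_swappable_v<NothrowSwappable>);
static_assert(!std::is_nothrow_swappable_v<ThrowingSwappable>);
```

This is why declaring \tcode{swap} \tcode{noexcept} has genuine algorithmic value despite its narrow contract, whereas \tcode{noexcept} on most other narrow-contract functions buys nothing.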
Looking beyond negative testing, we should make sure that exceptions continue to be well supported and optimised by the platforms and libraries we depend on. \tcode{noexcept} tends to be overused, and if exceptions keep hitting arbitrary \tcode{noexcept} barriers, their usability will rapidly erode.
\section{Can Contracts make the Lakos Rule obsolete?}
\label{sec:contracts}
SG21 is currently working on standardising a \emph{Contracts facility} --- i.e., a new language feature to be added to the C++ Standard --- that allows the user to express preconditions, postconditions, and assertions in C++ code. Having a language-based Contracts facility would have many advantages over current library-based approaches such as the \tcode{ASSERT} macro that we used in section \ref{sec:negativetest} above.
Attempts to standardise a Contracts facility have a long history. The design in \cite{P0542R5}, sometimes called ``C++20 Contracts'', almost made it into C++20 but was removed from the working draft at the last minute because of lack of consensus on some aspects of the design. After this failure to standardise Contracts for C++20, SG21 was established and is currently aiming to get a Contracts MVP into C++26. See \cite{P2695R1} for the current SG21 roadmap as well as \cite{P2521R3} and references therein for a summary of the current state of this effort.
The current Contracts MVP proposes two build modes: \emph{No\_eval}, in which the precondition is ignored, and \emph{Eval\_and\_abort}, in which the precondition is checked; if the predicate evaluates to \tcode{false}, \tcode{std::terminate} is called. Note that such an MVP does not yet give us anything useful for the purposes of negative testing. Calling \tcode{std::vector::front} out of contract in \emph{No\_eval} mode is not diagnosable at run time; in \mbox{\emph{Eval\_and\_abort}} mode, an out-of-contract call will result in \tcode{std::terminate} being called, which leaves death tests as the only method to write tests for such a call.
However, the Contracts MVP is a work in progress. SG21 is currently working on adding violation handling to the Contracts MVP. A recent proposal, \cite{P2811R3}, allows the user to install a custom violation handler at link time. Among other things, such a violation handler might be specified to throw an exception. This would give us a standard mechanism to perform exception-based negative testing.
While this would be a great outcome, note that according to all current proposals in this space (see \cite{P2698R0}, \cite{P2811R3}, and \cite{P2834R0}), neither a violation handler nor the contract-checking predicate itself should be allowed to throw through a \tcode{noexcept} boundary. An attempt to do so would call \tcode{std::terminate} as is the case today. On platforms where death tests are nonviable (see section \ref{subsec:deathtests}), the Lakos Rule will therefore still be required to conduct negative testing, even after adding a Contracts facility to the C++ Standard. We should therefore not remove the Lakos Rule as a design guideline for the C++ Standard Library.
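If a throwing violation handler is eventually adopted, exception-based negative testing could look roughly as follows. This is only a sketch of the shape proposed in \cite{P2811R3}; the header name, namespace, and handler signature are taken from that proposal, may well change before any standardisation, and do not compile with any shipping implementation today:

```cpp
// Sketch only: based on the replaceable violation handler proposed in
// P2811R3. None of these names are standard C++ today.
#include <contracts>   // proposed header, not yet available

struct contract_violation_error {};

// User-replaceable handler installed at link time: converts a detected
// contract violation into an exception that a test framework can catch.
void handle_contract_violation(const std::contracts::contract_violation&) {
    throw contract_violation_error{};
    // This reaches the test only if no noexcept barrier sits between the
    // failed check and the catch block -- which is exactly why the Lakos
    // Rule remains necessary even with a standard Contracts facility.
}
```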
\section{Conclusion}
Testing code for contract violations (negative testing) is an important part of keeping code quality high and reducing the number of introduced bugs. This approach is well proven in practice. Out of all implementation strategies for negative testing, we found that exception-based testing in combination with the Lakos Rule is the most straightforward, effective, and portable.
We have considered alternatives that do not require the Lakos Rule, such as a conditional \tcode{noexcept} macro, \tcode{setjmp} and \tcode{longjmp}, using child threads, signals, and three different flavours of death tests. All of them have unfavourable tradeoffs: they either do not scale due to an unacceptable performance overhead, are not implementable on all relevant platforms, or are outright incapable of providing the necessary functionality. In particular, the only alternatives to the Lakos Rule that seem to be somewhat viable are fork-based and clone-based death tests but \emph{only} for UNIX-like platforms (and at reduced efficiency); for other platforms, there are none.
Some C++ Standard Library implementations choose to flout the Lakos Rule and declare nonthrowing functions having narrow contracts \tcode{noexcept}. This practice stems from some combination of having to maintain backward compatibility, not caring about non-UNIX-like platforms (which means they can use death tests instead of exception-based tests, albeit at the price of higher complexity, worse performance, and other tradeoffs), and not caring about testing for contract violations at all. For these implementations, being unable to use exception-based testing is a choice they are free to make: replacing \emph{Throws: nothing} by \tcode{noexcept} is perfectly Standard-conforming, and they can continue to do so without changing the status quo.
Removing the Lakos Rule as a design guideline, however, would preclude the entire C++ community from using exception-based testing for Standard-conforming APIs. This regression would affect not only the major implementations of the C++ Standard Library, but also minor implementations, partial or modified implementations that are industry-specific or platform-specific, and the many non-Standard libraries that implement drop-in replacements with Standard-conforming APIs. Thus, removing the Lakos Rule would irreparably break existing testing strategies or make the affected APIs no longer Standard-conforming, while not providing any practical benefit to anyone. Bloomberg's BDE libraries are one well-known example of a codebase that would be negatively affected but are certainly not the only one: In this paper, we have shown case studies from two separate companies (unrelated to Bloomberg) that would suffer the same fate, and we are aware of others.
If we look beyond negative testing and consider the actual use case for \tcode{noexcept}, we arrive at the conclusion that specifying a function as \emph{Throws: nothing} and declaring it \tcode{noexcept} are conceptually different and serve entirely different purposes (specifying a narrow contract on the one hand and choosing the most efficient algorithm that uses a copy, move, or swap operation on the other hand). More broadly, from a software design perspective, the definition of a narrow contract and that of a \tcode{noexcept} function are fundamentally incompatible (the former specifies that for \emph{some} input, the behaviour is undefined, while the latter specifies that for \emph{all} input, the function is defined to not throw). This causes real issues with library design: declaring a function with a narrow contract \tcode{noexcept} makes it impossible to widen the contract later without breaking backward compatibility. Removing the Lakos Rule would mean abandoning the idea that the C++ Standard Library should follow sound, consistent design principles.
We have also considered the ongoing work toward standardising a C++ Contracts facility. We conclude that Standard Contracts could become a powerful new tool for testing code but does not make the Lakos Rule any less necessary because a language-based Contracts facility will not change the fact that an exception cannot be thrown through a \tcode{noexcept} boundary.
The Lakos Rule is a long-standing design principle of the C++ Standard Library and is well motivated and straightforward to explain and understand. Changing such an established principle requires reaching a high bar of justification. For all the reasons discussed in this paper, this bar for removing the Lakos Rule is clearly unmet. We therefore urge the C++ Standards Committee to maintain the status quo, that is, to retain the Lakos Rule.
%\section*{Document history}
%\begin{itemize}
%\item \textbf{R0}, 2023-03-08: Initial version.
%\item \textbf{R1}, 20XX-XX-XX: ??
%\end{itemize}
\section*{Acknowledgements}
We would like to thank Lori Hughes, John Lakos, and Mungo Gill for thoroughly reviewing this paper and providing useful feedback; Ville Voutilainen for describing his idea of negative testing with stackful coroutines; and Jonathan Wakely for providing additional useful comments.
\renewcommand{\bibname}{References}
\bibliographystyle{abstract}
\bibliography{ref}
\end{document}