-
Notifications
You must be signed in to change notification settings - Fork 5
/
Copy pathp2679.tex
186 lines (131 loc) · 14.5 KB
/
p2679.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
\input{wg21common}
\usepackage{listings}
\lstset{
escapeinside={(*@}{@*)}, % if you want to add LaTeX within your code
}
\newcommand{\forceindent}{\parindent=1em\indent\parindent=0pt\relax} % For indenting a paragraph containing code that can't be laid out as a {codeblock} because it also contains \emph
\begin{document}
\title{Fixing \mbox{\tcode{std::start_lifetime_as}} and \mbox{\tcode{std::start_lifetime_as_array}}}
\author{
Timur Doumler \small(\href{mailto:[email protected]}{[email protected]}) \\
Arthur O'Dwyer \small(\href{mailto:[email protected]}{[email protected]}) \\
Richard Smith \small(\href{mailto:[email protected]}{[email protected]}) \\
Alisdair Meredith \small(\href{mailto:[email protected]}{[email protected]}) \\
Robert Leahy \small(\href{mailto:[email protected]}{[email protected]})
}
\date{}
\maketitle
\begin{tabular}{ll}
Document \#: & P2679R2 \\
Date: & 2023-02-08\\
Project: & Programming Language C++ \\
Audience: & Core Working Group, Library Working Group
\end{tabular}
\begin{abstract}
\mbox{\tcode{std::start_lifetime_as}} and \mbox{\tcode{std::start_lifetime_as_array}}, facilities to explicitly start the lifetime of an object of implicit-lifetime type inside a block of suitably aligned storage, was introduced in \cite{P2590R2} and voted into C++23. We propose to fix some remaining issues before C++23 ships. We also discuss possible changes to the API that were considered but rejected.
\end{abstract}
\section{Changes proposed}
\subsection{Allow size \tcode{0} for arrays of unknown bound}
\mbox{\tcode{std::start_lifetime_as}} currently does not work with the value \tcode{0} for its second parameter (the size of the dynamic array to be created). Passing \tcode{0} is undefined behaviour. This API makes it unnecessarily difficult and error-prone for generic code to interact with it. We propose to allow the value \tcode{0}. This proposal resolves NB comment CA-086 targeting C++23. For the size \tcode{0} case, the pointer returned can only be used in the manner in which pointers past the end of an array can be used; in particular, the pointer cannot be dereferenced. In generic code, this restriction is typically fine, and allowing size \tcode{0} obviates the need to special-case size \tcode{0}:
\begin{codeblock}
void processData(unsigned char* dataFromNetwork, size_t numObjectsToRead) {
Data* data = std::start_lifetime_as_array<Data>(dataFromNetwork, numObjectsToRead);
for (size_t i = 0; i < numObjectsToRead; ++i)
processObject(data[i]);
}
\end{codeblock}
For the size \tcode{0} case, we further propose to allow \tcode{nullptr} as the value of the first parameter, as no storage is actually needed in this case:
\begin{codeblock}
Data* data = std::start_lifetime_as_array<Data>(nullptr, 0); // OK
assert(data == nullptr); // holds
\end{codeblock}
\subsection{Make using wrong overload set ill-formed}
\mbox{\tcode{std::start_lifetime_as}} creates objects of non-array type and arrays of known bound (static arrays); \mbox{\tcode{std::start_lifetime_as_array}} creates arrays of unknown bound (dynamic arrays). Using the wrong interface with the wrong type should be ill-formed:
\begin{codeblock}
unsigned char* buf = /* ... */
auto* p1 = std::start_lifetime_as<int[]>(buf); // UB, should be ill-formed
\end{codeblock}
With the current wording, the code above is undefined behaviour as per [res.on.functions] p2.5. The \emph{Mandates} clause needs to be adjusted to make this code ill-formed instead.
\section{Changes considered (not proposed)}
In the previous revision \cite{P2679R0} of this paper, and in NB comment GB-087, we had proposed further changes to the API of \mbox{\tcode{std::start_lifetime_as}} and \mbox{\tcode{std::start_lifetime_as_array}}. We felt that the API was inconsistent with other APIs in the standard library that create objects and accept both array and non-array types, such as \tcode{std::make_shared} and \tcode{std::make_unique}. These have a version for non-array types, a version for array types of known bound, and a version for array types of unknown bound, respectively, all with the same name, whereas here, \mbox{\tcode{std::start_lifetime_as}} handles non-arrays and static arrays, and a version with a \emph{different name}, \mbox{\tcode{std::start_lifetime_as_array}}, handles dynamic arrays. In other words, static arrays are handled with a function that does \emph{not} have the \tcode{_array} suffix in the name.
Further, the template parameters seem inconsistent between the two overload sets. On the one hand, for \mbox{\tcode{std::start_lifetime_as}}, when used with an array type \tcode{U[N]} of known bound, the template argument that the user needs to provide is the type \tcode{U[N]} of the object being created (for example, \tcode{std::start_lifetime_as<int[16]>}). On the other hand, for \mbox{\tcode{std::start_lifetime_as_array}}, the template argument is not the type \tcode{U[]} of the object being created, but instead the type of its elements \tcode{U}.
Finally, the overloads for arrays of \emph{unknown} bound (the ones with the suffix \tcode{_array}) return a pointer to the first element of the array, while the overloads without the suffix \tcode{_array}, when used with an array type of known bound, return a pointer to the array itself. In other words, a call to \tcode{std::start_lifetime_as_array<int>(p, 16)} will return an \tcode{int*}, but at the same time a call to \tcode{std::start_lifetime_as<int[16]>(p)} will return an \tcode{int(*)[16]}.
We therefore proposed in \cite{P2679R0} to redesign this API such that all overloads have the same name \mbox{\tcode{std::start_lifetime_as}} (dropping the \tcode{_array} suffix for dynamic arrays), the template argument is always the type of the object created, and the return type is either a pointer to the object created (for non-arrays) or a pointer to the first element of the array (for arrays):
\begin{codeblock}
unsigned char* buf = /* ... */
// Proposed in P2679R0:
int* p1 = std::start_lifetime_as<int>(buf);
int* p2 = std::start_lifetime_as<int[10]>(buf);
int* p3 = std::start_lifetime_as<int[]>(buf, 10);
\end{codeblock}
LEWG has pointed out that returning a pointer to the first element would mean that for arrays there will be no way to obtain a pointer to the array itself, since arrays are not pointer-interconvertible with their first element ([basic.compound] p4). In the case of a static array, this design would mean that the compile-time information about the size of the array, which is embedded in the type, would be lost. LEWG asked to revise the paper such that for arrays, \mbox{\tcode{std::start_lifetime_as}} returns a pointer to the created array. For static arrays, this design retains the possibility to pass the created array by reference, preserving the size information in the type:
\pagebreak % MANUAL
\begin{codeblock}
void processBlock(Data (&block)[8]);
void doStuff() {
// ...
processBlock(*std::start_lifetime_as<Data[8]>(dataFromNetwork));
}
\end{codeblock}
This design would also have made the overload set even more consistent, because then, for all possible types, \mbox{\tcode{std::start_lifetime_as}} would always return a pointer to the created object.
This design then went to CWG. They pointed out that while returning a pointer to a \emph{static} array is useful for use cases like the above, we should never be returning a pointer to a \emph{dynamic} array from a standard library function. Every existing standard library API that creates a dynamic array (\emph{new-expression}, \tcode{operator new}, \tcode{std::make_shared}, \tcode{std::make_unique}, etc.) only ever returns a pointer to the first element, with no way to obtain a pointer to the whole array. Making returning a pointer to the array work in the dynamic array case would require major heroics in the wording, particularly for the size \tcode{0} case, as we do not have a well-defined concept of zero-length arrays (or pointers to them) in the language today. Notably, a dynamic array \tcode{T[]} is an incomplete type, so you cannot ever create an object of such a type. But you can also not create an object of type \tcode{T[n]}, because for dynamic arrays, \tcode{n} has a runtime value unknown to the type system. Therefore, the type of the object actually created falls outside of the C++ type system. We currently get around this problem in the wording by not specifying the type of the object created, but only describing the type in quotes (``array of \tcode{n} \tcode{T}'') and never giving the user any way to actually refer to such an object (such as through a pointer or a reference).
Unlike for static arrays, a pointer to a dynamic array is also not particularly useful, as it does not contain any information about the size in its type. The only thing you could ever do with such a pointer is to dereference it and let it decay to a pointer to the first element. It is therefore more user-friendly to just give them a pointer to the first element.
This line of reasoning made us realise that \mbox{\tcode{std::start_lifetime_as}} (for non-arrays and for static arrays) and \mbox{\tcode{std::start_lifetime_as_array}} (for dynamic arrays) are two fundamentally different APIs. In the former, you specify the type of the object to be created, and get a pointer to this object; in the latter, you specify the element type and the desired array size, and get a pointer to the first element of the created array. Both APIs do the correct thing for the types they are meant for. They are also sufficiently different that they should have a different name, which justifies the \tcode{_array} suffix for the dynamic array version. The API currently in the C++23 draft is therefore the correct one (with the additions proposed in this paper).
For maximum clarity, the overload set for dynamic arrays should perhaps have been named \mbox{\tcode{std::start_lifetime_as_dynamic_array}} instead, but this name is uncomfortably long. LEWG argued further that this API is a low-level facility targeted at expert users who would be able to figure out the difference, particularly with the (still proposed) change to the \emph{Mandates} clause to make usage of either API with the wrong kind of array type ill-formed. In the end, LEWG decided to leave the basic design of the API unchanged and reject NB comment GB-087.
Each version still needs four overloads: for non-\tcode{const}, \tcode{const}, \tcode{volatile}, and \tcode{const volatile} pointers, respectively. In an unrelated earlier design consideration, we looked at how we could reduce the amount of overloads. We considered an alternative approach for this API: have only one overload per version (instead of four), but restrict the parameter to \tcode{std::is_void}. However, that approach would forbid conversions, rendering the API difficult to use: in practice, the argument is rarely a \tcode{void*}, but typically a pointer to storage such as an \tcode{unsigned char*} or a \tcode{std::byte*}. We therefore rejected such an approach as unviable.
\pagebreak % MANUAL
\section{Proposed wording}
\label{sec:wording}
The proposed changes are relative to the C++ working paper \cite{N4917}.
Modify [obj.lifetime] as follows:
\textbf{Explicit lifetime management \hspace{83mm}[obj.lifetime]}
\begin{codeblock}
template<class T>
T* start_lifetime_as(void* p) noexcept;
template<class T>
const T* start_lifetime_as(const void* p) noexcept;
template<class T>
volatile T* start_lifetime_as(volatile void* p) noexcept;
template<class T>
const volatile T* start_lifetime_as(const volatile void* p) noexcept;
\end{codeblock}
\begin{adjustwidth}{0.5cm}{0.5cm}
\emph{Mandates:} \tcode{T} is an implicit-lifetime type\added{ and not an incomplete type}.
\emph{Preconditions:} [\tcode{(char*)p}, \tcode{(char*)p + sizeof(T)}) denotes a region of allocated storage that is a subset of the region of storage reachable through ([basic.compound]) \tcode{p} and suitably aligned for the type \tcode{T}.
\emph{Effects:} Implicitly creates objects ([intro.object]) within the denoted region as follows: an object $a$ of type \tcode{T}, whose address is \tcode{p}, and objects nested within $a$. The object representation of $a$ is the contents of the storage prior to the call to \tcode{start_lifetime_as}. The value of each created object $o$ of trivially-copyable type \tcode{U} is determined in the same manner as for a call to \tcode{bit_cast<U>(E)} ([bit.cast]), where \tcode{E} is an lvalue of type \tcode{U} denoting $o$, except that the storage is not accessed. The value of any other created object is unspecified. \begin{note}The unspecified value can be indeterminate.\end{note}
\emph{Returns:} A pointer to $a$.
\end{adjustwidth}
\begin{codeblock}
template<class T>
T* start_lifetime_as_array(void* p, size_t n) noexcept;
template<class T>
const T* start_lifetime_as_array(const void* p, size_t n) noexcept;
template<class T>
volatile T* start_lifetime_as_array(volatile void* p, size_t n) noexcept;
template<class T>
const volatile T* start_lifetime_as_array(const volatile void* p, size_t n) noexcept;
\end{codeblock}
\begin{adjustwidth}{0.5cm}{0.5cm}
\added{\emph{Mandates:} \tcode{T} is a complete type.}
\emph{Preconditions:} \removed{\tcode{n > 0} is \tcode{true}.}\added{\tcode{p} is suitably aligned for an array of \tcode{T} or is null.
\tcode{n <= size}_\tcode{t(-1)}
\tcode{ / sizeof(T)} is \tcode{true}. If \tcode{n > 0} is \tcode{true}, [\tcode{(char*)p}, \tcode{(char*)p + (n * sizeof(T))}) denotes a region of allocated storage that is a subset of the region of storage reachable through ([basic.compound]) \tcode{p}.}
\emph{Effects:} \added{If \tcode{n > 0} is \tcode{true}, e}\removed{E}quivalent to\removed{:} \removed{\tcode{return *}}\tcode{start}_\tcode{lifetime}_\tcode{as<U>(p)}\removed{\tcode{;}} where \tcode{U} is the type ``array of \tcode{n} \tcode{T}''.\added{ Otherwise, there are no effects.}
\added{\emph{Returns: }A pointer to the first element of the created array, if any; otherwise, a pointer that compares equal to \tcode{p} ([expr.eq]).}
%\added{\emph{Returns:} A pointer to $a$.}
\end{adjustwidth}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\pagebreak
\section*{Document history}
\begin{itemize}
\item \textbf{R0}, 2022-10-15: Initial version.
\item \textbf{R1}, 2022-11-11: Removed API redesign requested in NB comment GB-087 as it was rejected by LEWG; added support for \tcode{n == 0} case resolving NB comment CA-086.
\item \textbf{R2}, 2022-02-08: Minor wording fix.
\end{itemize}
\section*{Acknowledgements}
Many thanks to Daniel Kr\" ugler, Peter Dimov, and Ville Voutilainen for their valuable comments.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\renewcommand{\bibname}{References}
\bibliographystyle{abstract}
\bibliography{ref}
\end{document}