-
Notifications
You must be signed in to change notification settings - Fork 185
Proposed extension changes to improve backward compatibility
One of the benefits of C is that C compilers are available for almost all platforms. You can write portable C code that can run almost anywhere. This is important for open-source code bases.
Because of this, some Checked C users want to compile their Checked C code with exisitng compilers that do not understand Checked C. They will have the benefits of Checked C when using a compiler that understands the extension. Their code will run elsewhere but be less secure.
We propose a set of changes to Checked C to allow the C preprocessor to be used to remove Checked C extensions. You would modify your C files to include a header file that results in Checked C annotations and changes being removed when the compiler does not support Checked C. In this design, we will name the header file optcheckedc.h
. Another name would fine.
The Checked C extension has 3 new pointer types: _Array_ptr
, _Nt_array_ptr
, and
_Ptr
. It also adds checked array types. The pointer types use syntax based on C++ templates
because is is easier to understand.
_Array_ptr<int> p = 0;
declares a pointer to an array of integers where accesses via the pointer are bounds checked.
C has type qualifiers that change the meaning of a type: _Atomic
, const
, volatile
,
and restrict
. This can be applied to a pointer type by following the *
with
a list of qualifiers:
int *p const = ...
We propose adding 3 new qualifiers for pointer types:
_Array _Nt_array _Single
These provide an alternate way of declaring the new pointer types:
int *p _Array = ...
declares an array_ptr to an integer. It is equivalent to
_Array_ptr<int> p = ...
The idea of using type qualifiers this way was first proposed by the Deputy system from Berkeley (Jeremy Condit et al., Dependent Types for Low-Level Programming, ESOP '07)
In the case where a C compiler does not support Checked C, the type qualifiers are
#defined
to an empty string:
#define _Array
#define _Nt_array
#define _Single
It is easy to handle checked array types. They are marked by prefixing the dimension with he keyword _Checked
example int arr _Checked[5]
.
In the header file, when a compiler does not support Checked C, we #define
the keyword _Checked
to the empty string:
#define _Checked
This will also eliminate the _Checked
keyword used for _Checked blocks.
For generic type applications, we will use a macro to wrap the type arguments. It will discard and the type arguments when Checked C is not available:
#define _TyArg(...)
and maps it to the Checked C syntax when Checked C is available
#define _TyArg(...) <__VA_ARGS__>
As an example,
malloc<int>(e)
becomes
malloc _TyArg(int) (e)
We propose a two-part approach for generic function definitions and generic struct definitions.
These bind type variables that can appear in code. When Checked C is not available,
_For_any
can be a macro that discards its arguments:
#define _For_any(...)
The programmer specifies each type variable T
bound by the
_For_any
construct as _TyVar(T)
. When Checked C is not available, _TyVar(T)
can be a macro that is defined as:
#define _TyVar(T) void
As an example, when Checked C is not available,
_For_any(T) f(_TyVar(T) * _Single) { ... }
becomes
f(void *) { ... }
The same approaches can be used with _Itype_for_any
.
Two examples of generic function declarations with bounds-safe interfaces are given below with both the current and the new syntax:
Example 1:
Current syntax:
_Itype_for_any(T) void *malloc(size_t size) : itype(_Array_ptr<T>) byte_count(size);
New syntax:
_IType_for_any(T) void *malloc(size_t size) _IType(_TyVar(T) * _Array) _Byte_count(size);
Example 2:
Current syntax:
_Itype_for_any(T) void *bsearch(const void *key : itype(_Ptr<const T>),
const void *base : itype(_Array_ptr<const T>) byte_count(nmemb * size),
size_t nmemb, size_t size,
int ((*compar)(const void *, const void *)) :
itype(_Ptr<int(_Ptr<const T>, _Ptr<const T>)>)
) : itype(_Ptr<T>);
New syntax:
_IType_for_any(T) void *bsearch(const void *key _IType(_TyVar(T) * _Single const),
const void *base _IType(_TyVar(T) * _Array const) _Byte_count(nmemb * size),
size_t nmemb, size_t size,
int ((*compar)(const void *, const void *))
_IType(int (*)(_TyVar(T) * _Single const, _TyVar(T) * _Single const) _Single)
) _IType(_TyVar(T) * _Single);
- To erase an existential type, we define a macro
#define Exists(T, inner_type) inner_type
. Note that we already have the macro#define _TyArg (T)
. - With the above definitions, the macro expansion of
Exists(T, struct cb_info _TyArg(T))
results instruct cb_info
. - Next, we define a macro
#define _Pack(expr, exist_type, subst_type) expr
- With this definition,
_Pack(info, _Exists(A, struct cb_info _TyArg(A)), T)
expands toinfo
. - We define
#define _Unpack(T)
.
In the example below we show how the existential types are removed when Checked C is not available, using the above macro definitions.
#define _TyArg(T)
#define _Exists(T, inner_type) inner_type
#define _Pack(expr, exist_type, subst_type) expr
#define _For_any(T)
#define _Unpack(T)
#define _TyVar(T) void
#define event_id int
#define MAX_CB 10
struct cb_info _For_any(T) {
_TyVar(T) *data;
void (*cb)(event_id id, _TyVar(T) *data);
};
struct map_entry {
event_id id;
_Exists(T, struct cb_info _TyArg(T)) info;
};
int num_entries = 0;
struct map_entry cb_map[MAX_CB];
_For_any(T) void reg_cb(event_id id, struct cb_info _TyArg(T) info) {
_Exists(A, struct cb_info _TyArg(A)) opaque_info = _Pack(info, _Exists(A, struct cb_info _TyArg(A)), T);
struct map_entry entry;
entry.id = id;
entry.info = opaque_info;
cb_map[num_entries] = entry;
num_entries += 1;
}
void handle(event_id id) {
for (int i = 0; i < num_entries; i++) {
if (cb_map[i].id == id) {
_Unpack (T) struct cb_info _TyArg(T) cb_inf = cb_map[i].info;
cb_inf.cb(id, cb_inf.data);
}
}
}
becomes
struct cb_info {
void *data;
void (*cb)(int id, void *data);
};
struct map_entry {
int id;
struct cb_info info;
};
int num_entries = 0;
struct map_entry cb_map[10];
void reg_cb(int id, struct cb_info info) {
struct cb_info opaque_info = info;
struct map_entry entry;
entry.id = id;
entry.info = opaque_info;
cb_map[num_entries] = entry;
num_entries += 1;
}
void handle(int id) {
for (int i = 0; i < num_entries; i++) {
if (cb_map[i].id == id) {
struct cb_info cb_inf = cb_map[i].info;
cb_inf.cb(id, cb_inf.data);
}
}
}
Checked C has bounds annotations that are added to declarations using :
bounds-exp, where bounds-exp
can be count(e1)
, bounds(e1, e2)
, byte_count(e1)
, or bounds(any)
. The :
lets the parser know
that certain identifiers have a context-sensitive meaning, so that we do not have to use an underscore to prefix
the name.
To avoid the need for the :
, we can introduce four new keywords: _Any
, _Bounds
, _Byte_count
, and _Count
. These
can be the names of bounds expressions. We can then allow a declarator to be followed by a bounds expression, instead
of requiring a :
.
When Checked C is not supported, we can introduce macros that discard the keywords:
#define _Any
#define _Bounds(e1, e2)
#define _Byte_count(e1)
#define _Count(e1)
We can introduce a new keyword _Itype
for itype annotations also.
For dynamic check expressions, when a compiler does not support Checked C, we can have a macro that evaluates the argument but does nothing with it.
#define _Dynamic_check(e1) ((void)(e1))
An alternative is to make use of the assert
macro:
#define _Dynamic_check(e1) ((e1) ? (void)0 : assert(0))
_Dynamic_bounds_cast
and _Assume_bounds_cast
expressions have the form op<T>(e, bounds-expression)
.
The type argument is a problem a problem for non-Checked C compiler. We can introduce variadic macros that
take 2 or 3 arguments. For now, we proposed just using an _M
as a suffix to indicate that this is a macro.
We are open to other suggestions.
_Dynamic_bounds_cast_M(T, e1, ...)
_Assume_bounds_cast_M(T, e1, ...)
When Checked C is supported, they would map to their corresponding syntax forms:
#define _Dymamic_bounds_cast_M(T, e1, ... ) _Dynamic_bounds_cast<T>(e1, __VA_ARGS__)
#define _Assume_bounds_cast_M(T, e1, ...) _Assume_bound_cast<T>(e1, __VA_ARGS__)
When it is not supported, they would become a C-style cast:
#define _Dynamic_bounds_M(T, e1, ...) ((T) e1)
#define _Assume_bounds_cast_M(T, e1, ...) ((T) e1)
For where clauses, we can allow optional parentheses to surround the where clause:
_Where (e1 _And _e2)
When a compiler does not support Checked C, we can have macro that discards the body of the where clause:
#define _Where(e)
Checked C Wiki