-
Notifications
You must be signed in to change notification settings - Fork 207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CFS violates strict aliasing in SB and ES #2601
Comments
Thanks for the feedback. I've investigated the strict aliasing rules and I'm not convinced that anything actually violates them. I can't think of any specific case in the flow where type-punning (via pointer casting) is done. As noted the code makes use of union types but these are all designed to be compatible when overlaid (that is, the union contains a member of the actual type, and the internal data members are accessed through that type). The only thing I can think of here is related to the way that the pools are managed, in that they use Do you get any specific warnings or information from the compiler when you compile with |
Keep in mind the intent of the aliasing rules here. If there were no rules, then any data "write" could change the value of any other data. This means that the optimizer cannot hold the value of some object in e.g. a register - if any assignment can change any value, then it constantly has to actually [re-]read the value from memory every time it needs it. The aliasing rules aim to pose some limitations on that based on type - that is, knowing that only certain types of data writes can change a value, it can rely on cached copies of values (again, likely in a CPU register or similar) and not have to go to main memory as often. Notably -- the compiler/optimizer doesn't necessarily care how you came up with the address of some object in memory. The point is, once you have identified the object as a given type and started using it as that type, that you only access it through pointers of the same (or compatible) type for the lifetime of the object. I don't believe the pools are violating this, because although the pool memory was allocated as a |
I have not. I think currently analysis fails because of the amount of indirection and SB and ES being separate modules. I think the tooling in general is insufficient because this is a very difficult problem to handle.
In CFE_EVS_ProcessCommandPacket, in cfe_evs_dispatch.c:
However, that data is not accessed.
In CFE_EVS_EnableEvntTypeCmd in cfe_evs_task.c:
This is accessing the data in the uint8 array that's stored in an offset in the ES memory pool in SB. This doesn't obey common initial sequence bcasuse it comes from the BufDscPtr, which has SB_Buffer_t as its last member (and must because of the variable data which follows it, which is not described by a type other than the raw byte array).
I think using char is one necessary step, to keep using strict aliasing. I believe the other part is that for SB, buffers and buffer descriptions must be implemented to have separate queues and pools, so the common initial sequence isn't violated... but I will admit, the rules are fuzzy to me here, so that I'm not sure that approach is correct either.
I believe that, but I also believe that the current way type based aliasing analysis is handled by gcc is too aggressive. Linus Torvalds has this position, and it's why the linux, OpenBSD, and FreeBSD kernels are compiled with -fno-strict-aliasing.
I don't think that is correct. I believe the effective type is in fact a uint8 array, and that only pointers acquired via malloc (which itself cannot be implemented in strictly conforming C) may acquire an effective type using that kind of mechanism. |
Minimal example on godbolt where icc is available: https://godbolt.org/z/183oKP483 |
I tend to agree with gcc/clang here. I don't see this as violating the rules, at least in the code that is directly in the example. However, if one were to access the buffer directly (as uint8s) and also access it through the |
gcc actually does complain about it, when called with the flag Clang does not properly implement the strict aliasing warnings despite allowing the flags (https://github.com/llvm/llvm-project/blob/main/clang/test/Misc/warning-flags-tree.c#L16). It should also be noted that clang's user manual claims that -fno-strict-aliasing is the default (https://clang.llvm.org/docs/UsersManual.html), but I think that's probably a case of stale documentation. https://gustedt.wordpress.com/2016/08/17/effective-types-and-aliasing/ A member of WG14 states in this blog post that character arrays must not be reinterpreted as objects of other types. This confirms what I understand as the effective type of the memory pool in ES as being a uint8 array. https://godbolt.org/z/nhbqjjjP8 gcc with -Wstrict-aliasing=2 consistently warns that the pattern of casting the record's header to an arbitrary object type might break strict aliasing rules. It is also possible to blatantly violate aliasing without a warning being issued. https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-malloc-function-attribute Maybe using attribute malloc would be appropriate, since the way buffer descriptors work is analagous to the metadata prefixing pointers returned by malloc? Another approach is that CFE_SB_Buffer_t, or a wrapping type, could be a union of every message type that uses it, and that is the type stored in the memory pool. This would be difficult to maintain but technically correct and, as I understand it, not violating strict aliasing. There is a very significant downside to this approach in that memory use could explode. |
Note that the message generated by gcc is this:
With the operative word here being "might". They are just flagging something that could be wrong, but the analysis to determine if it is actually wrong or not must be done by the programmer, not the compiler. The reality is that all access to the header fields is still done through an object of type An important thing to note is that strict (pedantic?) adherence to some of these rules would result in highly inefficient and/or impossible to maintain code. There is constant friction between standards writers, compiler developers, and software developers to find a happy middle ground between stuff that's possible and stuff that's safe and stuff that's efficient. While one could argue that simply allocating a block of memory as |
I'll preface this by saying that I don't think any compiler today is doing anything to break the expectations of CFS as it is written, and that I doubt it will be a problem for some time. I will also say that I would prefer the standard to be written in such a way that it is unambiguous that static byte arrays may be treated as storage for objects. I am being pedantic because my perception is that compiler developers favor being as pedantic as possible, and sometimes even disregarding the standard in favor of optimizations:
I think this is true in SB and MSG inspecting a message returned by The problem with relying on code doing what has been observed, which violates rules, is that the behavior can be legally changed by future compiler versions. |
I don't think so. Such a change would break every call to e.g. This is actually part of the reason for using C in the first place, is that it gives you low(-ish) level control over how and where your objects appear in memory. If they broke that for whatever reason, they'd have to give the resulting language a new name, like Rust or something. It wouldn't be C anymore. |
Functions like
I appreciate the joke - and I'll take the opportunity to mention that rust is currently a no-go for CFS and flight software in general because of a lack of first class support for platforms that aren't Linux, OSX, or Windows on arches that aren't x86 or ARM. There's also problems about how panics work and some other concerns, but suffice it to say that it's a major engineering effort to make rust amenable to flight software environments. |
GCC and Clang by default enable optimizations for strict
aliasing. This is a problem for CFS users because the way that CFS is
architected violates strict aliasing rules - for example, in SB's use
of ES memory pools. GCC and clang can be tuned with the
-fno-strict-aliasing
flag to prevent this from causing undefinedbehavior.
Data Structures
For the sake of clarity, I will describe 2 data structures used in my
examples below here:
CFE_SB_Buffer_t is a union of CFE_MSG_Message_t and the types with the
largest alignment requirements for the platform. CFE_MSG_Message_t is
a union of CCSDS_SpacePacket_t and a byte array for its serialized
representation.
I think it is pertinent to note that
CFE_SB_Buffer_t Content
is thelast field of CFE_SB_BufferD_t, since that may have bearing on the
common initial sequence rule of strict aliasing.
An example using
CFE_SB_ReceiveBuffer
To see why this data structure is a problem with strict aliasing, I'll
use CFE_SB_ReceiveBuffer as an example:
CFE_SB_ReceiveBuffer takes a pointer to a pointer to CFE_SB_Buffer_t,
BufPtr
, whose value is*BufPtr
, and a queue identifier,PipeId
.CFE_SB_ReceiveBuffer retrieves a buffer from the OSAL queue associated
with
PipeId
, callingOS_QueueGet
withCFE_SB_BufferD_t BufDscPtr
as its
data
parameter.OS_QueueGet
internally usesOS_QueueGet_Impl
- for examle, theposix version in file
osal/src/os/posix/src/os-impl-queues.c
.OS_QueueGet_Impl
callsmq_receive
, usingdata
(CFE_SB_BufferD_t BufDscPtr
). Now we mustexamine whence comes
data
, viaCFE_SB_TransmitMsg
.How Messages are stored in SB
CFE_SB_TransmitMsg
is called with a pointer to a structure whosefirst member is
CFE_MSG_Message_t
- i.e. a struct whose first memberis either
CFE_MSG_CommandHeader_t
orCFE_MSG_TelemetryHeader_t
. Encoded in this data is the total lengthof the message, which extends beyond
CFE_MSG_Message_t
. In order tosend this message,
SB
must acquire a large enough buffer from thememory pool using
CFE_SB_GetBufferFromPool
.CFE_SB_GetBufferFromPool
callsCFE_ES_GetPoolBuf
. The memory inthis pool ultimately derives from a global variable
CFE_SB_Global_t CFE_SB_Global
, in itsCFE_SB_MemParams_t Mem
member,in that member's
CFE_ES_STATIC_POOL_TYPE(CFE_PLATFORM_SB_BUF_MEMORY_BYTES) Partition
member.
CFE_ES_STATIC_POOL_TYPE
is a macro defined incfe_es_api_typedefs.h
which is a union of array ofuint8
andCFE_ES_PoolAlign_t
, which is a union ofvoid*
,long long int
,and
long double
.All this to say that the backing buffer is a byte array. I should at
this point stress that by the standard,
uint8
andchar
are notequivalent - and
unsigned char
andsigned char
are BOTH distinctfrom
char
. This has important ramifications for the particulars ofthe strict aliasing rule.
Once
CFE_SB_TransmitMsg
acquires its backing buffer from the memorypool, it copies the message passed to it into a
BufDscPtr
, and callsCFE_SB_BroadcastBufferToRoute
with thatBufDscPtr
.CFE_SB_BroadcastBufferToRoute
in turn ultimately callsOS_QueuePut
with that data.OS_QueuePut
callsOS_QueuePut_Impl
,which calls
mq_timedsend
- with the memory pool data that is a bytearray.
Violation of Strict Aliasing
CFE_SB_ReceiveBuffer assigns to
*BufPtr
the address ofBufDscPtr->Content.
, which has been obtained fromOS_QueueGet
, andresides in
SB
's memory pool as a portion of a byte array. When thecaller gets this data, it's represented as a pointer to a
CFE_SB_Buffer_t - but really is an arbitrary location in a buffer
array in
SB
. There is a subtle problem with this even before thecaller casts it to another type: it already violates strict aliasing.
Per C99 (WG14/N1256, ISO/IEC 9899:TC3, September 7, 2007), section 6.5.7:
As mentioned previously, a
uint8
array (unioned withvoid*
andlargest system types) is not a
CFE_SB_Buffer_t
or aCFE_MSG_Message_t
, and is notchar
. The question becomes, is itacceptable when the primary header is defined as a byte array below
its type as a structure? I don't believe this is allowable because it
would subvert the type system too much.
It would be completely fine to only use these buffers to transmit and
receive data into local copies - but there are also "zero-copy"
versions of these functions whose use would be rendered
useless by that paradigm, and incur a performance penalty for
additional copies. It's frequently noted in discussions related to
strict aliasing that when the compiler sees a memcpy where type
punning is desired, it will eliminate that: I doubt that optimization
is easy to perform with the level of indirections present in CFS and
OSAL.
A problem indisputably arises where the
CFE_SB_Buffer
(with data)stored in the memory pool is accessed via a pointer cast to another
type - e.g. some command structure with additional payload data.
Final observations and conclusions.
SB is not the sole user of ES Memory Pools, and the pattern of using
byte arrays for various types is a common pattern in flight software
in the CFS ecosystem. I think it is prudent to enable the
-fno-strict-aliasing
flag in all CFS modules, OSAL, PSP, and allapps, to mitigate undefined behavior arising from violating type based
aliasing.
Additional Reading
The Strict Aliasing Situation is Pretty Bad
What is the Strict Aliasing Rule and Why do we care?
gcc may_alias attribute
The text was updated successfully, but these errors were encountered: