Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix data alignment issues found by UBSan #554

Merged
merged 4 commits into from
Nov 8, 2019

Conversation

ilammy
Copy link
Collaborator

@ilammy ilammy commented Nov 8, 2019

PR #548 has introduced a blacklist for some of the issues found by undefined behavior sanitizer. We have silenced the warnings until we resolved them, and disabled UBSan runs for GCC completely. While that PR has been cooking, I have found a proper way to fix the warnings. That allows us to remove the blacklist and mark the remaining warnings in a way supported by GCC.

Make soter_container_hdr_t packed

Many places in our code simply cast arbitrary byte arrays to Soter header structure and expect that to work. Type-punning misaligned data is undefined behavior (and may cause crashes on some ARM systems).

Set soter_container_hdr_t to be 1-byte aligned (usually it's expected to have 4-byte alignment). This fixes the inconsistency and forces the compiler to generate safe code. On x86 this does not really change anything since it allows misaligned loads, but ARM code now behaves differently. See for yourself in Godbolt.

Revert "Blacklist some UBSan warnings"

We can also enable UBSan for GCC now.

Mark Ed25519 functions for UBSan

These functions are written by level 80 crypto C gurus so they are as free from undefined behavior as Chuck Norris. Their behavior is the behavior. By definition. Daniel J. Bernstein is never wrong, except the times when he is.

There are caveats (see commit message for details), but they are acceptable given the architectures we support.

Avoid misaligned writes

There were some stray misaligned writes besides from soter_container_hdr_t. Fix them.

Checklist

  • Change is covered by automated tests (we-e-ell, we don’t check ARM, but...)
  • Benchmark results are attached (not applicable)
  • The coding guidelines are followed
  • Public API has proper documentation (not needed, this is a private thing)
  • Example projects and code samples are updated if needed (no API changes)
  • Changelog is updated if needed (nothing interesting for the users)

Many places in our code simply cast arbitrary byte arrays to Soter
header structure and expect that to work. Type-punning misaligned data
is undefined behavior (and may cause crashes on some ARM systems).

Set soter_container_hdr_t to be 1-byte aligned (usually it's expected
to have 4-byte alignment). This fixes the inconsistency and forces the
compiler to generate safe code. On x86 this does not really change
anything since it allows misaligned loads, but ARM code now behaves
differently. See for yourself in Godbolt.

There is no portable way to set data structure alignment nicely. GCC
uses __attribute__((packed)) and Clang support that as well. However,
MSVC does not support that and uses a special pragma. Both GCC and
Clang support #pragma pack, so let's roll with that.
This reverts commit 0d97f5b (partially)

The blacklist has been introduced because there were *too many*
functions using soter_container_hdr_t which had incorrect alignment.
Since now we use correct alignment for Soter containers, we can
enable Undefined Behavior sanitizer back and remove the blacklist.
We can also enable UBSan for GCC now.

Note that Ed25519 implementation still causes UBSan warnings. We'll
deal with that separately.
These functions are written by level 80 crypto C gurus so they are as
free from undefined behavior as Chuck Norris. Their behavior is *the*
behavior. By definition. Daniel J. Bernstein is never wrong, except
the times when he is.

Mark all of them with a special attribute that ignores certain warnings
from Undefined Behavior Sanitizer. Almost all of them are actually not
undefined behavior (but often implementation-defined behavior). The
only exception is "shift" which warns about left shift of negative
signed values, which is genuine undefined behavior (by C standard).
Some places here like to shift "-1" values as signed integers. However,
on x86 and ARM this particular behavior is probably okay, so we'll let
that slide. (The side effect here is that carry flag may get undefined
value and it may affect future conditionals in the code.)
Here instead of converting uint8_t* into a uint64_t* which has different
alignment we should use memmove() in order to copy bytes safely. (Don't
worry, it will be inlined.)
@ilammy ilammy added the core Themis Core written in C, its packages label Nov 8, 2019
@vixentael
Copy link
Contributor

Their behavior is the behavior. By definition. Daniel J. Bernstein is never wrong, except the times when he is.

so true.

but ARM code now behaves differently

let's look precisely on mobile tests after merging this PR. i'm 99% sure that nothing will break, but just in case

Copy link
Collaborator

@Lagovas Lagovas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ilammy ilammy merged commit 93529d9 into cossacklabs:master Nov 8, 2019
@ilammy ilammy deleted the alignment branch November 8, 2019 19:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Themis Core written in C, its packages
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants