range-based constant-flow base64 #5108

Merged (21 commits) on Oct 27, 2021

@gilles-peskine-arm gilles-peskine-arm commented Oct 25, 2021

Fix #4814: in Mbed TLS 2.26.0 and 2.16.10, we made the base64 code constant-flow by replacing each table lookup with a constant-time table lookup (read every entry and OR the masked results together). This had a significant performance cost. This pull request uses a different approach: instead of a constant-flow table lookup, compute the result with a range-based OR. Since base64 has only 5 ranges, as opposed to 64 (encoding) or 128 (decoding) table entries, this is a significant performance improvement. The code is also slightly smaller than with the constant-time table lookup (but still larger than with the original non-constant-time table approach). See #4819 for benchmarks.
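
For illustration, the constant-time table lookup that this pull request replaces could look roughly like the sketch below. The names `ct_eq_mask` and `ct_table_lookup` are hypothetical and are not taken from the Mbed TLS source; the point is only that every lookup reads all table entries, so the cost grows with the table size.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: 0xFF if a == b, 0x00 otherwise, computed without
 * a data-dependent branch. A compiler could in principle still introduce
 * a branch, so real code needs to be checked on the generated assembly. */
static unsigned char ct_eq_mask(size_t a, size_t b)
{
    size_t diff = a ^ b;
    /* (diff | -diff) has its top bit set if and only if diff != 0 */
    size_t nonzero = (diff | (0 - diff)) >> (sizeof(size_t) * CHAR_BIT - 1);
    /* nonzero is 0 or 1; subtracting 1 gives all-ones or zero */
    return (unsigned char) (nonzero - 1);
}

/* Hypothetical constant-flow lookup: touch every entry, keep only the one
 * whose index matches secret_index. */
unsigned char ct_table_lookup(const unsigned char *table,
                              size_t table_size,
                              size_t secret_index)
{
    unsigned char result = 0;
    for (size_t i = 0; i < table_size; i++) {
        result |= table[i] & ct_eq_mask(i, secret_index);
    }
    return result;
}
```

With a 128-entry decoding table, each decoded character costs 128 loads in this scheme, which is the overhead the range-based approach avoids.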

This is a mostly straightforward rebase of the 2.2x version. A single reviewer may be enough; I'll leave that to the discretion of the first reviewer. Rebase notes:

  • "New sample program to benchmark certificate loading"
    • Include "mbedtls/build_info.h" instead of MBEDTLS_CONFIG_FILE.
    • No Visual Studio files in development.
    • programs/Makefile: the APPS list is no longer defined with extensions.
  • "Move declarations of testing-only base64 functions to their own header"
    • No Visual Studio files in development.

Backports: 2.2x, 2.16.

Base64 decoding uses equality comparison tests for characters that don't
leak information about the content of the data other than its length, such
as whitespace. Do this with '=' as well, since it only reveals information
about the length. This way the table lookup can focus on character validity
and decoding value.

Signed-off-by: Gilles Peskine <[email protected]>
Instead of doing a constant-flow table lookup, which requires 128 memory
loads for each lookup into a 128-entry table, do a range-based calculation.
This requires more CPU instructions per range, but there are only 5 ranges.

Experimentally, this is ~12x faster on my PC (based on
programs/x509/load_roots). The code is slightly smaller, too.

Signed-off-by: Gilles Peskine <[email protected]>
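
The range-based decoding described in the commit message above can be sketched as follows. The function names `mask_of_range` and `dec_value` appear elsewhere in this pull request, but the bodies here are an assumption written for illustration, not a copy of the merged code. The sketch assumes an ASCII execution character set, and the "value plus one" trick for signalling invalid input is a choice made for this example.

```c
/* Sketch: 0xFF if low <= c <= high, 0x00 otherwise, without branching
 * on c. Relies on unsigned wrap-around: a negative difference sets the
 * bits above bit 7. */
static unsigned char mask_of_range(unsigned char low,
                                   unsigned char high,
                                   unsigned char c)
{
    unsigned low_mask = ((unsigned) c - low) >> 8;    /* nonzero iff c < low  */
    unsigned high_mask = ((unsigned) high - c) >> 8;  /* nonzero iff c > high */
    return (unsigned char) (~(low_mask | high_mask) & 0xff);
}

/* Sketch: return the 6-bit value of a base64 digit, or -1 if c is not
 * a base64 digit. Five range tests replace the 128-entry table. */
signed char dec_value(unsigned char c)
{
    unsigned char val = 0;
    /* c is in at most one range, so at most one line changes val.
     * Store value + 1 so that 0 keeps meaning "no range matched". */
    val |= mask_of_range('A', 'Z', c) & (c - 'A' +  0 + 1);
    val |= mask_of_range('a', 'z', c) & (c - 'a' + 26 + 1);
    val |= mask_of_range('0', '9', c) & (c - '0' + 52 + 1);
    val |= mask_of_range('+', '+', c) & (c - '+' + 62 + 1);
    val |= mask_of_range('/', '/', c) & (c - '/' + 63 + 1);
    return (signed char) (val - 1);
}
```

Every call executes the same five mask computations whatever the input, so the instruction sequence and memory accesses do not depend on the character being decoded.
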
Document what each local variable does when it isn't obvious from the name.
Don't reuse a variable for different purposes.

This commit has very little impact on the generated code (same code size on
a sample Thumb build), although it does fix a theoretical bug where 2^32
spaces inside a line would be ignored instead of being treated as an error.

Signed-off-by: Gilles Peskine <[email protected]>
Instead of doing a constant-flow table lookup, which requires 64 memory
loads for each lookup into a 64-entry table, do a range-based calculation.
This requires more CPU instructions per range, but there are only 5 ranges.

I expect a significant performance gain (although smaller than for decoding
since the encoding table is half the size), but I haven't measured. Code
size is slightly smaller.

Signed-off-by: Gilles Peskine <[email protected]>
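
The encoding direction described in the commit message above can be sketched in the same way. As before, `mask_of_range` and `enc_char` are names that appear in this pull request, but the bodies below are an illustrative assumption (ASCII character set assumed), not the merged implementation.

```c
/* Sketch: 0xFF if low <= c <= high, 0x00 otherwise, without branching
 * on c. */
static unsigned char mask_of_range(unsigned char low,
                                   unsigned char high,
                                   unsigned char c)
{
    unsigned low_mask = ((unsigned) c - low) >> 8;    /* nonzero iff c < low  */
    unsigned high_mask = ((unsigned) high - c) >> 8;  /* nonzero iff c > high */
    return (unsigned char) (~(low_mask | high_mask) & 0xff);
}

/* Sketch: map a 6-bit value (0..63) to its base64 digit. Values outside
 * 0..63 match no range and yield 0. */
unsigned char enc_char(unsigned char value)
{
    unsigned char digit = 0;
    /* value is in at most one range, so at most one line changes digit. */
    digit |= mask_of_range(0, 25, value) & ('A' + value);
    digit |= mask_of_range(26, 51, value) & ('a' + value - 26);
    digit |= mask_of_range(52, 61, value) & ('0' + value - 52);
    digit |= mask_of_range(62, 62, value) & '+';
    digit |= mask_of_range(63, 63, value) & '/';
    return digit;
}
```
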
n was used for two different purposes. Give it a different name the second
time. This does not seem to change the generated code when compiling with
optimization for size or performance.

Signed-off-by: Gilles Peskine <[email protected]>
To test c <= high, instead of testing the sign of (high + 1) - c, negate the
sign of high - c (as we're doing for c - low). This is a little easier to
read and shaves 2 instructions off the Arm Thumb build with
arm-none-eabi-gcc 7.3.1.

Signed-off-by: Gilles Peskine <[email protected]>
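
The two ways of testing c <= high mentioned in the commit message above can be compared with a small sketch. Both function bodies are assumptions written for this illustration: the "before" variant is one plausible way to build the test on the difference between c and high + 1, and the earlier code itself is not shown in this page. `variants_agree` is a hypothetical helper that checks the two variants exhaustively.

```c
/* Hypothetical "before" variant: c <= high iff c - (high + 1) is negative,
 * so the upper test has the opposite polarity from the lower test. */
unsigned char mask_of_range_before(unsigned char low,
                                   unsigned char high,
                                   unsigned char c)
{
    unsigned ge_low = ~(((unsigned) c - low) >> 8);      /* low byte 0xFF iff low <= c  */
    unsigned le_high = ((unsigned) c - high - 1u) >> 8;  /* low byte 0xFF iff c <= high */
    return (unsigned char) (ge_low & le_high & 0xff);
}

/* Hypothetical "after" variant: both tests have the same shape, a
 * difference that goes negative when c is out of range, negated once. */
unsigned char mask_of_range_after(unsigned char low,
                                  unsigned char high,
                                  unsigned char c)
{
    unsigned low_mask = ((unsigned) c - low) >> 8;    /* nonzero iff c < low  */
    unsigned high_mask = ((unsigned) high - c) >> 8;  /* nonzero iff c > high */
    return (unsigned char) (~(low_mask | high_mask) & 0xff);
}

/* Returns 1 if both variants give the same mask for every (low, high, c). */
int variants_agree(void)
{
    for (unsigned low = 0; low < 256; low++) {
        for (unsigned high = 0; high < 256; high++) {
            for (unsigned c = 0; c < 256; c++) {
                unsigned char a = mask_of_range_before((unsigned char) low,
                                                       (unsigned char) high,
                                                       (unsigned char) c);
                unsigned char b = mask_of_range_after((unsigned char) low,
                                                      (unsigned char) high,
                                                      (unsigned char) c);
                if (a != b) {
                    return 0;
                }
            }
        }
    }
    return 1;
}
```

The symmetric form avoids computing high + 1, which is consistent with the small instruction-count saving reported above, although the exact saving depends on the compiler and target.
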
I had originally thought to support directories with
mbedtls_x509_crt_parse_path but it would have complicated the code more than
I cared for. Remove a remnant of the original project in the documentation.

Signed-off-by: Gilles Peskine <[email protected]>
Add unit tests for mask_of_range(), enc_char() and dec_value().

When constant-flow testing is enabled, verify that these functions are
constant-flow.

Signed-off-by: Gilles Peskine <[email protected]>
This is part of the definition of the encoding, not a choice of test
parameter, so keep it with the test code.

Signed-off-by: Gilles Peskine <[email protected]>
digits is also a local variable in host_test.function, so compilers
complain that it shadows the global variable in
test_suite_base64.function.

Signed-off-by: Gilles Peskine <[email protected]>
Signed-off-by: Gilles Peskine <[email protected]>
Signed-off-by: Gilles Peskine <[email protected]>
@gilles-peskine-arm added the bug, needs-review, needs-backports, component-crypto and needs-reviewer labels on Oct 25, 2021
@mpg mpg self-requested a review October 27, 2021 08:35
@mpg mpg left a comment

Approving as a faithful forward-port of the 2.x version.

@mpg

mpg commented Oct 27, 2021

IMO the forward-port was straightforward enough that this doesn't need a second reviewer.

@mpg added the single-reviewer label and removed the needs-review and needs-reviewer labels on Oct 27, 2021
@mpg added the approved label and removed the needs-backports label on Oct 27, 2021
@mpg mpg merged commit 475bfe6 into Mbed-TLS:development Oct 27, 2021
Labels
approved, bug, component-crypto, single-reviewer
Development

Successfully merging this pull request may close these issues.

Huge certificate parsing speed regression between mbedtls 2.16.9 and 2.16.10 (constant time base64).
2 participants