Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce the number of allocations in ECP operations #3512

Merged

Conversation

gilles-peskine-arm
Copy link
Contributor

@gilles-peskine-arm gilles-peskine-arm commented Jul 22, 2020

Results from #3511 show that the ECP code is where the library is at it worst when it comes to calls to calloc(). The goal of this PR is to make a few easy improvements.

I think there are more low-hanging fruit, but there's already a significant reduction in the number of allocations (~halved the allocations in test_suite_ecp). Some of the improvements result in a performance improvements, others have no measurable impact on performance. The code size is almost unchanged.

Also:

An earlier version of this PR included a framework for instrumenting and measuring memory allocations. This framework is still experimental and I've removed it from the history. It's available in https://github.com/gilles-peskine-arm/mbedtls/tree/ecp-alloc-202007-3 and in #3511 if you want, but if all you want is to count calloc calls (as opposed to figuring out what triggers them) ordinary profiling is enough.

An earlier version of this PR included some bug fixes which have now been merged separately through #3942.

@gilles-peskine-arm gilles-peskine-arm added enhancement component-crypto Crypto primitives and low-level interfaces needs-preceding-pr Requires another PR to be merged first labels Jul 22, 2020
@gilles-peskine-arm gilles-peskine-arm self-assigned this Jul 22, 2020
@gilles-peskine-arm gilles-peskine-arm force-pushed the ecp-alloc-202007 branch 2 times, most recently from a8a1cd8 to 527e6fb Compare September 29, 2020 22:59
@gilles-peskine-arm gilles-peskine-arm added needs-review Every commit must be reviewed by at least two team members, and removed needs-preceding-pr Requires another PR to be merged first labels Sep 29, 2020
@mpg mpg self-requested a review September 30, 2020 08:51
@mpg
Copy link
Contributor

mpg commented Sep 30, 2020

Note: there are a number of CI failures which seem real.

@gilles-peskine-arm gilles-peskine-arm force-pushed the ecp-alloc-202007 branch 2 times, most recently from c1528dc to 8873a50 Compare September 30, 2020 16:07
@gilles-peskine-arm
Copy link
Contributor Author

The massive build failures were due to a bad rebase: originally “Fix missing mbedtls_psa_crypto_free in test case” added missing cleanup to a test case, and the automatic rebase added that cleanup to the wrong test case. Fortunately the correct test case had been fixed independently in the meantime (in feb0396).

After that I fixed some missing compilation guards and CI is now looking good.

@daverodgman daverodgman added the needs-reviewer This PR needs someone to pick it up for review label Oct 5, 2020
@mpg mpg removed their request for review October 13, 2020 10:48
@gilles-peskine-arm gilles-peskine-arm added needs-work and removed needs-review Every commit must be reviewed by at least two team members, labels Dec 7, 2020
If you pass a curve name to the benchmark program, the ECDH and ECDSA
benchmarks will only run for that particular curve. By default, all
curves are benchmarked.

To simplify the implementation, if you pass multiple curves, only the
last one will be benchmarked.

Signed-off-by: Gilles Peskine <[email protected]>
Reduce the code size, stack consumption and heap consumption in
fix_negative by encoding the special-case subtraction manually.

* Code size: ecp_curves.o goes down from 7837B down to 7769 in a
  sample Cortex-M0 build with all curves enabled. The savings come
  from not having to set up C in INIT (which is used many times) and
  from not having to catch errors in fix_negative.
* Stack consumption: get rid of C on the stack.
* Heap: mbedtls_mpi_sub_abs with destination == second operand would
  make a heap allocation. The new code doesn't do any heap allocation.
* Performance: no measurable difference.

Signed-off-by: Gilles Peskine <[email protected]>
mbedtls_mpi_sub_abs systematically allocated a new mpi when the result
was aliased with the right operand (i.e. X = A - X). This aliasing
very commonly happens during ECP operations. Rewrite the function to
allocate only if the result might not fit otherwise.

This costs a few bytes of code size in bignum.o, and might make
mbedtls_mpi_sub_abs very very slightly slower when no reallocation is
done. However, there is a substantial performance gain in ECP
operations with Montgomery curves (10-20% on my PC).

test_suite_ecp drops from 1422794 to 1271506 calls to calloc().

This commit also fixes a bug whereby mbedtls_mpi_sub_abs would leak
memory when X == B (so TB was in use) and the result was negative.

Signed-off-by: Gilles Peskine <[email protected]>
Signed-off-by: Gilles Peskine <[email protected]>
Rewrite mbedtls_mpi_mul_int to call mpi_mul_hlp directly rather than
create a temporary mpi object. This has the benefit of not performing
an allocation when the multiplication is in place (mpi operand aliased
with the result) and the result mpi is large enough.

This saves about 40% of the calloc() calls in test_suite_ecp. There is
no measurable performance difference on my Linux PC.

The cost is a few bytes in bignum.o.

When there is no aliasing, or when there is aliasing but the mpi
object needs to be enlarged, the performance difference is negligible.

Signed-off-by: Gilles Peskine <[email protected]>
If c == 0, no need to add it to *d.

Signed-off-by: Gilles Peskine <[email protected]>
@mpg mpg added the needs-reviewer This PR needs someone to pick it up for review label Mar 10, 2021
@mpg
Copy link
Contributor

mpg commented Mar 10, 2021

I meant to review this because it's of interest to me, but I don't have bandwidth at the moment, so I'm removing myself and labeling "needs: reviewer" again in case somebody else wants to pick this up.

@gilles-peskine-arm
Copy link
Contributor Author

This isn't a bug anymore, right?

Right. Originally this pull request contained a few minor bug fixes that I found while I was working on the allocations, but I made a separate PR for the bug fixes so they could get in faster.

@gabor-mezei-arm gabor-mezei-arm self-requested a review April 1, 2021 08:17
@mstarzyk-mobica mstarzyk-mobica removed the needs-reviewer This PR needs someone to pick it up for review label Apr 1, 2021
@gabor-mezei-arm gabor-mezei-arm added approved Design and code approved - may be waiting for CI or backports and removed needs-review Every commit must be reviewed by at least two team members, labels Apr 1, 2021
@gilles-peskine-arm gilles-peskine-arm merged commit d520037 into Mbed-TLS:development Apr 1, 2021
gilles-peskine-arm added a commit to gilles-peskine-arm/mbedtls that referenced this pull request Apr 3, 2021
Write a simple unit test for mbedtls_ecp_muladd().

Add just one pair of test cases. Mbed-TLS#2 fails since PR Mbed-TLS#3512. Thanks to
Philippe Antoine (catenacyber) for the test case, found by ecfuzzer.

Signed-off-by: Gilles Peskine <[email protected]>
gilles-peskine-arm added a commit to gilles-peskine-arm/mbedtls that referenced this pull request Apr 3, 2021
Write a simple unit test for mbedtls_ecp_muladd().

Add just one pair of test cases. Mbed-TLS#2 fails since PR Mbed-TLS#3512. Thanks to
Philippe Antoine (catenacyber) for the test case, found by ecfuzzer.

Signed-off-by: Gilles Peskine <[email protected]>
gilles-peskine-arm added a commit to gilles-peskine-arm/mbedtls that referenced this pull request Apr 3, 2021
Write a simple unit test for mbedtls_ecp_muladd().

Add just one pair of test cases. Mbed-TLS#2 fails since PR Mbed-TLS#3512. Thanks to
Philippe Antoine (catenacyber) for the test case, found by ecfuzzer.

Signed-off-by: Gilles Peskine <[email protected]>
gilles-peskine-arm added a commit that referenced this pull request Apr 21, 2021
Write a simple unit test for mbedtls_ecp_muladd().

Add just one pair of test cases. #2 fails since PR #3512. Thanks to
Philippe Antoine (catenacyber) for the test case, found by ecfuzzer.

Signed-off-by: Gilles Peskine <[email protected]>
daverodgman pushed a commit that referenced this pull request Apr 23, 2021
Reduce the number of allocations in ECP operations
daverodgman pushed a commit that referenced this pull request Apr 23, 2021
Write a simple unit test for mbedtls_ecp_muladd().

Add just one pair of test cases. #2 fails since PR #3512. Thanks to
Philippe Antoine (catenacyber) for the test case, found by ecfuzzer.

Signed-off-by: Gilles Peskine <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Design and code approved - may be waiting for CI or backports component-crypto Crypto primitives and low-level interfaces enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants