-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mbedtls_mpi_self_test fails on Cortex M33 microcontroller #5266
Comments
Commenting out #define MBEDTLS_HAVE_ASM makes the exp_mod test pass, but the "ECP SW test #2" still fails. Edit due to new insights: it only passes sometimes with this change. Not always. |
Just to confirm, are you using a recent version of clang? |
That is surprising: I'm not aware of a similar bug that doesn't involve one of the assembly snippets. Can you run the unit tests on your platform? |
We're using arm-none-eabi-gcc (GNU Arm Embedded Toolchain 9-2020-q2-update) 9.3.1 20200408 (release) I'm not sure what a target triple is, but we're using a custom target based on STM32L562ZE chip. Compile options for bignum.c (which includes bn_mul.h) mbedtls config: |
Can you elaborate? The unit tests are part of the self test, right? I performed those several times, you can see the results at the top of this issue. The error I'm getting with the other test seems to be -0x4c80 (invalid key) I'm a bit confused, because the failing unit tests seem to not be consistent. I'm getting failed test on exp_mod sometimes even with MBEDTLS_HAVE_ASM undefined, depending on if I add instructions elsewhere in the code.... Right now it seems that undefining MBEDTLS_HAVE_ASM and defining MBEDTLS_NO_UDBL_DIVISION together solve both of the failing tests, but I can't say I understand at all. What also works, is setting entire bignum.c module to compile with -O0 with a pragma at top of file. |
No:
That part could still be the assembly bug: they cause data corruption, and so far we've only observed corruption that led to massive operation failures and crashes, but it could be something more subtle that only corrupts useful data in some specific heap patterns.
But that can't be the assembly bug. #4116 is a bug in the assembly code, which didn't correctly declare to the compiler what memory the assembly accessed. With Random shot in the dark: does increasing the stack size or the heap size help? |
@gilles-peskine-arm I will try and get unit tests running. As for the stack/heap stuff, I already verified that this isn't the problem, by doing the following things:
There is barely any custom code running at the moment, mostly just the mbedtls self test. |
Hmm that's interesting - thanks. This may be a different problem then, as #4116 and friends seem to be caused by the improved optimisations of newer versions of clang. The compiler you're using is not the latest gcc, so we'd likely have noticed by now if it had the same issue.
The 'target triple' is a clang term for a 3-part platform description (e.g. "arm-linux-gnu") so not relevant in your case. |
Sadly I don't really see how I'd be able to run the unit tests on my target. The target is very limited. I see a bunch of .data and .function files, and I would like to run specific ecp and mpi tests, but I'm unsure how. |
It's also a bit inexplicable at this point. I have reduced the code to a point where the only things that are being done are:
There are several things that somehow cause all unit tests to consistently pass, but I don't have an explanation:
|
That sounds like a hardware problem, or a bug in some system initialization or driver code that sets the hardware into an unstable state. Can you reproduce the problem on another chip? Can you reproduce something similar with another program that's both RAM- and CPU-intensive? |
Although I can't pinpoint the exact underlying cause, it seems like this is the case. When generating the clock config code using LL drivers instead of HAL, I see a bunch more while loops that wait until certain configuration is actually applied. Using this new clock configuration code, the issue is not reproducible anymore on any setup, with any optimization. I still can't really understand how this would manifest in EC/MPI operations failing in mbedTLS, though. |
Hi @lohralfen, I wonder if you've made any progress on your side? Have you been able to confirm whether the problem is with the platform setup or with Mbed TLS? Can you reproduce the failure with |
Hello @gilles-peskine-arm, I concluded there was not much more to test since we've ran a lot of tests with plenty of reboots and the issue does not occur anymore with updated clock configuration code. It would take a lot of time to go through the driver code of ST and pinpoint the exact issue. I will try to make some time on Monday to run the original code with mbedtls-3.1.0. |
Thanks for reporting back! Given that one problem (the inline assembly code) is known to be fixed, and that the other problem (buggy execution even with assembly disabled) is confirmed to be fixed by changing clock settings, there does not seem to be any problem with Mbed TLS, so I'm closing this issue. |
Using STM32L562ZE chip, the mbedtls self test fails with the exact same issue as described in #4116 #4943
OS
None (bare metal)
mbedTLS build:
v3.0.0 tag, also tested with v2.27.0 tag with same result
Expected behavior:
All self tests pass
Actual behavior:
Depending on optimization (at least -Os, -O3) the 'exp_mod' self test fails
CALLOC(0): passed (distinct non-null)
CALLOC(1): passed
CALLOC(1 again): passed (same address)
SHA-224 test #1: passed
SHA-224 test #2: passed
SHA-224 test #3: passed
SHA-256 test #1: passed
SHA-256 test #2: passed
SHA-256 test #3: passed
SHA-384 test #1: passed
SHA-384 test #2: passed
SHA-384 test #3: passed
SHA-512 test #1: passed
SHA-512 test #2: passed
SHA-512 test #3: passed
AES-ECB-128 (dec): passed
AES-ECB-128 (enc): passed
AES-ECB-192 (dec): passed
AES-ECB-192 (enc): passed
AES-ECB-256 (dec): passed
AES-ECB-256 (enc): passed
AES-CTR-128 (dec): passed
AES-CTR-128 (enc): passed
AES-CTR-128 (dec): passed
AES-CTR-128 (enc): passed
AES-CTR-128 (dec): passed
AES-CTR-128 (enc): passed
Base64 encoding test: passed
Base64 decoding test: passed
MPI test #1 (mul_mpi): passed
MPI test #2 (div_mpi): passed
MPI test #3 (exp_mod): failed
Unexpected error, return code = 00000001
ECP SW test #1 (constant op_count, base point G): passed
ECP SW test #2 (constant op_count, other point): failed (0)
Unexpected error, return code = FFFFB380
Executed 7 test suites
[ 2 tests FAIL ]
Steps to reproduce:
Run the self tests of mbedTLS
I'm guessing it has to do with this piece of code?
https://github.com/ARMmbed/mbedtls/blob/development/library/bn_mul.h#L715
This file has been adjusted 1 at a time for each architecture, but it seems not all architectures have been fixed yet..
The text was updated successfully, but these errors were encountered: