Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib: os: assert: Add unreachable path hint for assertion failure #48710

Merged
merged 4 commits into from
Aug 10, 2022

Conversation

stephanosio
Copy link
Member

@stephanosio stephanosio commented Aug 4, 2022

This commit adds the CODE_UNREACHABLE hint at the end of the
assertion failure branch so that the compiler takes note of the assert
function not returning when an assertion fails.

This prevents the compiler from generating misguided warnings assuming
the asserted execution paths.

It also introduces the ASSERT_TEST Kconfig symbol, which indicates
that the "assert test mode" is enabled. This symbol may be selected by
the tests that require the assert post action function to return
without aborting so that the test can proceed.

Note that the CODE_UNREACHABLE hint is specified only when the assert
test mode is disabled in order to prevent the tests from crashing when
the assert post action function returns.

Signed-off-by: Stephanos Ioannidis [email protected]

p.s. an example of the "misguided warnings": zephyrproject-rtos/sdk-ng#530 (reply in thread)

@zephyrbot zephyrbot added the area: Base OS Base OS Library (lib/os) label Aug 4, 2022
@zephyrbot zephyrbot requested review from andyross and dcpleung August 4, 2022 16:55
@stephanosio stephanosio mentioned this pull request Aug 4, 2022
49 tasks
@stephanosio
Copy link
Member Author

It looks like I just opened a Pandora's box ... converting to a draft.

@stephanosio stephanosio marked this pull request as draft August 4, 2022 17:26
@stephanosio
Copy link
Member Author

stephanosio commented Aug 5, 2022

Besides the build failures due to the now-working warnings, the run-time failures (e.g. in tests/ztest/error_hook) are due to the "assert-in-ISR" test implementation relying on the __ASSERT to return and exit the ISR through the normal path; hence, encountering the CODE_UNREACHABLE.

@stephanosio stephanosio force-pushed the assert_unreach_hint branch from d765f76 to aea23e9 Compare August 5, 2022 10:59
@stephanosio
Copy link
Member Author

stephanosio commented Aug 5, 2022

The tests/arch/common/semihost/arch.common.semihost failure on qemu_riscv32e is due to the QEMU requiring all three semihosting call instructions to lie in the same page:

    if ((pre_addr & TARGET_PAGE_MASK) == (post_addr & TARGET_PAGE_MASK)) {
        pre    = opcode_at(&ctx->base, pre_addr);
        ebreak = opcode_at(&ctx->base, ebreak_addr);
        post   = opcode_at(&ctx->base, post_addr);
    }

It just happened to be that the changes in this PR caused the semihost_exec function to be placed near a page boundary (0x1000) causing the semihosting call instructions to lie in two separate pages.

@keith-packard any suggestions on how we should address this?

p.s. for now, I have created #48745 to work around this issue.

Copy link
Collaborator

@keith-packard keith-packard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the API is just broken and 'chan' should be an unsigned type? I realize that's a very different change, but it seems like it might be better in the long run.

I'll note that soc/arm/nordic_nrf/nrf32/sync_rtc.c uses 'uint8_t' to hold the channel number.

@stephanosio
Copy link
Member Author

Looks like the API is just broken and 'chan' should be an unsigned type? I realize that's a very different change, but it seems like it might be better in the long run.

I'll note that soc/arm/nordic_nrf/nrf32/sync_rtc.c uses 'uint8_t' to hold the channel number.

Ideally, it should be an unsigned type, but I did not want to make more changes than necessary for now. By the way, that commit was only added to this PR for testing purpose; it is originally from #48729, which already got merged.

@stephanosio
Copy link
Member Author

@keith-packard re #48710 (comment), should it be updated to require the trap instructions to be placed in the same page only if the MMU is enabled?

@keith-packard
Copy link
Collaborator

@keith-packard re #48710 (comment), should it be updated to require the trap instructions to be placed in the same page only if the MMU is enabled?

Sure, that would be in conformance with the spec. The picolibc version doesn't bother with the special case, but as zephyr knows whether there's an MMU in use, it could. Given that it's only for semihosting, which is unlikely to be used in production, I'm not sure it's all that important though?

@keith-packard
Copy link
Collaborator

Ideally, it should be an unsigned type, but I did not want to make more changes than necessary for now. By the way, that commit was only added to this PR for testing purpose; it is originally from #48729, which already got merged.

Oh well. Other than that, yes, these changes look good to me.

@stephanosio
Copy link
Member Author

Given that it's only for semihosting, which is unlikely to be used in production, I'm not sure it's all that important though?

Fair enough, it is simple enough to work around as done in #48745. While it would be nice if QEMU properly handles the non-MMU case, it is really not that important.

@stephanosio stephanosio force-pushed the assert_unreach_hint branch 2 times, most recently from 2fc3409 to aa6dbf2 Compare August 8, 2022 12:37
This commit adds the `CODE_UNREACHABLE` hint at the end of the
assertion failure branch so that the compiler takes note of the assert
function not returning when an assertion fails.

This prevents the compiler from generating misguided warnings assuming
the asserted execution paths.

It also introduces the `ASSERT_TEST` Kconfig symbol, which indicates
that the "assert test mode" is enabled. This symbol may be selected by
the tests that require the assert post action function to return
without aborting so that the test can proceed.

Note that the `CODE_UNREACHABLE` hint is specified only when the assert
test mode is disabled in order to prevent the tests from crashing when
the assert post action function returns.

Signed-off-by: Stephanos Ioannidis <[email protected]>
This commit enables the assert test mode (`CONFIG_ASSERT_TEST`) for the
ztest error hook test because it implements a custom post assert fail
hook (`ztest_post_assert_fail_hook`) that returns without aborting to
faciliate the testing of the assert functions.

Signed-off-by: Stephanos Ioannidis <[email protected]>
This commit enables the assert test mode (`CONFIG_ASSERT_TEST`) for the
ARM interrupt test because it relies on the assert function to return
without aborting in the in-ISR "Intentional assert" test.

Signed-off-by: Stephanos Ioannidis <[email protected]>
This commit enables the assert test mode (`CONFIG_ASSERT_TEST`) for the
ARM interrupt test because it relies on the assert function to return
without aborting in the "Assert occurring inside kernel panic" test.

Signed-off-by: Stephanos Ioannidis <[email protected]>
@stephanosio stephanosio force-pushed the assert_unreach_hint branch from aa6dbf2 to e8aa6ca Compare August 8, 2022 13:19
@stephanosio stephanosio marked this pull request as ready for review August 8, 2022 13:57
@stephanosio stephanosio requested a review from ioannisg as a code owner August 8, 2022 13:57
@stephanosio
Copy link
Member Author

Ready for review

@keith-packard
Copy link
Collaborator

Hrm. Having reviewed the code in a bit more detail, I find that I'm now more confused than I was before. CODE_UNREACHABLE is usually defined as __builtin_unreachable() (except on POSIX). That should mean the compiler is checking to make sure there's no way to get to that point. For that to pass, __ASSERT_POST_ACTION(assert_post_action()) should have to be marked marked as FUNC_NORETURN. How does this even build now?

@stephanosio
Copy link
Member Author

For that to pass, __ASSERT_POST_ACTION(assert_post_action()) should have to be marked marked as FUNC_NORETURN

Under normal circumstances, the assert_post_action never returns; however, there are some test cases (mostly checking if the assert functions work properly) that rely on this function to return in order to allow the test to proceed, which I worked around in this PR by introducing the CONFIG_ASSERT_TEST and making the __ASSERT_UNREACHABLE defined as nothing when it is selected.

@keith-packard
Copy link
Collaborator

Under normal circumstances, the assert_post_action never returns

Yes, that part I followed, but (as I said), this ended up making me more confused -- this normal case now includes an annotation that code past the call to assert_post_action is unreachable, and yet, in the normal case, assert_post_action isn't marked with FUNC_NORETURN?

Note -- I'm not saying that your patch isn't right and good (which it seems to be), only that I can't understand why the compiler doesn't emit errors while compiling it due to the mismatching attributes. Probably just something I've missed.

@aaronemassey aaronemassey removed their request for review August 8, 2022 22:48
@stephanosio
Copy link
Member Author

only that I can't understand why the compiler doesn't emit errors while compiling it due to the mismatching attributes.

__builtin_unreachable() (aka. CODE_UNREACHABLE in Zephyr) is a hint that tells the compiler that nothing will execute after it in case the compiler cannot figure that out on its own. It is expected of the GCC to not emit warnings or errors about its "reachability" because its purpose is to tell the compiler that it is unreachable in spite of it being seemingly reachable.

From https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html:

It is useful in situations where the compiler cannot deduce the unreachability of the code.

Another use for __builtin_unreachable is following a call a function that never returns but that is not declared __attribute__((noreturn))

The reason I did not add __attribute__((noreturn)) (FUNC_NORETURN in Zephyr) to the assert_post_action is that, as noted in the commit message, it may return when running some tests.

@keith-packard
Copy link
Collaborator

Thanks much for explaining what's going on. I "should" have known what __builtin_unreachable() did, having used it myself in the past...

The reason I did not add __attribute__((noreturn)) (FUNC_NORETURN in Zephyr) to the assert_post_action is that, as noted in the commit message, it may return when running some tests.

Is that only true when CONFIG_ASSERT_TEST is defined? Because you could use that to declare those functions as FUNC_NORETURN in that case. I think that would mean that you might not need to add __ASSERT_UNREACHABLE?

@stephanosio
Copy link
Member Author

Is that only true when CONFIG_ASSERT_TEST is defined?

Correct.

Because you could use that to declare those functions as FUNC_NORETURN in that case. I think that would mean that you might not need to add __ASSERT_UNREACHABLE?

Also correct. However, FUNC_NORETURN actually generates a warning when the function may potentially return, so using it would require us to either conditionally add CODE_UNREACHABLE to all assert_post_action implementations or modify the chain of calls that are not supposed to return (e.g. k_panic called from assert_post_action), which, for some, also need to be conditionally handled.

In short, adding FUNC_NORETURN to assert_post_action introduces many unnecessary complications, so I decided to go with CODE_UNREACHABLE in the ASSERT macros.

@keith-packard
Copy link
Collaborator

In short, adding FUNC_NORETURN to assert_post_action introduces many unnecessary complications, so I decided to go with CODE_UNREACHABLE in the ASSERT macros.

Yeah, and the win for fixing it all would be quite small -- the potential compiler optimizations are not interesting in this path, and even the compiler messages would be of marginal value when all of the names are 'abort/fatal/...'.

Copy link
Collaborator

@keith-packard keith-packard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking time to explain why this patch looks the way it does.

Copy link
Contributor

@andyross andyross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been hiding from the minutiae of the interactions being discussed, but the code as it appears in the final PR seems totally reasonable.

@carlescufi carlescufi merged commit 5ceda9f into zephyrproject-rtos:main Aug 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: ARM ARM (32-bit) Architecture area: Base OS Base OS Library (lib/os) area: Debugging area: Testsuite Testsuite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants