Workaround random `test_suite_platform` fail in time test #7419

yuhaoth · 2023-04-10T09:07:53Z

Description

We are getting random test_suite_platform failures in CI tests.
The CI reports the following with this PR:

Time: delay seconds ............................................... FAILED
  elapsed_secs >= delay_secs
  at line 87, /var/lib/build/tests/suites/test_suite_platform.function

I think this is because time(time_t*) returns the time since the Epoch, which can be discontinuous.

Gatekeeper checklist

changelog Not required, this is testing
backport Not required, test not present in LTS
tests This is a tests change

Notes for the submitter

Please refer to the contributing guidelines, especially the
checklist for PR contributors.

Signed-off-by: Jerry Yu <[email protected]>

tests/suites/test_suite_platform.function

Signed-off-by: Jerry Yu <[email protected]>

xkqian

LGTM

gilles-peskine-arm · 2023-04-17T13:32:19Z

tests/suites/test_suite_platform.function

@@ -84,7 +84,16 @@ void time_delay_seconds(int delay_secs)
    sleep_ms(delay_secs * 1000);


Do we really need these tests?

Historically, we had somewhat similar tests in the timing module. They often failed on the CI and so we ended up removing them. I fear that we've reintroduced the problem, and this pull request is just one of many.

Got it.

Should I remove these tests here? Or another PR?

Delay tests were removed.

I think we should keep *get

Completely removing the tests may be too much. How about keeping the calls to the functions (so at least we know they don't e.g. crash), but not testing delays?

Possibly even check that t1 = ms_time(); sleep(small); t2 = ms_time(); ASSERT(t2 > t1). @yuhaoth Can you clarify
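
The check suggested above can be sketched outside the test framework as follows. This is a minimal sketch under assumptions, not code from this PR: `sketch_ms_time`, `sketch_sleep_ms` and `sketch_clock_smoke_check` are hypothetical names built on POSIX `clock_gettime(CLOCK_MONOTONIC)` and `nanosleep`, and the check only verifies that the clock did not go backwards (`>=` rather than `>`), not that the full delay elapsed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for a millisecond clock; not the Mbed TLS API. */
static int64_t sketch_ms_time(void)
{
    struct timespec ts;
    int ret = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(ret == 0);
    (void) ret;
    return (int64_t) ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Hypothetical sleep helper; restarts nanosleep if a signal interrupts it. */
static void sketch_sleep_ms(long ms)
{
    assert(ms >= 0);
    struct timespec req = { .tv_sec = ms / 1000,
                            .tv_nsec = (ms % 1000) * 1000000L };
    struct timespec rem;
    while (nanosleep(&req, &rem) != 0 && errno == EINTR) {
        req = rem;
    }
}

/* Smoke check: call the clock, sleep briefly, call the clock again.
 * Returns 1 if the clock did not go backwards, 0 otherwise. */
static int sketch_clock_smoke_check(long small_ms)
{
    int64_t t1 = sketch_ms_time();
    sketch_sleep_ms(small_ms);
    int64_t t2 = sketch_ms_time();
    return t2 >= t1;
}
```

Such a check still exercises the functions, so a crash would be caught, without depending on how long the sleep really took.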

Built-in mbedtls_time function returns the number of seconds since the
Epoch. That is affected by discontinuous jumps and causes test failures.
Work around it with a 1-second tolerance.
Epoch. That is affected by discontinuous jumps. And nanosleep uses
CLOCK_MONOTONIC (a monotonically increasing time source). That will cause
a negative elapsed time difference.
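
The one-second tolerance mentioned in the comment text above amounts to relaxing the comparison, roughly as below. This is a hedged sketch: `delay_elapsed_with_tolerance` is a hypothetical helper, not a function from the test suite, and it only illustrates the comparison itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: accept the measured delay if it falls short of
 * the requested delay by at most tolerance_secs. */
static bool delay_elapsed_with_tolerance(time_t before, time_t after,
                                         int delay_secs, int tolerance_secs)
{
    assert(tolerance_secs >= 0);
    double elapsed_secs = difftime(after, before);
    return elapsed_secs + tolerance_secs >= delay_secs;
}
```

With a tolerance of 0 this is the original strict comparison. A tolerance of 1 accepts a measurement that comes out one second short, but a larger backwards step of the wall clock would still fail.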

What can cause a negative elapsed time difference? E.g. Can this happen from automatic drift adjustment? I would have expected that t2 - t1 might be less than small, but not negative.

Can this happen from automatic drift adjustment?

It happens only in time_delay_seconds because the time sources are different. The built-in mbedtls_time is defined as the standard time function, whose time source is CLOCK_REALTIME, while nanosleep takes CLOCK_MONOTONIC as its time source.

If CLOCK_MONOTONIC is faster than CLOCK_REALTIME and CLOCK_REALTIME is adjusted during the sleep, t2 - t1 < small sometimes happens.

And I think time_delay_milliseconds should not be removed now :) . It uses the same time source.
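
A worked example with made-up numbers may help show how a whole-second measurement can come out short. Suppose the sleep lasts exactly 1 s on the sleep's clock, but the wall clock advances only 0.9995 s in that interval (an illustrative figure, not a measurement from the CI), and the first sample is taken 0.1 ms after a second boundary. The helper `whole_seconds_elapsed` is hypothetical and just truncates both samples to seconds, as a `time()`-style clock would.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC INT64_C(1000000000)

/* Difference between two wall-clock samples after each one has been
 * truncated to whole seconds. start_ns is the wall-clock reading before
 * the sleep; advance_ns is how far the wall clock moved during it. */
static int64_t whole_seconds_elapsed(int64_t start_ns, int64_t advance_ns)
{
    /* Integer division equals floor only for non-negative values. */
    assert(start_ns >= 0 && advance_ns >= 0);
    return (start_ns + advance_ns) / NS_PER_SEC - start_ns / NS_PER_SEC;
}
```

With `start_ns = 10000100000` (10.0001 s) and `advance_ns = 999500000`, the second sample is 10.9996 s, so both samples truncate to 10 and the result is 0, which fails `elapsed_secs >= delay_secs` for a 1 s delay. With `advance_ns = 1000000000` the result is 1.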

Possibly even check that t1 = ms_time(); sleep(small); t2 = ms_time(); ASSERT(t2 > t1).

This check cannot resolve the issue.

I just updated the comments. But I did not mention automatic drift adjustment. I think that's enough.

And I reverted the last commit, leaving out the Time: delay seconds test.

See Mbed-TLS#1517. They often failed on the CI. Signed-off-by: Jerry Yu <[email protected]>

gilles-peskine-arm · 2023-04-18T07:59:10Z

tests/suites/test_suite_platform.function

-    usleep(milliseconds * 1000);
-#endif
-}
-#endif


CI says this endif needs to stay

It should be fixed

Signed-off-by: Jerry Yu <[email protected]>

The test has some issues we cannot avoid. Put it in code to avoid it being re-introduced. Signed-off-by: Jerry Yu <[email protected]>

tom-cosgrove-arm · 2023-04-18T10:44:19Z

tests/suites/test_suite_platform.function

+     * CLOCK_REALTIME and returns the number of seconds since the Epoch. And
+     * `nanosleep` uses CLOCK_MONOTONIC. The time sources are out of sync.
+     *
+     * If CLOCK_MONOTONIC is faster than CLOCK_REALTIME and `nanosleep` exits at


Surely it's not really about whether one clock is "faster" than the other (I would expect them to tick at the same rate all other things being equal) but if one is adjusted and the other not - which is what happens when one is a "wallclock" timer and the other is a monotonic timer

Yes, that's easier to understand. I will change that.

But I think this problem can be abstracted as a problem of two clocks running at different rates. The wall clock comes from a remote source and the other one comes from a local source. Due to implementation issues, the wall clock shows discontinuous jumps. If the wall clock is updated very frequently, the discontinuous jumps disappear, and the user gets two clocks running at different rates.

I would expect them to tick at the same rate all other things being equal

That should not be expected. The CPU monotonic clock source comes from a crystal oscillator with a PLL, which is not high precision. For many reasons, it might run faster or slower than standard time. That's why we need an NTP service to adjust the time.

The wall clock comes from a remote source

I don't understand. "Wall clock" in this context means that this particular timer from the kernel should match as closely as possible to the time that someone looking at their watch would see. So time may be stepped forwards or backwards as daylight savings happens (for example). A monotonic clock must by definition only ever increase.

The ticks that advance these clocks come from local hardware, which may not be precise. So there is frequency adjustment, which should affect all clocks, so that when each clock says "one second has passed", as close to one second as possible has actually passed.

"Wall clock" in this context means that this particular timer from the kernel should match as closely as possible to the time that someone looking at their watch would see.

I mean it appears to come from a remote server if it is updated fast enough.

So time may be stepped forwards or backwards as daylight savings happens (for example). A monotonic clock must by definition only ever increase.

I do not think daylight saving will affect the value of time(), because it is the number of seconds since the Epoch.

which should affect all clocks

No. CLOCK_BOOTTIME and CLOCK_*_CPUTIME_ID will not be affected by time adjustment. CLOCK_MONOTONIC is affected by the incremental adjustments performed, which is different from CLOCK_REALTIME.
If a decreasing adjustment is performed, CLOCK_REALTIME will change and CLOCK_MONOTONIC will not change.
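
One way to observe the difference being discussed is to sample both clocks around the same sleep and compare the two deltas. The sketch below assumes a POSIX system; `read_clock_ns`, `measure_sleep` and `struct clock_deltas` are hypothetical names, and whether the two deltas differ on a given run depends on whether the system adjusts its clock during the sleep.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* How far each clock advanced across the same sleep, in nanoseconds. */
struct clock_deltas {
    int64_t realtime_ns;   /* may be small or negative if the clock is set */
    int64_t monotonic_ns;  /* never negative by definition of the clock */
};

/* Read one POSIX clock as a single nanosecond count. */
static int64_t read_clock_ns(clockid_t id)
{
    struct timespec ts;
    int ret = clock_gettime(id, &ts);
    assert(ret == 0);
    (void) ret;
    return (int64_t) ts.tv_sec * 1000000000 + ts.tv_nsec;
}

/* Sleep for sleep_ms and report the advance of both clocks. */
static struct clock_deltas measure_sleep(long sleep_ms)
{
    assert(sleep_ms >= 0);
    struct timespec req = { .tv_sec = sleep_ms / 1000,
                            .tv_nsec = (sleep_ms % 1000) * 1000000L };
    struct timespec rem;
    int64_t rt0 = read_clock_ns(CLOCK_REALTIME);
    int64_t mono0 = read_clock_ns(CLOCK_MONOTONIC);

    /* Restart the sleep if a signal interrupts it. */
    while (nanosleep(&req, &rem) != 0 && errno == EINTR) {
        req = rem;
    }

    struct clock_deltas d;
    d.realtime_ns = read_clock_ns(CLOCK_REALTIME) - rt0;
    d.monotonic_ns = read_clock_ns(CLOCK_MONOTONIC) - mono0;
    return d;
}
```

A test that compares a `CLOCK_REALTIME`-based measurement against a delay implemented on another clock is implicitly assuming that `realtime_ns` and `monotonic_ns` agree, which is the assumption questioned in this thread.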

Signed-off-by: Jerry Yu <[email protected]>

gilles-peskine-arm

Looks good to me. I'm not sure I like time_delay_milliseconds as it is, but I don't want to increase the scope of this pull request so I am not requesting any changes there. The priority is to avoid random failures.

gilles-peskine-arm · 2023-04-19T08:54:32Z

tests/suites/test_suite_platform.function

@@ -76,6 +74,13 @@ void time_delay_milliseconds(int delay_ms)
 /* END_CASE */

 /* BEGIN_CASE depends_on:MBEDTLS_HAVE_TIME */
+
+/*
+ * WARNING: DONOT ENABLE THIS TEST. RESERVE IT HERE TO KEEP THE REASON.


Minor: English:

Suggested change

* WARNING: DONOT ENABLE THIS TEST. RESERVE IT HERE TO KEEP THE REASON.

* WARNING: DO NOT ENABLE THIS TEST. We keep the code here to document the reason.

Yes, if this change could be made, I will approve.

Signed-off-by: Jerry Yu <[email protected]>

gilles-peskine-arm

I'm approving this because I'd like the random failures to stop. I'm not fully happy with keeping dead code, but we can remove it later.

xkqian

LGTM

tom-cosgrove-arm · 2023-04-28T10:53:50Z

@xkqian If you're the second person approving this PR, could you set the approved label and remove needs-review?

xkqian · 2023-05-04T01:08:51Z

@xkqian If you're the second person approving this PR, could you set the approved label and remove needs-review?

Sorry, I thought your approval might also be needed and forgot to ping you @tom-cosgrove-arm. I will take care next time. Thanks.

yuhaoth marked this pull request as draft April 10, 2023 09:08

yuhaoth added the DO-NOT-MERGE label Apr 10, 2023

yuhaoth added 2 commits April 11, 2023 14:07

try to reproduce random assert fail

fce8577

Signed-off-by: Jerry Yu <[email protected]>

work around the assert failure with tolerance

c9c3e62

Signed-off-by: Jerry Yu <[email protected]>

yuhaoth force-pushed the test/random-time-test-fail branch from dfb63fa to c9c3e62 Compare April 11, 2023 06:18

yuhaoth changed the title ~~WIP: test random test fails~~ Workaround random test_suite_platform fail Apr 11, 2023

yuhaoth added the bug, needs-review, needs-ci, needs-reviewer and priority-high labels and removed the DO-NOT-MERGE label Apr 11, 2023

yuhaoth marked this pull request as ready for review April 11, 2023 07:50

yuhaoth added the component-test label and removed the needs-ci label Apr 13, 2023

xkqian suggested changes Apr 17, 2023

fix comments issues

2f1e85f

Signed-off-by: Jerry Yu <[email protected]>

yuhaoth force-pushed the test/random-time-test-fail branch from 398386d to 2f1e85f Compare April 17, 2023 08:53

xkqian previously approved these changes Apr 17, 2023

gilles-peskine-arm reviewed Apr 17, 2023

yuhaoth mentioned this pull request Apr 18, 2023

AES: Add accelerator only mode #7384

Merged

3 tasks

yuhaoth dismissed xkqian’s stale review via 1d7ddfb April 18, 2023 05:50

yuhaoth requested review from xkqian and gilles-peskine-arm April 18, 2023 05:51

remove time delay tests

4852bb8

See Mbed-TLS#1517. They often failed on the CI. Signed-off-by: Jerry Yu <[email protected]>

yuhaoth force-pushed the test/random-time-test-fail branch from 1d7ddfb to 4852bb8 Compare April 18, 2023 07:02

gilles-peskine-arm reviewed Apr 18, 2023

yuhaoth added 2 commits April 18, 2023 17:01

Update comments and remove delay seconds test

d1190a5

Signed-off-by: Jerry Yu <[email protected]>

Add warning to reserve the reason

ed9b9a7

The test has some issues we cannot avoid. Put it in code to avoid it being re-introduced. Signed-off-by: Jerry Yu <[email protected]>

tom-cosgrove-arm reviewed Apr 18, 2023

Improve comments about the time_delay test.

d3c7d53

Signed-off-by: Jerry Yu <[email protected]>

gilles-peskine-arm previously approved these changes Apr 19, 2023

tom-cosgrove-arm removed the needs-reviewer label Apr 19, 2023

fix grammar issues

ad2091d

Signed-off-by: Jerry Yu <[email protected]>

yuhaoth dismissed gilles-peskine-arm’s stale review via ad2091d April 20, 2023 02:01

yuhaoth requested review from gilles-peskine-arm and tom-cosgrove-arm April 20, 2023 02:02

gilles-peskine-arm mentioned this pull request Apr 24, 2023

Init PSA in ssl and x509 programs #7443

Merged

3 tasks

gilles-peskine-arm changed the title ~~Workaround random test_suite_platform fail~~ Workaround random test_suite_platform fail in time test Apr 28, 2023

gilles-peskine-arm mentioned this pull request Apr 28, 2023

Fix test gap in PK write: private (opaque) -> public #7496

Merged

3 tasks

gilles-peskine-arm approved these changes Apr 28, 2023

xkqian approved these changes Apr 28, 2023

gilles-peskine-arm removed the needs-review label Apr 28, 2023

gilles-peskine-arm merged commit 14d6b11 into Mbed-TLS:development Apr 28, 2023

yuhaoth deleted the test/random-time-test-fail branch December 6, 2023 05:48
