Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor calculateClockOffset to fix overflow issues #17

Merged
merged 18 commits into from
May 19, 2021

Conversation

aggarw13
Copy link
Contributor

@aggarw13 aggarw13 commented May 13, 2021

CBMC proof run for Sntp_DeserializeResponse PR #15 flags overflow issue in the calculateClockOffset function when assignining negative value to an unsigned integer. This PR refactors the implementation to avoid overflow flagging by:

  • Storing only positive subtraction values in unsigned integers
  • Assigning a large unsigned integer (i.e. >= 0x80000000) to signed integer by first using its negative value and then inversing the negation after the assignment to obtain the correct value.

This PR also adds resiliency to the clock-offset calculation by adding check against overflow (i.e. when server and client times are more than 34 years apart) on the Client -> Server SNTP request send network path.

@aggarw13 aggarw13 force-pushed the refactor/calculate-clock-offset branch from 3770a6d to f4723d9 Compare May 13, 2021 01:30
@aggarw13 aggarw13 force-pushed the refactor/calculate-clock-offset branch from d1b7183 to 773d71f Compare May 13, 2021 01:33
static bool isEligibileForClockOffsetCalculation( uint32_t firstOrderDiff )
{
return( ( ( firstOrderDiff & CLOCK_OFFSET_FIRST_ORDER_DIFF_OVERFLOW_BITS_MASK ) == 0U ) ||
( ( ( 0U - firstOrderDiff ) & CLOCK_OFFSET_FIRST_ORDER_DIFF_OVERFLOW_BITS_MASK ) == 0U ) );
Copy link

@SaswatPadhi SaswatPadhi May 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think CBMC would flag the 0U - firstOrderDiff as a subtraction that can potentially cause an unsigned integer overflow case. It wouldn't complain if we apply a unary negation though -firstOrderDiff.

I think the intuition here is that a subtraction sometimes introduces a wrap around so that could be unintentional, but a unary negation always introduces a wrap around (except on 0) so CBMC treats it as intentional. Of course, 0U - x is the same as -x but CBMC currently doesn't simplify this binary subtraction to unary negation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As unary minus operator with unsigned integer isn't allowed by MISRA Rule 10.1, we decided to use literal 2's complement operation (i.e. ~value + 1U) to perform negation

source/core_sntp_serializer.c Outdated Show resolved Hide resolved
/* Perform ( T2 - T1 ) offset calculation of SNTP Request packet path. */
firstOrderDiffSend = pServerRxTime->seconds - pClientTxTime->seconds;
{
firstOrderDiffSend = 0U - firstOrderDiffSend;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment regarding binary subtraction vs unary negation applies here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As unary minus operator with unsigned integer isn't allowed by MISRA Rule 10.1, we decided to use literal 2's complement operation (i.e. ~value + 1U) to perform negation.

== UINT32_MOST_SIGNIFICANT_BIT_BIT_MASK )

{
firstOrderDiffRecv = 0U - firstOrderDiffRecv;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment regarding binary subtraction vs unary negation applies here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As unary minus operator with unsigned integer isn't allowed by MISRA Rule 10.1, we decided to use literal 2's complement operation (i.e. ~value + 1U) to perform negation.

source/core_sntp_serializer.c Outdated Show resolved Hide resolved
source/core_sntp_serializer.c Outdated Show resolved Hide resolved
Copy link

@SaswatPadhi SaswatPadhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you for incorporating all the feedback!

The 2's complement part is somewhat unsatisfactory. I would bring it up in our CBMC discussions. CBMC shouldn't be directly contradicting a MISRA rule.

EDIT: We had a discussion regarding this. The consensus was that 2's complement is indeed the right thing to do here. An "arithmetic negation" on a value that has no sign (so no notion of positive or negative) isn't really intuitive -- in the end it's just an unsigned bit pattern.
Instead, it's better to be explicit that we are changing the bit pattern to represent a value that can be interpreted as the negated value in 2's-complement representation for signed integers.

@dan4thewin
Copy link

I'd like to keep this PR open for now so we can solicit further feedback.

@aggarw13 aggarw13 force-pushed the refactor/calculate-clock-offset branch from eceb9cc to d0d8f2b Compare May 13, 2021 22:43
@aggarw13 aggarw13 force-pushed the refactor/calculate-clock-offset branch from 258757b to 52c0399 Compare May 13, 2021 23:29
@aggarw13 aggarw13 force-pushed the refactor/calculate-clock-offset branch from 2d8f016 to 0c777b4 Compare May 13, 2021 23:51
/* Check whether the difference value has the most significant bit set.
* An unsigned integer with the most significant bit set cannot be safely assigned
* to a signed integer.*/
if( ( positiveDiff & UINT32_MOST_SIGNIFICANT_BIT_BIT_MASK ) == UINT32_MOST_SIGNIFICANT_BIT_BIT_MASK )

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Nitpick]

Would it be cleaner to check positiveDiff > INT32_MAX instead? Basically, we would be explicit about the overflow; we can mention that in the comment too -- that the reason we have this if branch is to handle overflows correctly.

If we choose to make this change, we can also remove the definition for UINT32_MOST_SIGNIFICANT_BIT_BIT_MASK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, done that!

*
* @return The calculated signed subtraction value between the unsigned integers.
*/
static int32_t safeSubtraction( uint32_t minuend,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Nitpick]

The function is doing a subtraction and returning a signed value, so do you think we could call it safeSignedSubtraction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

Copy link

@SaswatPadhi SaswatPadhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks again for making this function bullet proof!

source/core_sntp_serializer.c Outdated Show resolved Hide resolved
source/core_sntp_serializer.c Outdated Show resolved Hide resolved
@aggarw13 aggarw13 force-pushed the refactor/calculate-clock-offset branch from 104ae37 to ceaf23b Compare May 14, 2021 00:05
@courage
Copy link
Member

courage commented May 14, 2021

The current subtraction routine is a big improvement over the previous iteration, but I think you could probably take it a bit farther. E.g. something like

bool trySubtract(uint32_t a, uint32_t b, int32_t* result) {
    int64_t big_a = a;
    int64_t big_b = b;
    int64_t big_result = big_a - big_b;
    if (big_result <= INT32_MAX && big_result >= INT32_MIN) {
        *result = big_result;
        return true;
    } else {
        return false;
    }
}

This gets rid of all the polarity and 2's complement stuff and is IMO easier to understand. IDK if this code makes MISRA happy, but I verified with clang's undefined behavior checker that it doesn't cause any overflows: https://paste.amazon.com/show/mcourage/1620948125.

Since the subtraction routine checks for overflow, you could also get rid of the bit twiddling in isEligibleForClockOffsetCalculation.

@SaswatPadhi
Copy link

SaswatPadhi commented May 14, 2021

@courage Thank you for taking a look.

I agree that extending the integer width to safely accommodate overflows would also solve this issue.
My only (very minor) concern would be possible performance penalty: int64_t operations might be slower than int32_t on 32-bit platforms?

But as you said, this would be much cleaner and we could simplify the isEligibleForClockOffsetCalculation and safeSignedSubtraction functions (perhaps even get rid of them).

@aggarw13
Copy link
Contributor Author

aggarw13 commented May 14, 2021

Yes, my concern also rests for use of int64_t as well for platforms that aren't 64 bit address based, which is why the clock-offset overflow requirement is representation of the value within 32 bits (i.e. 30 bits of absolute value and 2 sign bits).
The NTPv4 specification also clearly states the limitation of the offset value arithmetic:

"However, the offset and delay are computed as sums and differences of these values, which contain 62 significant bits and two sign bits, so they can represent unambiguous values from 34 years in the past to 34 years in the future. In other words, the time of the client must be set within 34 years of the server before the service is started. This is a fundamental limitation with 64-bit integer arithmetic."
(Note that 64 bits in the above paragraph include the "seconds" and the "fractions" part of the NTP timestamp, which are 32 bits each. Within the library, we are calculating the clock-offset only with the "seconds" part of the NTP timestamp)

The isEligibleForClockOffsetCalculation checks that the first order difference exists within 34 years (i.e. representable with 30 bits) in both polarities of subtraction so that the case when NTP time overflow (on 7 Feb 2036), and the server and client are in two different NTP eras (i.e. one before the overflow with a large NTP seconds value greater than INT32_MAX and the other with a small NTP seconds value) is supported. In such a case, a simple single polarity subtraction does not provide the ability to determine that the server and client are actually within 34 years of each other.

@courage
Copy link
Member

courage commented May 14, 2021

The purpose of the code in isEligibleForClockOffsetCalculation is unclear to me even with several read-throughs, so I couldn't say with confidence that it's correct. The comments say that it's checking to see if the delta is less than 2^30 seconds (34 years), but the actual code seems to be doing a bunch of number encoding magic, and I don't understand why it's doing so. It kind of looks like it's trying to do signed arithmetic on a uint32_t, but the argument coming from calculateClockOffset is actually unsigned, so the 2's complement code doesn't make any sense to me.

Though I haven't read through the NTP spec on this subject, I'd be kind of surprised if this "fail when the difference is greater than 34 years" is intended. Is that really what's supposed to happen? I would expect that if my two timestamps are 0 and 0xFFFFFFFF, I should assume that there was a rollover, and that the difference is 1 second, rather than 68 years. calculateClockOffset looks like it would return a failure for that case.

Regarding efficiency of trySubtract, here's the code generated by gcc for 32-bit ARM (-Os):

trySubtract:
        subs    r1, r0, r1
        sbc     r3, r3, r3
        adds    r0, r1, #-2147483648
        adc     r3, r3, #0
        cmp     r3, #0
        moveq   r0, #1
        streq   r1, [r2]
        movne   r0, #0
        bx      lr

You'd really have to be doing a tremendous amount of subtraction for me to worry about the efficiency of this code, and even if you were, I'd tell you to favor simpler, clearer code until you had benchmark data showing a material difference.

@aggarw13
Copy link
Contributor Author

aggarw13 commented May 14, 2021

I would expect that if my two timestamps are 0 and 0xFFFFFFFF, I should assume that there was a rollover, and that the difference is 1 second, rather than 68 years. calculateClockOffset looks like it would return a failure for that case.

Yes, the calculateClockOffset function does support rollover, and the logic within the isEligibleForClockOffsetCalculation performs 2's complement inversion to treat the difference of 0xFFFFFFFF as 1.
Here are the test cases that validate the support for overflow by the calculateClockOffset function: https://github.com/FreeRTOS/coreSNTP/pull/17/files#diff-c6d8814d59f0ac9dcfcb87a69e9802c3e31a09a1b3d4ee298b43dace48dd9777R448-R483

The restriction of 34 years is applied when, for example, the server is in 2036 but the client time is more than 34 years behind sometime in 2002. In this example, the server time would have overflow (i.e. be represented in NTP era 1) but the client time would be represented as a large value in NTP era 0. The calculateClockOffset function would treat this as a violation of the 34 years requirement. However, if the client time is in 2004 which is satisfies the 34 year requirement, the calculateClockOffset would treat support the NTP time overflow case by treating the first order difference as 32 years only.

@aggarw13 aggarw13 force-pushed the refactor/calculate-clock-offset branch from a7fb95c to 0dfa01d Compare May 15, 2021 00:35
@aggarw13 aggarw13 force-pushed the refactor/calculate-clock-offset branch from b84f41e to c06675b Compare May 17, 2021 18:52
@aggarw13 aggarw13 force-pushed the refactor/calculate-clock-offset branch from 54a76dd to 96e663d Compare May 17, 2021 23:34
Copy link

@SaswatPadhi SaswatPadhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

No more bit magic and CBMC is also happy.
Thanks for all the changes, @aggarw13!

@aggarw13 aggarw13 force-pushed the refactor/calculate-clock-offset branch from d3632fe to bcb4dfc Compare May 17, 2021 23:46
Copy link

@dan4thewin dan4thewin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The discussion of this PR yielded a compelling suggestion that may result in simpler code. I'm going to approve this now to unblock concurrent efforts, but I sincerely hope that we explore this option forthwith.

@aggarw13 aggarw13 merged commit c188823 into FreeRTOS:main May 19, 2021
@aggarw13 aggarw13 deleted the refactor/calculate-clock-offset branch May 19, 2021 00:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants