Microsecond round up code doesn't round up. #12
Comments
OK, I reviewed this code again, and I concur that the comment is not quite accurate. It is simply rounding to the nearest value, not necessarily "up", and the result can be zero if there is less than 0.5us per tick or >2M ticks per second. That being said, tick rates that fast are arguably very impractical, and it is pretty unusual to have a system with a tick rate faster than 1ms (1000us period or 1000 ticks per second) in practice. OSAL has always historically done its time calculations using this "micro seconds per tick" value, so if we actually do need to support systems that actually have a tick rate of less than 1us per tick, then the whole calculation needs to change to support the higher resolution. This I would think would be a requirements-driven change. Note that even though the comment (erroneously) says that the result cannot be zero, this is actually enforced later before OS_API_Init() returns, i.e.
Therefore I think the code is totally safe, the result is guaranteed to not be zero, just not through the calculation itself but rather a check/verification afterward. The only change here might be to rephrase the comment to better reflect what actually happens. |
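A minimal sketch of the two-step behavior described above, assuming the calculation is an integer round-to-nearest division followed by a separate non-zero check. The `sketch_` names and status codes are hypothetical and are not the actual OSAL identifiers:

```c
#include <stdint.h>

/* Hypothetical status codes for this sketch; not the real OSAL definitions. */
#define SKETCH_SUCCESS 0
#define SKETCH_ERROR   (-1)

/*
 * Microseconds per tick, rounded to the NEAREST integer.
 * Adding half the divisor before dividing rounds up or down,
 * so the result can still be zero for very fast tick rates.
 * The caller must pass a non-zero tick rate.
 */
static uint32_t sketch_usec_per_tick(uint32_t ticks_per_second)
{
    return (1000000u + (ticks_per_second / 2u)) / ticks_per_second;
}

/*
 * Separate verification step, standing in for the check that the
 * comment above says happens before OS_API_Init() returns.
 */
static int sketch_init_timebase(uint32_t ticks_per_second, uint32_t *usec_per_tick)
{
    if (ticks_per_second == 0u)
    {
        return SKETCH_ERROR;
    }

    *usec_per_tick = sketch_usec_per_tick(ticks_per_second);

    /* The non-zero guarantee comes from this check, not from the arithmetic. */
    return (*usec_per_tick == 0u) ? SKETCH_ERROR : SKETCH_SUCCESS;
}
```

With this formula, 100 ticks per second gives 10000 and exactly 2000000 ticks per second gives 1; any rate above 2000000 gives 0 and is rejected by the check.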
Thanks for reviewing. I recommend simplifying the code also since the complexity doesn't do anything. Concur with updating the comment (and it is likely worth noting in the comment that zero gets checked later). Likely worth documenting somewhere at a top design level this current limitation (1000 Hz). I wonder if the drone folks ran into this, or anyone else trying to run high rate control systems. |
Note there is a factor of 1000 difference here. The "limitation" is not at 1000Hz, but at 1000000Hz (1 MHz) tick frequency, or 1us per tick period. The major thing to note is that since OSAL does delay calculations based on "micro seconds per tick", anything which is not a whole/integer number of microseconds per tick will be subject to rounding errors. There is no way around this. At rates <= 1000Hz the rounding errors will be relatively small and probably not noticeable, but at higher rates they could become more evident (e.g. in the extreme case, if a calculation used 2usec/tick instead of 1.5usec/tick, the delay would be 33% longer than expected). |
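To put numbers on the rounding error, here is a rough sketch that compares the rounded integer value against the exact period. It assumes the same round-to-nearest integer formula discussed in this thread; the function name is hypothetical:

```c
#include <stdint.h>

/*
 * Relative error, in percent, of the rounded integer
 * microseconds-per-tick value versus the exact period.
 * Positive means the rounded value is longer than the real period.
 * The caller must pass a non-zero tick rate.
 */
static double sketch_rounding_error_percent(uint32_t ticks_per_second)
{
    double   exact   = 1000000.0 / (double)ticks_per_second;
    uint32_t rounded = (1000000u + (ticks_per_second / 2u)) / ticks_per_second;

    return 100.0 * ((double)rounded - exact) / exact;
}
```

At 1000 ticks per second the error is zero, at 60 ticks per second it is a tiny fraction of a percent, and at 666666 ticks per second (about 1.5us per tick, rounded to 2) it is roughly 33%.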
Can you clarify? The "rounding" action in the code is useful, I think, to minimize the impact of tick rates that don't equate to an integer number of microseconds per tick. For instance, again in an extreme case, if the period is 1.9 us/tick then the value will become 2, rather than 1, so at least the delay calculation will be closer to the real value. |
Ensuring the configured operating system time per clock tick is a common denominator of all periods is a fundamental and important thing to do during system integration. It is critical that mathematical slop from mismatches and rounding be avoided. In general, delays and operations measured in ticks are guaranteed to be a minimum of the number of ticks for most operating system API specifications. This usually means that a delay for n ticks can be up to slightly less than n+1 ticks.
I recommend that when using RTEMS, people configure the tick quantum so all the math works out for their delays and periods.
Given that this code is likely to be used in systems where this mathematical and rounding error could be critical, it might be worth adding a debug mode which flags when the math works out badly.
--joel
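One possible shape for such a debug check, as a sketch only: the conversion is exact when 1,000,000 is evenly divisible by the tick rate. The function names are hypothetical, and `fprintf` stands in for whatever debug output mechanism the project would actually use:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Returns 1 when the tick period is NOT a whole number of
 * microseconds, i.e. 1,000,000 is not evenly divisible by the
 * tick rate. The caller must pass a non-zero tick rate.
 */
static int sketch_is_inexact_tick(uint32_t ticks_per_second)
{
    return (1000000u % ticks_per_second) != 0u;
}

/* Emits a warning only; an inexact conversion is not treated as an error. */
static void sketch_report_tick(uint32_t ticks_per_second)
{
    if (sketch_is_inexact_tick(ticks_per_second))
    {
        fprintf(stderr,
                "warning: %lu ticks/sec is not a whole number of microseconds per tick\n",
                (unsigned long)ticks_per_second);
    }
}
```

Rates such as 100 or 1000 ticks per second pass silently, while 60 or 1024 ticks per second would be flagged.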
|
@jphickey - This code doesn't do that. 1.9 does not become 2, it becomes 1 with or without this extra logic attempting to round up. 1.1 converts to 1, 1.9 converts to 1. |
@joelsherrill - agreed, debug reporting when a non-exact conversion is made is a great idea. |
@skliper -- are we talking about the same code? I'm looking at this line:
If If you simply compute If you compute the value as coded, you get We could warn about an inexact conversion to micro seconds per tick, but I'm not really sure what it buys you. And it would only show up when debug is enabled and you are looking at the console. It certainly isn't something I would advocate returning OS_ERROR about (which is basically a panic) Edit: Corrected a typo in my original comment. Logic is the same. |
Another Note: Upon further review it is worth mentioning that this For The only thing this is used for is to output a normalized value for the |
@jphickey I think we are both saying the same thing. Any values > 2M return zero. By "doesn't do anything" I should have stated instead "doesn't guarantee non-zero", or "doesn't do anything stated in a rationale or a requirement". If the code doesn't do what the comment states (guarantee non-zero), I'm curious why the code bothers to round up at all. What is the required behavior, vs just implementing a round up to round up? Is rounding up better than truncating? Neither answer is exact, which is why I also concur with the request to add user notification if the result isn't exact. A debug notification is better than nothing, and shows intent. If the rationale is to round up such that the reported accuracy bounds the actual accuracy, I'd consider that sufficient as long as it gets added to the comment. But without a rationale or a requirement it's really not obvious to me why a round up is necessary. |
@jphickey oh, I see now my example wasn't a good one illustrating my concern. I should have said 0.9 becomes 0, in that it doesn't round up to avoid zero (which I guess the comment doesn't say exactly, but it does guarantee non-zero). |
@skliper As I said earlier, I do concur that the code comment is incorrect/inaccurate. It is not a round "up" specifically, it is rounding it to the nearest integer, which could be up or down, in an attempt to produce the closest integer approximation of the real value. The value of 0.9 will get rounded to 1, not 0. (i.e. any fractional part greater than or equal to 0.5 goes to the next higher integer, whereas any fractional part less than 0.5 goes to the next lower integer). I'd like to summarize the discussion here with some type of action item, if possible. I self assigned this ticket but I'm still not entirely sure what I'm changing here, aside from the comment itself. To go back to requirements, perhaps there is one in an internal requirements document somewhere?
Likewise, For OS_Tick2Micros, the API document just says it returns "microseconds per operating tick". In none of these cases does it definitively say what will happen if the operating system tick period is not an exact integer number of microseconds, nor does it even say it cannot be zero. So we don't have a specific "requirement" here to follow.
Here is what I propose as a potential resolution to this issue:
Please confirm if this proposed change will satisfy this concern? |
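For readers following the truncation versus round-to-nearest disagreement in this thread, the two behaviors can be compared side by side. This is an illustrative sketch with hypothetical function names and assumes plain integer arithmetic; it is not the proposed change itself:

```c
#include <stdint.h>

/* Plain integer division: drops the fractional part (truncation). */
static uint32_t sketch_usec_truncated(uint32_t ticks_per_second)
{
    return 1000000u / ticks_per_second;
}

/* Adds half the divisor first, so the quotient rounds to the nearest integer. */
static uint32_t sketch_usec_nearest(uint32_t ticks_per_second)
{
    return (1000000u + (ticks_per_second / 2u)) / ticks_per_second;
}
```

For a period of about 1.9us (526316 ticks per second), truncation gives 1 and round-to-nearest gives 2. For about 0.9us (1111111 ticks per second), truncation gives 0 and round-to-nearest gives 1. For 0.4us (2500000 ticks per second), both give 0, which is why neither form guarantees a non-zero result on its own.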
@jphickey casting 0.9 to an integer = 0. That's all I'm saying. I agree the logic rounds to the nearest integer, and that behavior is fine as long as it's obvious as to why. Since we don't have a requirement, with the comment update I'd prefer
OS_DEBUG action is perfect. Would it be helpful to update the OS_TimerCreate clock_accuracy description (in the header and API doc) to state if needed this value is rounded to the nearest integer (vs floor or ceiling to bound accuracy)? *updated to clarify logic statement (vs "it") |
Updated OS_TimerCreate API documentation to reflect the comments above as part of #364 (partial resolution). Still pending OS_DEBUG action, and the two comment updates noted above. EDIT - added related issue |
@dmknutsen - want to finish the rest of this one? Comments update and debug message. |
Apologies...just realized I missed this. Sure, if it is still open...go ahead and assign it to me. |
Fix #12, Comment update to correct for microseconds not always rounding up + a…
Describe the bug
Spawned from #1
The code comment claims it rounds up to never return zero. The formula implemented doesn’t actually round up in all cases, since generally when casting a float/double to an int you lose the fractional part (truncation, not rounding). So the code is not self-consistent. It’s not a POSIX or OS issue, it’s that the code doesn’t do what it says it does. The API document doesn’t specify a non-zero guarantee.
osal/src/os/posix/ostimer.c
Lines 284 to 290 in bfa7a33
Similar misleading comment at:
osal/src/os/posix/ostimer.c
Lines 231 to 232 in bfa7a33
For what it’s worth, on Linux (our Ubuntu dev system) this code reports 100 ticks per second, and 10000 usec per tick. But if you pass in high values for ticks per second, it does return zero when it claims to round up (try 2000000 ticks per second).
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Expected code to match comment, round up to not equal zero. Algorithm doesn't work as claimed in comment.
System observed on:
Reporter Info
Jacob Hageman/NASA-GSFC