-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LoRaWAN: Mitigating reception issues at lower data rates & FCnt increment after retry exhaustion #8822
Conversation
@AnttiKauppila Please review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One styling change request
features/lorawan/mbed_lib.json
Outdated
}, | ||
"downlink-preamble-length": { | ||
"help": "Number of preamble symbols need to be captured (out of 8) for successful demodulation", | ||
"value": 5 | ||
"help": "Number of preamble symbols expected on receive path. Default: 8", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perfect to align, but please done in the separate commit
Lovely long commit message, but reading it I can't figure out what the error actually was. My understanding was that the transmitted preamble was 8 symbols, but the radio only needed 5 of the 8 to lock on to the signal. So the attempted optimisation seemed to be to target the wake-up to be in time to get at least 5 preamble symbols - the radio doesn't need to necessarily hit the first 3. You seem to be describing my understanding, but then say "We were wrong." but I can't see why we were. ... As the code logic isn't changed, just the number, it now seems that we think we need to capture all 8 of the symbols to lock on? |
features/lorawan/mbed_lib.json
Outdated
"help": "Maximum timing error of the receiver in ms. The receiver will turn on in [-RxError : + RxError]", | ||
"value": 10 | ||
"help": "Maximum timing error of the receiver in ms. The receiver will turn on in [-RxError : + RxError]", | ||
"value": 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is being done in a high-priority thread, I hope? Otherwise I wouldn't expect 1ms precision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wary about this defaulting to 1ms when you are just running on normal priority thread. You will need documentation like "if you're running any normal priority thread code, this may need to be increased". Same applies to deep sleep, which is documented for 10ms wakeup time - may work on certain platforms.
6f72ccd
to
d4949ea
Compare
@0xc0170 The PR is updated. @kjbracey-arm Have updated the PR with our new algorithm as discussed in person. |
@0xc0170 Bump. |
@kjbracey-arm Tested the code base with Deep sleep on (UARTSerial removed), error fudge to be 1 ms and everything works pretty well. |
features/lorawan/mbed_lib.json
Outdated
}, | ||
"downlink-preamble-length": { | ||
"help": "Number of preamble symbols need to be captured (out of 8) for successful demodulation", | ||
"value": 5 | ||
"help": "Number of whole preamble symbols needed to have a firm lock on the signal. Default: 5 + 1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest not changing this - this new version is wrongly stated (you're thinking you need 5 whole symbols, hence 6 symbol times). Do the +1 internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "+ 1" shouldn't be there.
features/lorawan/mbed_lib.json
Outdated
"help": "Maximum timing error of the receiver in ms. The receiver will turn on in [-RxError : + RxError]", | ||
"value": 10 | ||
"help": "Maximum timing error of the receiver in ms. The receiver will turn on in [-RxError : + RxError]", | ||
"value": 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wary about this defaulting to 1ms when you are just running on normal priority thread. You will need documentation like "if you're running any normal priority thread code, this may need to be increased". Same applies to deep sleep, which is documented for 10ms wakeup time - may work on certain platforms.
// Computed number of symbols | ||
*window_timeout = MAX((uint32_t) ceil(((2 * min_rx_symb - 8) * t_symb + 2 * rx_error) / t_symb), min_rx_symb); | ||
*window_offset = (int32_t) ceil((4.0 * t_symb) - ((*window_timeout * t_symb) / 2.0) - wakeup_time); | ||
float min_rx_symbol_overlap_required = float (min_rx_symb); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest doing the +1 to convert from whole symbols to symbol times here.
Although thinking again, seeing as we're carefully calculating to hit the end of the preamble, and rounding up, we know we're aiming to get the whole of the last symbol, and hence should be enclosing whole symbols anyway - we're not aiming into the middle of a sequence, so an extra +1 shouldn't be required. All error terms should have ensured we enclose the whole number. I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's roughly correct. The +1 is the situation is landing in the middle of the preamble with effectively random phase - allow 6 symbol periods to catch 5 whole symbols. But if error is tiny, we're hitting the end quite precisely, so only need to allow 2*error+wakeup+jitter for being a bit earlier than the target, but not a whole symbol earlier. So timeout is min_rx_sym + min(1 symbol, 2*error+wakeup+jitter)
.
That's fine up until the point we turn on earlier than the transmitter. Then we will get its first min_rx_sym
symbols, but have to wait for it to start.
To cover that the timeout is min_rx_sym + worst_case_how_long_before_it_starts
, and how long is the min(0, offset - error - jitter)
number in the other calculation.
So final result is min_rx_sym + max(how_long_before, min(1 symbol, 2*error+wakeup+jitter))
.
|
||
// Actual window offset in ms in response to timing error fudge factor and | ||
// radio wakeup/turned around time. | ||
*window_offset = floor(target_rx_window_offset - error_fudge - wakeup_time); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question for @c1728p9 - does/will the sleep code compensate for the 10ms or whatever deep sleep latency, if a long sleep is requested?
Eg if wait(1000)
or call_in(1000)
, do we get a wake from deep sleep at 990 then wake from shallow at 1000, so we hit 1000 precisely, or do we just schedule an interrupt at 1000 so we potentially wake at 1010?
This code in particular is trying to hit a 1 second delay for a receive window after a transmit with millisecond precision. Can that be achieved without holding a deep sleep lock?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as the RTX knows the wakeup time in advance (via some OS delay/wait call) then it should wakeup correctly with millisecond precision. If you are waking up using the low power ticker or a GPIO pin directly then deep sleep latency compensation won't help, as the OS doesn't know you are wanting to wake up in advance, and you'll need to hold the lock manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, that's what I hoped. I knew you'd been looking at the compensation, but couldn't recall if it was just covering "teeny" sleeps or reached out for long targets.
In this case, it's via an EventQueue::call_in
, so that's a timed semaphore wake, so we're good.
This "sleep latency compensation only via the OS" thing feels like it should be documented somewhere, but not sure where to encourage people to use OS mechanisms rather than LowPowerTimer
s. Some sort of knowledge base article about deep sleep/tickless or something. Or maybe even a note on LowPowerTimer
, which is what people might be thinking of using with deep sleep active.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kjbracey-arm can you create a tracking issue for this one?
ab09a8f
to
d52861d
Compare
@kjbracey-arm All your comments were addressed? Is this ready for CI? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor nits.
It seems this needs more investigation, but this gets closer to it, so let's take this version of the algorithm for further testing.
The ASCII art in LoRaPHY.h must be updated when this is all finalised.
features/lorawan/mbed_lib.json
Outdated
}, | ||
"downlink-preamble-length": { | ||
"help": "Number of preamble symbols need to be captured (out of 8) for successful demodulation", | ||
"value": 5 | ||
"help": "Number of whole preamble symbols needed to have a firm lock on the signal. Default: 5 + 1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "+ 1" shouldn't be there.
@@ -45,7 +45,7 @@ typedef uint32_t lorawan_time_t; | |||
#endif | |||
|
|||
// Radio wake-up time from sleep - unit ms. | |||
#define RADIO_WAKEUP_TIME 1 | |||
#define RADIO_WAKEUP_TIME 1.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a f
if it's going to be here, to avoid double.
@@ -32,8 +32,9 @@ SPDX-License-Identifier: BSD-3-Clause | |||
#define BACKOFF_DC_1_HOUR 100 | |||
#define BACKOFF_DC_10_HOURS 1000 | |||
#define BACKOFF_DC_24_HOURS 10000 | |||
|
|||
#define CHANNELS_IN_MASK 16 | |||
#define MAX_PREAMBLE_LENGTH 8.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More f
suffixes needed (if they need to be float at all)
{ | ||
return (8.0 / (double) phy_dr); // 1 symbol equals 1 byte | ||
return (8.0 / (float) phy_dr); // 1 symbol equals 1 byte |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
8.0f
d52861d
to
9193d37
Compare
@kjbracey-arm Please review again, |
CI started |
Test run: FAILEDSummary: 1 of 1 test jobs failed Failed test jobs:
|
CI restarted, but I suspect this is an issue with the PR. |
Test run: FAILEDSummary: 1 of 1 test jobs failed Failed test jobs:
|
Ooph, that was quick. @hasnainvirk Please refer to the attached CI log. |
Previously we had been incrementing UL frame counter for a CONFIRMED message only when the transmission was deemed successful i.e., we would have received an ack before all the retries would have exhausted. Now we have opted to increment the frame counter if all the retries are exhausted considering the fact that we essentially treat the next message after retry exhaustion as a new packet so we should also increment the frame counter.
A new algorithm has been taken in use to calculate the receive window length and the timing offset involved in opening of the said receive window. This algorithm performs better than the stock algorthm and consumes less power.
Minor style alignment.
Unittest stubs needed to be updated for the API change. Although this API is private and internal to LoRaPHY, somehow it appeared in the LoRaPHYStub. Updating it for consistency.
In this version we try to mitigate a situation when we start listening right in the middle of a preamble sequence (e.g., in high SF case).
Final code cleanup and adding ascii art for the version 2 of the algorithm.
A missing macro definition is added to the unit-test cmake file for LoRaPHY class.
9193d37
to
138fd74
Compare
@cmonr Whoops. Unittest framework doesn't take any of the Mbed OS config files in consideration. Wasn't aware of that. unittest_cmake file contains manual definition if any needed. Fixed it in the latest commit. |
CI started |
Test run: SUCCESSSummary: 11 of 11 test jobs passed |
Description
A new algorithm is proposed and taken in use for calculation of receive window length and
timing offset in opening the said window. This algorithm performs better than the stock algorithm and consumes less power at the same time. Stock algorithm would miss packets while using low data rates as the offset calculation was not suitable enough.
Previously we had been incrementing UL frame counter for a CONFIRMED
message only when the transmission was deemed successful i.e., we would
have received an ack before all the retries would have exhausted.
Now we have opted to increment the frame counter if all the retries are
exhausted considering the fact that we essentially treat the next
message after retry exhaustion as a new packet so we should also
increment the frame counter.
Pull request type