LoRaWAN: Mitigating reception issues at lower data rates & FCnt increment after retry exhaustion #8822

hasnainvirk · 2018-11-20T14:44:17Z

Description

A new algorithm is proposed and taken into use for calculating the receive window length and the timing offset at which the window is opened. This algorithm performs better than the stock algorithm and consumes less power. The stock algorithm would miss packets at low data rates because its offset calculation was not suitable.

Previously we incremented the UL frame counter for a CONFIRMED
message only when the transmission was deemed successful, i.e., when an
ack had been received before the retries were exhausted.
Now we also increment the frame counter when all the retries are
exhausted: the next message after retry exhaustion is essentially
treated as a new packet, so the frame counter should be incremented
as well.
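
To illustrate the frame counter rule described above, here is a minimal sketch. It is not the code from this PR: `ConfirmedUplinkState` and `on_tx_attempt_finished` are hypothetical names, and the retry bookkeeping is simplified to a single counter.

```cpp
#include <cstdint>

// Hypothetical, simplified model of confirmed-uplink bookkeeping.
// None of these names come from the Mbed OS LoRaWAN stack.
struct ConfirmedUplinkState {
    uint32_t ul_frame_counter; // FCnt used for the current uplink
    uint8_t retries_done;      // transmission attempts made so far
    uint8_t max_retries;       // attempt budget for one confirmed message
};

// Call after each transmission attempt has been acked or has timed out.
// Returns true when the current message is finished (acked or abandoned).
bool on_tx_attempt_finished(ConfirmedUplinkState &s, bool ack_received)
{
    s.retries_done++;

    const bool retries_exhausted = (s.retries_done >= s.max_retries);
    if (!ack_received && !retries_exhausted) {
        // A retransmission of the same frame follows: FCnt stays unchanged.
        return false;
    }

    // Either acked, or all retries are used up. In both cases the next
    // uplink is a new frame, so the counter advances.
    s.ul_frame_counter++;
    s.retries_done = 0;
    return true;
}
```

The only behavioural difference from the old rule is the exhausted-without-ack path, which previously would have left the counter untouched.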

Pull request type

[X] Fix
[ ] Refactor
[ ] Target update
[ ] Functionality change
[ ] Docs update
[ ] Test update
[ ] Breaking change

hasnainvirk · 2018-11-20T14:44:50Z

@AnttiKauppila Please review.

0xc0170

One styling change request

0xc0170 · 2018-11-20T15:02:24Z

features/lorawan/mbed_lib.json

        },
        "downlink-preamble-length": {
-        	"help": "Number of preamble symbols need to be captured (out of 8) for successful demodulation",
-        	"value": 5
+            "help": "Number of preamble symbols expected on receive path. Default: 8",


Perfect to align, but please do it in a separate commit.

kjbracey · 2018-11-20T17:23:37Z

Lovely long commit message, but reading it I can't figure out what the error actually was. My understanding was that the transmitted preamble was 8 symbols, but the radio only needed 5 of the 8 to lock on to the signal.

So the attempted optimisation seemed to be to target the wake-up to be in time to get at least 5 preamble symbols - the radio doesn't need to necessarily hit the first 3.

You seem to be describing my understanding, but then say "We were wrong." but I can't see why we were.

... As the code logic isn't changed, just the number, it now seems that we think we need to capture all 8 of the symbols to lock on?

kjbracey · 2018-11-20T17:25:18Z

features/lorawan/mbed_lib.json

-        	"help": "Maximum timing error of the receiver in ms. The receiver will turn on in [-RxError : + RxError]",
-        	"value": 10
+            "help": "Maximum timing error of the receiver in ms. The receiver will turn on in [-RxError : + RxError]",
+            "value": 1


This is being done in a high-priority thread, I hope? Otherwise I wouldn't expect 1ms precision.

Wary about this defaulting to 1ms when you are just running on normal priority thread. You will need documentation like "if you're running any normal priority thread code, this may need to be increased". Same applies to deep sleep, which is documented for 10ms wakeup time - may work on certain platforms.

hasnainvirk · 2018-11-27T11:12:44Z

@0xc0170 The PR is updated. @kjbracey-arm I have updated the PR with our new algorithm, as discussed in person.

hasnainvirk · 2018-11-28T09:44:35Z

@0xc0170 Bump.

hasnainvirk · 2018-11-28T09:54:55Z

@kjbracey-arm Tested the code base with deep sleep on (UARTSerial removed) and the error fudge set to 1 ms, and everything works pretty well.

kjbracey · 2018-11-28T10:12:10Z

features/lorawan/mbed_lib.json

        },
        "downlink-preamble-length": {
-        	"help": "Number of preamble symbols need to be captured (out of 8) for successful demodulation",
-        	"value": 5
+            "help": "Number of whole preamble symbols needed to have a firm lock on the signal. Default: 5 + 1",


Suggest not changing this - this new version is wrongly stated (you're thinking you need 5 whole symbols, hence 6 symbol times). Do the +1 internally.

The "+ 1" shouldn't be there.

kjbracey · 2018-11-28T10:15:44Z

features/lorawan/lorastack/phy/LoRaPHY.cpp

-    // Computed number of symbols
-    *window_timeout = MAX((uint32_t) ceil(((2 * min_rx_symb - 8) * t_symb + 2 * rx_error) / t_symb), min_rx_symb);
-    *window_offset = (int32_t) ceil((4.0 * t_symb) - ((*window_timeout * t_symb) / 2.0) - wakeup_time);
+    float min_rx_symbol_overlap_required = float (min_rx_symb);


Suggest doing the +1 to convert from whole symbols to symbol times here.

Although thinking again, seeing as we're carefully calculating to hit the end of the preamble, and rounding up, we know we're aiming to get the whole of the last symbol, and hence should be enclosing whole symbols anyway - we're not aiming into the middle of a sequence, so an extra +1 shouldn't be required. All error terms should have ensured we enclose the whole number. I think?

I think that's roughly correct. The +1 covers the situation of landing in the middle of the preamble with effectively random phase: allow 6 symbol periods to catch 5 whole symbols. But if the error is tiny, we're hitting the end quite precisely, so we only need to allow 2*error+wakeup+jitter for being a bit earlier than the target, not a whole symbol earlier. So the timeout is min_rx_sym + min(1 symbol, 2*error+wakeup+jitter).

That's fine up until the point we turn on earlier than the transmitter. Then we will get its first min_rx_sym symbols, but have to wait for it to start.

To cover that, the timeout is min_rx_sym + worst_case_how_long_before_it_starts, and "how long" is the min(0, offset - error - jitter) number in the other calculation.

So final result is min_rx_sym + max(how_long_before, min(1 symbol, 2*error+wakeup+jitter)).
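
A rough sketch of that final expression, under some assumptions of mine: all time terms are in milliseconds, the extra margin is rounded up to whole symbols, and `early_start_ms` is a hypothetical non-negative input standing in for "how_long_before". This is not the code that was merged.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Sketch of:
//   timeout = min_rx_symb + max(how_long_before, min(1 symbol, 2*error + wakeup + jitter))
// All *_ms arguments are in milliseconds; the result is in whole symbols.
// early_start_ms (hypothetical) is the worst-case time the receiver is
// already listening before the transmitter starts its preamble (>= 0).
uint32_t rx_window_timeout_symbols(uint32_t min_rx_symb, float t_symb_ms,
                                   float rx_error_ms, float wakeup_ms,
                                   float jitter_ms, float early_start_ms)
{
    // How much earlier than the target we may open the window.
    const float early_margin_ms = 2.0f * rx_error_ms + wakeup_ms + jitter_ms;

    // Landing mid-preamble costs at most one symbol period.
    const float mid_preamble_ms = std::min(t_symb_ms, early_margin_ms);

    // If we open before the transmitter starts, we must also wait for it.
    const float extra_ms = std::max(early_start_ms, mid_preamble_ms);

    return min_rx_symb + static_cast<uint32_t>(std::ceil(extra_ms / t_symb_ms));
}
```

Rounding the extra term up keeps the window conservative; whether the real calculation should round at this point or later is a design choice this sketch does not settle.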

kjbracey · 2018-11-28T11:00:23Z

features/lorawan/lorastack/phy/LoRaPHY.cpp

+
+    // Actual window offset in ms in response to timing error fudge factor and
+    // radio wakeup/turned around time.
+    *window_offset = floor(target_rx_window_offset - error_fudge - wakeup_time);


Question for @c1728p9 - does/will the sleep code compensate for the 10ms or whatever deep sleep latency, if a long sleep is requested?

Eg if wait(1000) or call_in(1000), do we get a wake from deep sleep at 990 then wake from shallow at 1000, so we hit 1000 precisely, or do we just schedule an interrupt at 1000 so we potentially wake at 1010?

This code in particular is trying to hit a 1 second delay for a receive window after a transmit with millisecond precision. Can that be achieved without holding a deep sleep lock?

As long as the RTX knows the wakeup time in advance (via some OS delay/wait call) then it should wakeup correctly with millisecond precision. If you are waking up using the low power ticker or a GPIO pin directly then deep sleep latency compensation won't help, as the OS doesn't know you are wanting to wake up in advance, and you'll need to hold the lock manually.

Great, that's what I hoped. I knew you'd been looking at the compensation, but couldn't recall if it was just covering "teeny" sleeps or reached out for long targets.

In this case, it's via an EventQueue::call_in, so that's a timed semaphore wake, so we're good.

This "sleep latency compensation only via the OS" thing feels like it should be documented somewhere, but not sure where to encourage people to use OS mechanisms rather than LowPowerTimers. Some sort of knowledge base article about deep sleep/tickless or something. Or maybe even a note on LowPowerTimer, which is what people might be thinking of using with deep sleep active.

@kjbracey-arm can you create a tracking issue for this one?

0xc0170 · 2018-12-06T11:48:55Z

@kjbracey-arm All your comments were addressed? Is this ready for CI?

kjbracey

Some minor nits.

It seems this needs more investigation, but this gets closer to it, so let's take this version of the algorithm for further testing.

The ASCII art in LoRaPHY.h must be updated when this is all finalised.

kjbracey · 2018-12-07T09:32:03Z

features/lorawan/system/lorawan_data_structures.h

@@ -45,7 +45,7 @@ typedef uint32_t lorawan_time_t;
 #endif

 // Radio wake-up time from sleep - unit ms.
-#define RADIO_WAKEUP_TIME                           1
+#define RADIO_WAKEUP_TIME                           1.0


Needs an f suffix if it's going to be here, to avoid double.

kjbracey · 2018-12-07T09:35:11Z

features/lorawan/lorastack/phy/LoRaPHY.cpp

@@ -32,8 +32,9 @@ SPDX-License-Identifier: BSD-3-Clause
 #define BACKOFF_DC_1_HOUR       100
 #define BACKOFF_DC_10_HOURS     1000
 #define BACKOFF_DC_24_HOURS     10000
-
-#define CHANNELS_IN_MASK  16
+#define MAX_PREAMBLE_LENGTH     8.0


More f suffixes needed (if they need to be float at all)
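
To make the suffix point concrete, a small sketch of how the literal type propagates through an expression. `kMaxPreambleLength` and `preamble_fraction` are hypothetical names used only for illustration.

```cpp
#include <type_traits>

// Unsuffixed floating-point literals are double; the f suffix makes them float.
static_assert(std::is_same<decltype(8.0), double>::value, "8.0 is a double literal");
static_assert(std::is_same<decltype(8.0f), float>::value, "8.0f is a float literal");

// Mixing a float operand with a double literal promotes the whole
// expression to double, which soft-float targets pay for at run time.
static_assert(std::is_same<decltype(8.0 / 1.0f), double>::value, "double / float -> double");
static_assert(std::is_same<decltype(8.0f / 1.0f), float>::value, "float / float -> float");

// Hypothetical constant and helper that stay in single precision throughout.
constexpr float kMaxPreambleLength = 8.0f;

inline float preamble_fraction(float symbols)
{
    return symbols / kMaxPreambleLength;
}
```

On cores without a double-precision FPU, each promoted expression typically ends up in library routines, so the suffix matters for more than style.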

kjbracey · 2018-12-07T09:35:40Z

features/lorawan/lorastack/phy/LoRaPHY.cpp

 {
-    return (8.0 / (double) phy_dr); // 1 symbol equals 1 byte
+    return (8.0 / (float) phy_dr); // 1 symbol equals 1 byte


hasnainvirk · 2018-12-07T10:49:01Z

@kjbracey-arm Please review again.

cmonr · 2018-12-07T19:00:47Z

CI started

mbed-ci · 2018-12-07T19:07:30Z

Test run: FAILED

Summary: 1 of 1 test jobs failed
Build number : 1
Build artifacts

Failed test jobs:

jenkins-ci/mbed-os-ci_unittests

cmonr · 2018-12-07T20:39:36Z

CI restarted, but I suspect this is an issue with the PR.

mbed-ci · 2018-12-07T20:46:04Z

Test run: FAILED

Summary: 1 of 1 test jobs failed
Build number : 2
Build artifacts

Failed test jobs:

jenkins-ci/mbed-os-ci_unittests

cmonr · 2018-12-07T20:51:05Z

Ooph, that was quick.
And unittests don't upload their logs yet. That's a shame.

@hasnainvirk Please refer to the attached CI log.
consoleText.log

hasnainvirk · 2018-12-10T06:55:56Z

@cmonr Whoops. The unittest framework doesn't take any of the Mbed OS config files into consideration; I wasn't aware of that. The unittest_cmake file contains manual definitions, if any are needed. Fixed it in the latest commit.

cmonr · 2018-12-10T20:32:50Z

CI started

mbed-ci · 2018-12-11T01:24:19Z

Test run: SUCCESS

Summary: 11 of 11 test jobs passed
Build number : 3
Build artifacts

0xc0170 requested a review from AnttiKauppila November 20, 2018 14:47

0xc0170 added the needs: review label Nov 20, 2018

AnttiKauppila approved these changes Nov 20, 2018

0xc0170 requested changes Nov 20, 2018

kjbracey reviewed Nov 20, 2018

0xc0170 added needs: work and removed needs: review labels Nov 21, 2018

mattbrown015 mentioned this pull request Nov 22, 2018

LoRaWAN: Uplink Frame Counter Not Incremented After Last Retry of Confirmed Message #8720

Closed

cmonr added the release-version: 5.11.1 label Nov 26, 2018

hasnainvirk force-pushed the dr0_and_fcnt_issue branch 4 times, most recently from 6f72ccd to d4949ea Compare November 27, 2018 11:10

0xc0170 approved these changes Nov 28, 2018

0xc0170 added needs: review and removed needs: work labels Nov 28, 2018

kjbracey reviewed Nov 28, 2018

0xc0170 added needs: work and removed needs: review labels Nov 28, 2018

cmonr added needs: review and removed needs: work labels Nov 29, 2018

hasnainvirk force-pushed the dr0_and_fcnt_issue branch 3 times, most recently from ab09a8f to d52861d Compare December 3, 2018 15:07

kjbracey reviewed Dec 7, 2018

hasnainvirk force-pushed the dr0_and_fcnt_issue branch from d52861d to 9193d37 Compare December 7, 2018 10:48

kjbracey approved these changes Dec 7, 2018

cmonr added needs: CI and removed needs: review labels Dec 7, 2018

cmonr added needs: work and removed needs: CI labels Dec 7, 2018

Hasnain Virk added 7 commits December 10, 2018 08:53

Mitigating reception problems with lower data rates

88490fb

A new algorithm has been taken into use to calculate the receive window length and the timing offset involved in opening the said receive window. This algorithm performs better than the stock algorithm and consumes less power.

Style alignment

3ec643f

Minor style alignment.

Updating get_rx_window_params() API in stubs

ca08367

Unittest stubs needed to be updated for the API change. Although this API is private and internal to LoRaPHY, somehow it appeared in the LoRaPHYStub. Updating it for consistency.

RX window calculation algorithm version 2

5170daa

In this version we try to mitigate a situation when we start listening right in the middle of a preamble sequence (e.g., in high SF case).

Final cleanup and ASCII art for algorithm v2

e1e48b4

Final code cleanup and adding ascii art for the version 2 of the algorithm.

Unit test fix

138fd74

A missing macro definition is added to the unit-test cmake file for LoRaPHY class.

hasnainvirk force-pushed the dr0_and_fcnt_issue branch from 9193d37 to 138fd74 Compare December 10, 2018 06:53

cmonr added needs: CI and removed needs: work labels Dec 10, 2018

cmonr added ready for merge and removed needs: CI labels Dec 11, 2018

cmonr merged commit 3875ac1 into ARMmbed:master Dec 11, 2018

cmonr removed the ready for merge label Dec 11, 2018
