lwIP is used for creation of a network interface of YAPicoprobe to avoid an uncountable number of additional CDC COM ports.
Currently a Segger SysView server is listening on port 19111. But I guess there is more to come.
lwIP had some traps waiting for me. Despite reading the official Common Pitfalls I took some of them.
Do not use a random MAC! At least not for the first byte.
I was "lucky" choosing one with an odd first byte. Unfortunately
bit 0 marks a group address. At least Linux rejects such MAC
addresses with a more or less useless error message in the syslog.
Took a while until I found out that the MAC address was the culprit
for no communication.
Finally I have used 0xfe as the first byte and the remaining five
bytes of the MAC address are copied from the last bytes of Picos serial number
(which actually is the serial number of the external flash).
really take it serious what they are writing about "raw" APIs
and only using the TCPIP thread for any calls to it.
Use tcpip_callback_with_block(<func>, <void-arg>, 0)
for
non-blocking invocations of some function. This also saves you
from creating extra simple threads for communication tasks line
in the TineUSB/lwIP glue code. That was my first idea and it took
a while until I found out how bad that idea was.
Same was true for the thread(!) which stuffed data for SysView into
it’s server: bad idea!
Effect of wrong API handling were random crashes or connection
disruptions.
To be honest, I’m very confused about the system behaviour of RNDIS/ECM/NCM.
Sometimes the host gets "disconnected", because the DNS in /etc/resolv.conf
are changed, sometimes not. Sometimes the probe needs dhclient
to get
the TCP/IP connection, sometimes not. Sometimes the probe has an IPv6 address, sometimes
not. And this all just on my Linux host. Interoperation with Windows
makes things even worse.
And /etc/network/interfaces
generates error
messages even when the device has allow-hotplug
.
-
RNDIS: this was my former favorite, because it is supported by all relevant OSs. Also throughput seemed to be good. Unfortunately RNDIS seems to manipulate routing in a way that the default route on my Linux wants to go through the probe. Not really what I want…
Note
|
RNDIS on Win10 works only, if RNDIS on the probe is the only USB class selected.
So it is possible to create a special probe firmware which provides only features
like SystemView, but you cannot have a probe which does SystemView and CMSIS-DAP. |
-
ECM: works good, packets are transferred continuously, throughput also seems to be ok. So this is the way to go.
Unfortunately there is no driver integrated into Win10, so possible extra trouble appears. Yes… extra trouble: cannot find a driver for Win10. -
NCM: is said to be the best choice. And in fact it is. At least after creation of a
ncm_device_simple.c
driver which is a stripped down version ofncm_simple.c
which revealed as very buggy.
Now thoughput under Linux and Windows is ok. Operation with SystemView works without glitches,iperf
tests sometimes crashes the probe. So consider this driver as beta and work in progress.
To measure performance, iperf
is used (which implies, that OPT_NET_IPERF_SERVER
must be set on built). Good command line for measurement:
iperf -c 192.168.14.1 -e -i 1 -l 1024
Good test cases are the following command lines:
for MSS in 90 100 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1459 1460 1500; do iperf -c 192.168.14.1 -e -i 1 -l 1024 -M $MSS; sleep 10; done
for LEN in 40000 16000 1400 1024 800 200 80 22 8 2 1; do for MSS in 90 93 100 150 200 255 256 300 400 500 511 512 600 700 800 900 1000 1100 1200 1300 1400 1450 1459 1460 1500; do iperf -c 192.168.14.1 -e -i 1 -l $LEN -M $MSS; sleep 2; done; done
Monitor performance/errors with Wireshark.
I’m really trying hard to switch context between lwIP and TinyUSB correctly. This leads to some kind of delayed call chains and also does not make the code neither nice nor very much maintainable.
TinyUSB NCM driver implementations is more or less buggy, so I’m doing my best implementing a driver on my own.
Current work consists of debug versions of ncm_device
and an almost
working ncm_device_simple
.
Link:
ncm_device_simple.c
is actually a mixture of ecm_rndis_device.c
and ncm_device.c
.
From ecm_rndis_device.c
the structure has been inherited and from ncm_device.c
the
functionality.
The driver can be considered work in progress, because in conjunction with iperf
crashes sometimes happen. But for operation with SystemView quality seems to be good enough.
Warning
|
There must be a nasty bug in ncm_simple_device . It reveals on startup
because startup time with ncm_device_simple is much longer compared to ECM/RNDIS and even
ncm_device .
|
This is more or less obsoleted by ncm_device_simple.c
. But as a short summary: the original
driver is very buggy. Perhaps it is working in certain scenarios, but for sure not together with
SystemView.
-
not sure, but perhaps it is best to call all functions within ncm_device in the FreeRTOS context of TinyUSB
-
wNtbOutMaxDatagrams
must be set to 1 [2023-06-27]-
iperf runs then
-
Systemview still has problems
-
wNtbOutMaxDatagrams == 0
generates a lot of retries with iperf
-
-
I guess that the major problem lies within handle_incoming_datagram() because it changes values on an incoming packet although tud_network_recv_renew() is still handling the old one
-
is there multicore a problem!? (14.7.2023: no!) I have seen retries with multicore even with
wNtbOutMaxDatagrams = 1
-
I think it is assumed, that TinyUSB and lwIP are running in the same task (but in my scenario they don’t)
-
if removing debug messages, then the receive path seems to work better, which indicates a race condition somewhere
There is an open issue in the TinyUSB repo for this issue: hathach/tinyusb#2068
The following API is for calls from TinyUSB to the driver. The calls are all initiated from within the TinyUSB stack. Thus all are done in the context of TinyUSB.
Name | Comment |
---|---|
netd_init() |
Initialization of the driver on startup. Called several times. |
netd_reset(rhport) |
Called several times on startup. |
netd_open(rhport, *itf_desc, max_len) |
Connects the USB endpoints. This is called once when the host driver connects with the device. |
netd_control_xfer_cb(rhport, stage, *request) |
called after |
netd_xfer_cb(rhport, ep_addr, result, xferred_bytes) |
Depending on
|
The driver has a whole bunch of available API calls. The most important are:
Name | Comment |
---|---|
tud_control_status() |
Send STATUS (zero length) packet. Called in |
tud_control_xfer() |
Carry out Data and Status stage of control transfer. Called in |
usbd_edpt_busy() |
Check whether an endpoint is busy or ready for the next |
usbd_edpt_open() |
Used during |
usbd_edpt_xfer() |
Submit a USB transfer. For receive operation, the specified buffer must be empty.
For transmit operation, the buffer may not be touched, until the corresponding
|
usbd_open_edpt_pair() |
Used during |
The following API is for call from glue logic to the driver. The glue logic tries hard to issue the calls in the TinyUSB context as well. But this is not guaranteed I’m afraid (other developers).
Name | Comment |
---|---|
tud_network_can_xmit(size) |
check if the driver buffer allows another datagram with the specified size.
If the driver tells the glue logic that there is space enough for the datagram, the glue logic
calls in the next step |
tud_network_link_state_cb(state) |
seems to be obsolete. No call found within the stack. So do not implement. PR at TinyUSB pending to remove the call. |
tud_network_recv_renew() |
Called when the glue logic has the opinion that the driver should check if it
can enable receive logic. The function has to check, if the USB channel
and receive buffer are available. Another option (for NCM) is, that there are
still buffered datagrams which can then be transferred via |
tud_network_xmit(*ref, arg) |
The glue logic requests a datagram transfer into the driver. The driver may then
prepare for the actual copy operation from glue logic which is performed via
|
The glue logic also provides some API which has to be used by the driver. The driver always calls the glue logic in the TinyUSB context.
Name | Comment |
---|---|
tud_network_recv_cb(*src, size) |
Transfer a single datagram from the driver to the glue logic. When the layer above the glue logic (lwIP)
has handled the datagram, it issues a |
tud_network_xmit_cb(*dst, *ref, arg) |
The driver calls this function from |
The following table holds a comparison between the several network drivers. The first seven bars are
created with
for MSS in 100 200 400 800 1200 1450 1500; do iperf -c 192.168.14.1 -e -i 1 -M $MSS -l 8192 -P 1; sleep 2; done
.
After that SystemView is started with almost maximum load (~85000 events/s, 325 KByte/s) and after a while
iperf is started again in parallel.
The images are recorded with Wireshark. The red bars are "TCP Window Full" if not otherwise noted.
Driver | |
---|---|
ECM |
|
RNDIS |
|
Original NCM |
|
Simple NCM |
|
New NCM |
So obviously ncm_device_new
is the clear winner: best in performance, best in functionality, best in compatibility.
What else is needed?
-
for unknown reasons the probe is even with ECM in stutter mode, don’t know why, that worked before smoothly. Transfer rate is bad
-
systemview test program (NoOS) on the target:
-
it already worked with around 10000 events/s, now the limit is ~3000
-
if there is a SysTick ISR then SystemView is completely messed up. Checked that locking is included. Seems to be so.
-
-
after some changes to
rtt_console.c
,net_sysview.c
andnet_glue.c
ECM is working again as expected
-
for debugging purposes reimplemented
ncm_device_simple.c
which can hold only one ethernet frame per NTB (NCM Transfer Block). This unfortunately requires that the originalncm_device.c
must be outcommented via#if
on top.
-
did some performance tuning with lwIP and TinyUSB
-
stripped sources
-
BUG:
ncm_device_simple
sometimes crashes withiperf
-
BUG: with
ncm_device_simple
startup time of the probe is much longer compared to ECM/RNDIS or evenncm_device
. With startup time I mean the time until there is something visible on the probes debug output. For ECM/RNDIS/ncm_device this is almost instantly, withncm_device_simple
it takes ~10s!
→ reverted toncm_device
because SystemView runs without problems with it
→ solved withncm_device_new
-
new driver:
ncm_device_new
-
works (better then
ncm_device_simple
), but-
✓ problems, if
wNtbOutMaxDatagrams!=1
→ see comment -
✓ iperf also shows problems if
-P
is > 1. I guess this is an iperf problem, because iperf and SystemView are running parallel without such errors -
❏ surprisingly
iperf
performance is much better with actual firmware.cmake-create-debugEE
has just half performance -
✓ but all these problems also exists with
ncm_device
. Is it in the glue code? Possible, because the effect is also with ECM driver. No, not in the glue code, because iperf and SystemView work in parallel -
✓ packets/s is changing heavily, setting
wNtbOutMaxDatagrams==0
helped to prevent raising of packet rate (sometimes there are two datagrams in one NTB even with SystemView)
-
-
how to continue?
-
✓ need a test case where
tud_network_can_xmit()
collects datagrams. Currently there is always only one active xmit datagram, perhapsiperf
with-P 4
does it.
→ this all happens under load testing -
✓ check if there is a problem in the glue code for datagram reception. Glue buffer freed too soon? No, I doubt it. But examples are few.
-
-