-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESP32: Port net module to netconn API. #2047
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks great, just a handful of comments and questions. Looks like using netconn
simplifies things in this module, which is a nice change. Dare I ask about the whole PLUS / MINUS thing of the events? Or am I better off not knowing? :)
components/modules/net.c
Outdated
{ | ||
lnet_event *ev = (lnet_event *)malloc (sizeof (lnet_event) + p->len); | ||
lnet_event *ev = (lnet_event *)malloc (sizeof (lnet_event) /*+ p->len*/); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nuke the + p->len
completely? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course!
|
||
switch (type) { | ||
case TYPE_TCP_CLIENT: | ||
ud->client.cb_connect_ref = LUA_NOREF; | ||
ud->client.cb_reconnect_ref = LUA_NOREF; | ||
ud->client.cb_disconnect_ref = LUA_NOREF; | ||
ud->client.hold = 0; | ||
ud->client.num_held = 0; | ||
ud->client.connecting = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should ud->client.closing
also be initialised here, or is that only relevant for UDP?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's initialized for TCP by fall-through to UDP case ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, missed the fall-through, sorry!
components/modules/net.c
Outdated
{ | ||
SYS_ARCH_DECL_PROTECT(lev); | ||
|
||
if (!netconn) return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest moving the early return up above the SYS_ARCH_DECL_PROTECT
- at first reading it had me worried about leaving something locked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
components/modules/net.c
Outdated
// connection established, trigger Lua callback | ||
ud->client.connecting = false; | ||
post_net_connected(ud); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line left intentionally blank? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, to take a deep breath 😁
components/modules/net.c
Outdated
// new connection available from netconn_listen() | ||
if (ud->netconn && | ||
ud->self_ref != LUA_NOREF && | ||
ud->server.cb_accept_ref != LUA_NOREF) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if we haven't got the self_ref
or cb_accept_ref
? Do we leak connections? Should we do accept+immediate-close if we can't accept for real?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, good point, maybe. Never checked that. Is it possible to trigger this corner case at all? Not with net.listen()
, it enforces a callback. Probably with a race when a client connects while the sever is being closed...
I can't accept right here, would need to feed this event down the complete task pipeline and handle in laccept_cb()
. I observed nasty things happening when netconn_
functions are called from within lwip's thread. Symptoms are
I (25309) wifi: bcn_timout,ap_probe_send_start
I (27819) wifi: ap_probe_send over, resett wifi status to disassoc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove the check for self_ref
since it's not used at all in laccept_cb
. It would open up a race condition in net_listen()
when netconn_listen()
was executed before the ref is actually set.
The check for cb_accept_ref
fails when the server is in the process of being closed / shut down. Not accepting is ok in this case since the pending client request was stored in the server's netconn accept-mailbox and is going to be tcp_abort
'ed & freed when the server netconn is finally closed and deleted.
if (ud->client.cb_sent_ref != LUA_NOREF) { | ||
lua_rawgeti(L, LUA_REGISTRYINDEX, ud->client.cb_sent_ref); | ||
lua_rawgeti(L, LUA_REGISTRYINDEX, ud->self_ref); | ||
lua_call(L, 1, 0); | ||
} | ||
} else if (ud->type == TYPE_TCP_CLIENT) { | ||
err = tcp_write(ud->tcp_pcb, data, datalen, TCP_WRITE_FLAG_COPY); | ||
size_t bytes_written; | ||
err = netconn_write_partly(ud->netconn, data, datalen, NETCONN_COPY, &bytes_written); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably bytes_written
should be checked and we should do something if we can't send it all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to check what happens in this case.
if (ud->netconn && ud->type == TYPE_TCP_CLIENT && !ud->client.connecting) { | ||
ud->client.connecting = true; | ||
netconn_connect(ud->netconn, addr, ud->port); | ||
} else if (!ud->netconn && ud->client.wait_dns == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you really want ud->netconn
, not !ud->netconn
here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's existing logic, I just replaced !ud->pcb
with !ud->netconn
.
Not sure which corner case is covered here: no pcb/netconn and no pending dns request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I misread that line! I somehow saw it as !ud->netconn && ud->netconn.wait_dns
and thus took the first part to be a null pointer guard.
|
||
if (p) { | ||
netbuf_delete(p); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does payload
need to be free'd explicitly, or is it linked to the netbuf
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latter. netbuf_data()
just provides the data pointer and length of the embedded pbuf, while netbuf_delete()
frees both the netbuf and the pbuf.
components/modules/net.c
Outdated
|
||
case NETCONN_EVT_ERROR: | ||
post_net_err(ud, netconn_err(ud->netconn)); | ||
ud->client.closing = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this flag being set, but I don't see us acting on it to force a close. Is that not needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
netconn
triggers the callback multiple times in case of an error and when the connection goes down in general. The flag is used some lines above to prevent us from further posting to the Lua task. Lua cbs would operate on outdated/closed/freed userdata otherwise.
components/modules/net.c
Outdated
if (!netconn) return; | ||
|
||
SYS_ARCH_PROTECT(lev); | ||
if (netconn->socket < 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing initialises netconn->socket
to -1
from what I can see? Or does the netconn
layer do that implicitly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's even a "feature" of netconn
to support the socket
layer.
I borrowed this logic to record early receive events from there.
Thanks for the review, Johny. I'll follow up on the pending topics soon.
The PLUS/MINUS event mimic appears to be specifically tailored to the needs of the
Raw events from
Err, this was the worst part in the porting job, really 😃 |
and set it consistently also for intended netconn close to prevent any cb activity
I've pushed the last update to your review feedback. |
Can someone please point me towards the documentation of this newly committed net module for esp32. Thanks! |
It's the same Lua interface as before, just the underlying implementation that changed. Just keep on using |
Except there's no |
See #2053. |
@AadiMehta thanks to @devsaurus there's now https://nodemcu.readthedocs.io/en/dev-esp32/en/modules/net/ |
@devsaurus @marcelstoer Thanks a lot! I am in the process of migrating nodemcu lua code from ESP8266 to ESP32, and this helps a lot to debug issues. Also eagerly waiting for other modules support in ESP32! |
Fixes #1970, #2014, #2044.
docs/en/*
.This PR ports the
net
module from raw API to netconn API in order to safeguard the inter-process communication between the NodeMCU task and lwip's tcpip_thread tasks. Refer to the Multithreading section in http://git.savannah.nongnu.org/cgit/lwip.git/tree/doc/rawapi.txt.While this generated a high number of changes in the source file, the overall code structure almost completely remained the same:
tcp_*
andudp_*
functions were replaced with theirnetconn_
counterpartsl*_cb
functionsThe netconn API provides only one single callback function for event handling to the application code in contrast to the dedicated callbacks of
tcp_accept
,tcp_send
etc. This generated the main structural difference since all events are fed into this single callback function first and have to be de-muxed to the specific handlers (which already existed and were re-used).I tried to consider @djphoenix's results regarding sockets in TIME_WAIT state from #1838 but could only implement one of the two aspects:
LWIP_SO_REUSE
is enabled by default in sdkconfig.defaults but there's currently no clean way to decreaseTCP_MSL
in the sdk source tree. There's pending PR espressif/esp-idf#783 to exposeTCP_MSL
in menuconfig - once it's done we can set it to 5000 in sdkconfig.defaults as well.Thus expect the default timeout of ~2 minutes until a closed server socket is finally deleted and heap is freed. Manually modify
sdk/esp32-esp-idf/components/lwip/include/lwip/lwip/priv/tcp_priv.h
if you need a workaround.