Extend "WiFi" class with support for access point mode #57

Closed
squonk11 opened this issue Aug 10, 2019 · 33 comments

Comments

@squonk11
Contributor

In my application for wifi both modes (station and access point) are needed. The current implementation of the WiFi class supports the station mode only. So, an extension of the WiFi class for access point mode is needed. I will try to implement it and provide a PR.

@PerMalmberg
Owner

Does this mean a captive portal etc, or is there another purpose?

@squonk11
Contributor Author

In the first use case I want to use the ESP32s for some home automation (switching lights, garden pump, robot mower etc.) - in this case the station mode is needed. In a second use case I plan to implement a wireless module for an industrial device; here the ESP32 needs to be in access point mode and communicates with the device via a serial interface. In this application the performance (throughput) is quite important in order to have a fluid user experience. I need to transmit small telegrams (~20 bytes) in request/response mode as fast as possible (e.g. <10 ms), and I need websocket communication to reduce the overhead. Additionally I plan to transmit bigger data chunks (~170 kB?) from the device to the browser - the faster the better.
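To put the overhead argument in numbers, here is a minimal sketch of the WebSocket framing cost defined in RFC 6455, section 5.2. The function names are made up for illustration and are not part of Smooth. It only counts the WebSocket frame itself, not TCP/IP headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Size in bytes of a WebSocket frame header for a given payload length.
// Frames sent by a browser carry a 4-byte masking key; server frames do not.
inline std::size_t ws_header_size(std::uint64_t payload_len, bool masked)
{
    std::size_t size = 2; // FIN/opcode byte plus mask-bit/7-bit-length byte
    if (payload_len > 65535) {
        size += 8; // 64-bit extended payload length
    } else if (payload_len > 125) {
        size += 2; // 16-bit extended payload length
    }
    if (masked) {
        size += 4; // masking key
    }
    return size;
}

// Total frame size: header plus payload.
inline std::uint64_t ws_frame_size(std::uint64_t payload_len, bool masked)
{
    return ws_header_size(payload_len, masked) + payload_len;
}
```

With these rules a 20-byte telegram occupies 22 bytes from server to browser and 26 bytes from browser to server, while a single frame carrying a payload above 65535 bytes needs a 10-byte header when unmasked.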

@PerMalmberg
Owner

Industrial device? Now it got serious :)

As to the response times, I've not done any real benchmarking, have you?

@squonk11
Contributor Author

The industrial device is an electrical drive. The wireless interface for it in the first step shall be some kind of a proof of concept in order to check the possibilities it will give. But for the moment it is just my personal hobby. Then we will see. If it shall become a product we need to find a solution how to deal with the license of this library (GPL). Either the complete source code must be published or a different library has to be developed.
Until now I did not do any benchmarking with this library. Two years ago I experimented a little bit with mongoose - just CGI no websocket. The 10ms were not possible - but almost, as far as I remember. Concerning the speed I have some doubt about the wide use of STL etc. in Smooth - I am not sure about the overhead this might have. I will see.

@PerMalmberg
Owner

If it shall become a product we need to find a solution how to deal with the license of this library (GPL)

If it comes to that I'm open for discussion.

Concerning the speed I have some doubt about the wide use of STL etc. in Smooth - I am not sure about the overhead this might have.

I'd dare say you can stop worrying about that. It's an old fallacy that keeps popping up for no good reason. Sure, it might be possible to write specialized algorithms that perform better for a specific task, but generally the STL algorithms do a better job than most programmers ever will.

@squonk11
Contributor Author

I am currently implementing the access point support. Here I have a few questions:

  1. I see that you set the status information "GOT_IP" in core::ipc::Publisher within the wifi event handler. Is something like this also needed for the access point mode? What do you do internally with this status?

  2. For the event system you are using system_event_t. As far as I understand the esp-idf documentation, system_event_t is marked as legacy. Would it be better to use the esp_event_t API? In that case the IDFApplication class would also need to be reworked.

@PerMalmberg
Owner

There are a few subscribers to the GOT_IP, but mainly the SocketDispatcher that handles all sockets. It's the signal it uses to know when to restart server sockets etc. For sockets to function correctly, GOT_IP and DISCONNECTED must be published accordingly.
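The relationship described above can be sketched as a small publish/subscribe pair. All names here (NetworkEvent, StatusPublisher, DispatcherSketch) are hypothetical stand-ins and not Smooth's real types; the sketch only shows why both GOT_IP and DISCONNECTED have to be published for the subscriber's state to stay correct.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are not Smooth's real types.
enum class NetworkEvent { GOT_IP, DISCONNECTED };

class StatusPublisher
{
public:
    using Handler = std::function<void(NetworkEvent)>;

    void subscribe(Handler handler) { handlers.push_back(std::move(handler)); }

    // Deliver the event to every subscriber in registration order.
    void publish(NetworkEvent event) const
    {
        for (const auto& handler : handlers) {
            handler(event);
        }
    }

private:
    std::vector<Handler> handlers;
};

// Plays the role the thread assigns to SocketDispatcher: server sockets
// run only between a GOT_IP and the next DISCONNECTED.
class DispatcherSketch
{
public:
    explicit DispatcherSketch(StatusPublisher& publisher)
    {
        publisher.subscribe([this](NetworkEvent event) { on_event(event); });
    }

    bool servers_running() const { return running; }

private:
    void on_event(NetworkEvent event)
    {
        running = (event == NetworkEvent::GOT_IP);
    }

    bool running = false;
};
```

If GOT_IP is never published, servers_running() stays false, which matches the later observation in this thread that no connection to the server is possible without it.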

Yes, it is legacy, I haven't gotten around to change it to the new API. I suggest you stick to the legacy one, then it can be changed out later as part of another enhancement effort.

@squonk11
Contributor Author

I now implemented the support for wifi in access point mode. So far it was not so difficult. But unfortunately I am encountering a problem when disconnecting a station from this access point. In the log I see the error message:

I (89855) wifi: station: 60:67:20:2b:ba:fa leave, AID = 1, bss_flags is 134243, bss:0x3f818664
I (89856) wifi: new:<1,0>, old:<1,0>, ap:<1,1>, sta:<255,255>, prof:1
I (89858) SoftAP: station 60:67:20:2b:ba:fa leave, AID=1
W (89859) SocketDispatcher: Station disconnected or IP lost, closing all sockets.
I (89861) Socket: [0.0.0.0, 8080, 54, 0x3ffe58f4]: Server stopping
E (89863) SocketDispatcher: Shutdown error: Invalid argument
I (92174) SocketDispatcher: Active sockets: 0

I guess this means some additional action is needed in the SocketDispatcher, but I currently cannot work out how to react to this event.
Do you have a hint for me?

@PerMalmberg
Owner

Hm, so this is when, for example, a phone disconnects its WiFi from the ESP32? I'd not expect the disconnected event to be received then, since the network is still intact, right? Or is it that the SoftAP can only handle a single client and the network is restarted when one connects?

As to the error itself, I think the case is that the SocketDispatcher receives the disconnected event after the socket already has been closed by the underlying network layer, i.e. the socket ID (54 in this case) is no longer a valid ID. I don't think the error itself is critical, but it'd of course be best if it didn't happen. Is there an event to listen to when the SoftAP net is closed that comes before the one triggering the error? If so, listening to that too and closing sockets then might help.
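One way to make such a late shutdown harmless is to treat "socket already gone" as success. The following is a minimal sketch using the POSIX shutdown() call; the function name is hypothetical and this is not how Smooth's SocketDispatcher is known to work. Including EINVAL in the tolerated set is an assumption based only on the "Invalid argument" text in the log above.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/socket.h>

// Shut down both directions of a socket. Returns true when the socket was
// shut down or was already gone, false for any other failure.
bool shutdown_tolerant(int fd)
{
    if (::shutdown(fd, SHUT_RDWR) == 0) {
        return true;
    }

    // errno values taken to mean "nothing left to shut down". EINVAL is
    // listed only because the log reports "Invalid argument" (assumption).
    return errno == EBADF || errno == ENOTCONN || errno == EINVAL;
}
```

Calling it with a descriptor that is no longer valid then reports success instead of logging an error.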

@squonk11
Contributor Author

I think the wifi-event SYSTEM_EVENT_AP_STADISCONNECTED occurs before the SocketDispatcher event. I can try to close the socket in the wifi event handler. What do I have to call in order to close the socket?

@PerMalmberg
Owner

PerMalmberg commented Aug 13, 2019 via email

@squonk11
Contributor Author

I think here the problem is:

  • every time a station connects to or disconnects from this AP all existing socket connections are closed (in void SocketDispatcher::event(const NetworkStatus& event)). This does not make sense for an AP since there might be other clients connected which should not be closed. Only those Sockets should be closed which are coming from the client which disconnects.

I am trying to find out how this can be accomplished. For this it seems I have to dig deeper into the implementation of the socket connections and the according events.

@PerMalmberg
Owner

Perhaps it's as simple as not sending the disconnected event at all when in AP-mode? Or at least only when the AP is taken offline? Any socket that otherwise disconnects from a client will be closed and properly handled anyway.

@squonk11
Contributor Author

I tested the following:

  1. remove the emission of the GOT_IP and DISCONNECTED status from the wifi events SYSTEM_EVENT_AP_STACONNECTED and SYSTEM_EVENT_AP_STADISCONNECTED handling. In this case I am not able to connect to the server. So, at least GOT_IP is needed.
  2. Then I put GOT_IP emission into the handling of the event SYSTEM_EVENT_AP_START. In this case I can connect and see the web page. But after the loading of the page is finished I see in the log the message I (693042) wifi: ampdu: ignore deleting tx BA0 and the connection becomes very slow. I saw this discussion on this issue. It seems to be a known problem. I need to read more about it but not this evening.

@PerMalmberg
Owner

Yeah, GOT_IP still needs to be published, otherwise the SocketDispatcher won't know that there's an active network.
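Putting the suggestions so far together, one possible mapping for AP mode publishes GOT_IP when the access point starts, DISCONNECTED when it stops, and nothing when an individual station joins or leaves. This is a sketch of that idea only, not necessarily what ends up in the PR. The enums are local stand-ins: ApEvent mirrors the legacy ESP-IDF event IDs SYSTEM_EVENT_AP_START, SYSTEM_EVENT_AP_STOP, SYSTEM_EVENT_AP_STACONNECTED and SYSTEM_EVENT_AP_STADISCONNECTED, and Publish is a hypothetical name.

```cpp
#include <cassert>

// Local stand-ins for the legacy ESP-IDF access point events.
enum class ApEvent { AP_START, AP_STOP, AP_STACONNECTED, AP_STADISCONNECTED };

// What, if anything, to publish to network status subscribers.
enum class Publish { NOTHING, GOT_IP, DISCONNECTED };

Publish status_for(ApEvent event)
{
    switch (event) {
        case ApEvent::AP_START:
            return Publish::GOT_IP;       // the AP network exists from now on
        case ApEvent::AP_STOP:
            return Publish::DISCONNECTED; // the whole network went away
        case ApEvent::AP_STACONNECTED:
        case ApEvent::AP_STADISCONNECTED:
            return Publish::NOTHING;      // other clients are unaffected
    }
    return Publish::NOTHING;
}
```

With this mapping a single client leaving the AP no longer causes all sockets to be closed.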

I've seen those ampdu-messages too, but never given them any thought as they are logged as "informational".

@squonk11
Contributor Author

After fighting with some stability problems I updated my http_server_test with your latest version. Now I was able to get the wifi AP code running.
But during the testing I detected a problem which also exists in station mode: if I try to start a websocket connection while the web page (with the 1000 icons) is still loading, the websocket connection doesn't get established (probably due to missing free socket connections - which is o.k. so far). But sometimes after the web page finished loading the ESP32 crashes:

I (444124) HTTPServer: Request: GET: '/img/lightbulb_off.png'
I (444166) HTTPServer: Reply: Not Modified
W (445481) ServerSocket: No client available at this time
W (446992) ServerSocket: No client available at this time
W (448495) ServerSocket: No client available at this time
W (448929) SocketDispatcher: Receive timeout on socket 0x3ffe8dcc (5000 ms)
I (448931) Socket: [, 0, 58, 0x3ffe8dcc]: Receive timeout
I (448932) Socket: [, 0, 58, 0x3ffe8dcc]: Socket stopping
I (448949) Socket: [, 0, -1, 0x3ffe8dcc]: Disconnected
W (448988) SocketDispatcher: Receive timeout on socket 0x3fff0c4c (5000 ms)
I (448990) Socket: [, 0, 56, 0x3fff0c4c]: Receive timeout
I (448991) Socket: [, 0, 56, 0x3fff0c4c]: Socket stopping
I (449009) Socket: [, 0, -1, 0x3fff0c4c]: Disconnected
W (449035) SocketDispatcher: Receive timeout on socket 0x3ffed184 (5000 ms)
I (449036) Socket: [, 0, 57, 0x3ffed184]: Receive timeout
I (449038) Socket: [, 0, 57, 0x3ffed184]: Socket stopping
I (449054) Socket: [, 0, -1, 0x3ffed184]: Disconnected
W (449079) SocketDispatcher: Receive timeout on socket 0x3ffe6d98 (5000 ms)
I (449080) Socket: [, 0, 55, 0x3ffe6d98]: Receive timeout
I (449082) Socket: [, 0, 55, 0x3ffe6d98]: Socket stopping
I (449098) Socket: [, 0, -1, 0x3ffe6d98]: Disconnected
W (449123) SocketDispatcher: Receive timeout on socket 0x3ffeb388 (5000 ms)
I (449125) Socket: [, 0, 59, 0x3ffeb388]: Receive timeout
I (449126) Socket: [, 0, 59, 0x3ffeb388]: Socket stopping
I (449143) Socket: [, 0, -1, 0x3ffeb388]: Disconnected
W (449179) SocketDispatcher: Receive timeout on socket 0x3fff5c64 (5000 ms)
I (449180) Socket: [, 0, 60, 0x3fff5c64]: Receive timeout
I (449182) Socket: [, 0, 60, 0x3fff5c64]: Socket stopping
I (449198) Socket: [, 0, -1, 0x3fff5c64]: Disconnected
I (450004) ServerSocket: Connection accepted
Guru Meditation Error: Core  1 panic'ed (LoadProhibited). Exception was unhandled.
Core 1 register dump:
PC      : 0x40103384  PS      : 0x00060430  A0      : 0x800dd86c  A1      : 0x3ffd6e60

But as already mentioned this happens in AP mode and in STA mode.

@PerMalmberg
Owner

Are you still using the fixed libc.a from espressif/esp-idf#3624 (comment) ?

@squonk11
Contributor Author

yes, at least I did not change anything on the toolchain since I copied the libc.a into esp32-psram directory. Is there anything new available?

@squonk11
Contributor Author

Another question: I was also testing the upload test in http_server_test. Here I noticed that the filenames of the uploaded files are always named "file_to_upload" and "second_file_to_upload" and not the original filename. In order to achieve this I modified "MIMEParser.cpp" in line 220 to: cb(content_dispositon["filename"], start_of_content, end_of_content). Isn't this the better choice rather than using the field "name"??

@PerMalmberg
Owner

yes, at least I did not change anything on the toolchain since I copied the libc.a into esp32-psram directory. Is there anything new available?

No, not that I know of. Can you reproduce it? I've had no crashes since I started using the new libc.a, though I'm thinking that since Espressif still haven't released it they are still working out some kinks with it.

Isn't this the better choice rather than using the field "name"??

Well, it depends - on the server side you need to know which file belongs to which field so that you do the right thing with the right file. Perhaps both should be passed to the callback?

@squonk11
Contributor Author

Concerning the crash I did many tests yesterday evening. Sometimes the server works for a long time without any problems and sometimes it crashes quite soon after a few page loads. I tested both with AP and STA mode and it seems that it crashes more frequently/faster in AP mode. Additionally in AP mode I have still the problem that sometimes messages I (693042) wifi: ampdu: ignore deleting tx BA0 appear and the response time / page load becomes quite slow. I will do some more testing this evening in order to check if it is reproducible somehow.
Concerning the filename issue: if I do the mentioned modification, both uploaded files finally have the right name on the SD card. According to my understanding the "name" field can be used to identify which field of the form the file belongs to, and the "filename" is the physical filename of the uploaded file. Of course both name fields could be passed to the cb - then the user-defined cb can decide which name field to use for naming the file. Could be an improvement.
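The idea of handing both values to the callback can be sketched as follows. This is a simplified, self-contained illustration and not the code in MIMEParser.cpp: the helper names and the callback signature are hypothetical, the content iterators are left out, and escaped quotes inside values are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Extract a quoted parameter such as name="..." from a Content-Disposition
// value. Returns an empty string when the parameter is missing.
std::string disposition_param(const std::string& header, const std::string& key)
{
    const std::string needle = key + "=\"";
    std::size_t pos = header.find(needle);

    while (pos != std::string::npos) {
        // Accept the match only at a parameter boundary, so that "name"
        // does not match inside "filename".
        const bool at_boundary = pos == 0 || header[pos - 1] == ' ' || header[pos - 1] == ';';
        if (at_boundary) {
            const std::size_t start = pos + needle.size();
            const std::size_t end = header.find('"', start);
            if (end == std::string::npos) {
                return "";
            }
            return header.substr(start, end - start);
        }
        pos = header.find(needle, pos + 1);
    }
    return "";
}

// Hypothetical callback: receives the form field name and the original file name.
using PartCallback = std::function<void(const std::string& field_name, const std::string& file_name)>;

void dispatch_part(const std::string& content_disposition, const PartCallback& cb)
{
    cb(disposition_param(content_disposition, "name"),
       disposition_param(content_disposition, "filename"));
}
```

The server side can then use the field name to decide what to do with the upload and the file name to decide what to call it on the SD card.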

@PerMalmberg
Owner

Could be an improvement.

👍

As to the crash - at this time it is impossible to say where the problem originates. What does the periodic system statistics print to the console?

@PerMalmberg
Owner

@squonk11 I've pushed pretty significant changes to Smooth into master regarding the build system. It now uses Espressif's standard way of building components etc so you'll have to adjust any project you are using Smooth in. I've added instructions in the readme.md file so it should be fairly simple.

If you're just building the test projects all you'll need to do is to select the project to build and regenerate the build files, also as per readme.md.

@squonk11
Contributor Author

I already used your sources with a different build setup because I had difficulties using your build setup under Windows. Currently I am not at home - I will update this evening.

@PerMalmberg
Owner

Ok, hopefully this will make it easier.

Also, just committed #62 for the file name change.

@squonk11
Contributor Author

squonk11 commented Aug 19, 2019

I still have problems with wifi. The following log happened in station mode:

I (30701) HTTPServer: Reply: OK                                                
I (30714) HTTPServer: Request: GET: '/img/application_xp.png'                  
I (30735) HTTPServer: Reply: OK                                                
I (31514) SocketDispatcher: Active sockets: 7                                  
I (31672) HTTPServer: Request: GET: '/img/application_xp_terminal.png'         
I (31698) HTTPServer: Reply: OK                                                
I (31710) HTTPServer: Request: GET: '/img/arrow_down.png'                      
W (31717) wifi: alloc eb len=24 type=3 fail, heap:4101424
W (31718) wifi: m f null
W (31720) wifi: alloc eb len=24 type=3 fail, heap:4099796
W (31721) wifi: m f null
I (31729) HTTPServer: Reply: OK                                                
I (31744) HTTPServer: Request: GET: '/img/arrow_inout.png'                     
I (31763) HTTPServer: Reply: OK                                                
I (31778) HTTPServer: Request: GET: '/img/arrow_branch.png'                    
I (31799) HTTPServer: Reply: OK                                                
I (31812) HTTPServer: Request: GET: '/img/arrow_divide.png'                    
I (31833) HTTPServer: Reply: OK                                                
I (31847) HTTPServer: Request: GET: '/img/arrow_in.png'                        
I (31866) HTTPServer: Reply: OK                                                
W (36728) SocketDispatcher: Receive timeout on socket 0x3ffe5ac0 (5000 ms)     
I (36729) Socket: [, 0, 57, 0x3ffe5ac0]: Receive timeout                       
I (36730) Socket: [, 0, 57, 0x3ffe5ac0]: Socket stopping                       
I (36744) Socket: [, 0, -1, 0x3ffe5ac0]: Disconnected                          
W (36746) SocketDispatcher: Receive timeout on socket 0x3fff09bc (5000 ms)     
I (36747) Socket: [, 0, 55, 0x3fff09bc]: Receive timeout                       
I (36748) Socket: [, 0, 55, 0x3fff09bc]: Socket stopping                       
I (36764) Socket: [, 0, -1, 0x3fff09bc]: Disconnected                          
W (36789) SocketDispatcher: Receive timeout on socket 0x3ffe43ec (5000 ms)     
I (36790) Socket: [, 0, 56, 0x3ffe43ec]: Receive timeout                       
I (36791) Socket: [, 0, 56, 0x3ffe43ec]: Socket stopping                       
I (36806) Socket: [, 0, -1, 0x3ffe43ec]: Disconnected                          
W (36829) SocketDispatcher: Receive timeout on socket 0x3ffe70e8 (5000 ms)     
I (36831) Socket: [, 0, 58, 0x3ffe70e8]: Receive timeout                       
I (36831) Socket: [, 0, 58, 0x3ffe70e8]: Socket stopping                       
I (36846) Socket: [, 0, -1, 0x3ffe70e8]: Disconnected                          
W (36860) SocketDispatcher: Receive timeout on socket 0x3fff45a0 (5000 ms)     
I (36861) Socket: [, 0, 59, 0x3fff45a0]: Receive timeout                       
I (36862) Socket: [, 0, 59, 0x3fff45a0]: Socket stopping                       
I (36877) Socket: [, 0, -1, 0x3fff45a0]: Disconnected                          
W (36901) SocketDispatcher: Receive timeout on socket 0x3ffec668 (5000 ms)     
I (36903) Socket: [, 0, 60, 0x3ffec668]: Receive timeout                       
I (36903) Socket: [, 0, 60, 0x3ffec668]: Socket stopping                       
I (36918) Socket: [, 0, -1, 0x3ffec668]: Disconnected                          
I (46522) SocketDispatcher: Active sockets: 1                                  

After this happened the logging continues, but the browser is not able to open any new connection to the server.
In order to check if this is really related to using PSRAM I tried to disable PSRAM and work with a non-TLS server only (so no memory is needed for TLS). But when I do this I don't get the server running at all - it crashes during the startup phase or when the first connection appears. Do you have a hint what is missing in this case?
Edit: to be more precise on the error - here the log:

I (52638) ServerSocket: Connection accepted: 55
I (52672) HTTPServer: Request: GET: '/'
E (52683) File: Error reading file: std::bad_alloc
I (52685) HTTPServer: Reply: Internal Server Error
W (57702) SocketDispatcher: Receive timeout on socket 0x3fff8f7c (5000 ms)
I (57703) Socket: [, 0, 55, 0x3fff8f7c]: Receive timeout
I (57704) Socket: [, 0, 55, 0x3fff8f7c]: Socket stopping
I (57717) Socket: [, 0, -1, 0x3fff8f7c]: Disconnected
I (60557) SocketDispatcher: Active sockets: 1

Obviously there is an issue with allocating memory during file read. Do I need to increase some space for stack or heap somewhere?

Edit 2: Here is also the log of the memory usage:

I (15541) SocketDispatcher: Active sockets: 1
I (18528) MemStat: [INTERNAL]
I (18531) MemStat: 8-bit F:24992 LB:24340 M:15704 | 32-bit: F:84000 LB:59008 M:74704
I (18531) MemStat: [INTERNAL | DMA]
I (18533) MemStat: 8-bit F:24992 LB:24340 M:15704 | 32-bit: F:24992 LB:24340 M:15704
I (18533) MemStat: [SPIRAM]
I (18534) MemStat: 8-bit F:0 LB:0 M:0 | 32-bit: F:0 LB:0 M:0
I (18534) MemStat: Name Stack   Min free stack
I (18535) MemStat: MainTask     25600   21188
I (18535) MemStat: SocketDispatcher     20480   19640
I (21312) ServerSocket: Connection accepted: 55
I (21356) HTTPServer: Request: GET: '/'
E (21425) File: Error reading file: std::bad_alloc
I (21427) HTTPServer: Reply: Internal Server Error
I (26435) MemStat: [INTERNAL]
I (26437) MemStat: 8-bit F:11280 LB:8252 M:7584 | 32-bit: F:70288 LB:59008 M:66584
I (26437) MemStat: [INTERNAL | DMA]
I (26440) MemStat: 8-bit F:11280 LB:8252 M:7584 | 32-bit: F:11260 LB:8252 M:7584
W (26440) SocketDispatcher: Receive timeout on socket 0x3fffa0f4 (5000 ms)
I (26441) MemStat: [SPIRAM]
I (26442) MemStat: 8-bit F:0 LB:0 M:0 | 32-bit: F:0 LB:0 M:0
I (26442) MemStat: Name Stack   Min free stack
I (26442) Socket: [, 0, 55, 0x3fffa0f4]: Receive timeout
I (26443) MemStat: MainTask     25600   21188
I (26444) Socket: [, 0, 55, 0x3fffa0f4]: Socket stopping
I (26445) MemStat: SocketDispatcher     20480   19640
I (26459) Socket: [, 0, -1, 0x3fffa0f4]: Disconnected

The minimum free heap is 7584 bytes only - but isn't this enough for a malloc for reading a file?

@PerMalmberg
Owner

PerMalmberg commented Aug 19, 2019

Starting from the last question:

E (52683) File: Error reading file: std::bad_alloc

The server is out of memory. You can see that you only have about 11 kB free:

I (26437) MemStat: 8-bit F:11280 LB:8252 M:7584 | 32-bit: F:70288 LB:59008 M:66584
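Why roughly 7.5 kB to 11 kB of free heap can still be too little for a file read can be shown with a small sketch. It assumes that F is the free size, LB the largest free block and M the minimum free size seen so far; that reading of the MemStat abbreviations is an assumption. The struct and function names are hypothetical, and allocator bookkeeping overhead is ignored.

```cpp
#include <cassert>
#include <cstddef>

// One MemStat region, with the assumed meaning of F, LB and M.
struct HeapRegion
{
    std::size_t free_bytes;    // F: total free bytes
    std::size_t largest_block; // LB: largest contiguous free block
    std::size_t min_free;      // M: lowest free size observed so far
};

// A single allocation has to fit into one contiguous block, so the total
// free size is not the limit that matters.
inline bool can_allocate(const HeapRegion& region, std::size_t bytes)
{
    return bytes <= region.largest_block;
}
```

With the values from the log line above (F:11280, LB:8252, M:7584), a 10000-byte buffer cannot be allocated even though more than 11000 bytes are free in total. On the device itself, ESP-IDF exposes the corresponding figures through heap_caps_get_free_size(), heap_caps_get_largest_free_block() and heap_caps_get_minimum_free_size().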

Regarding WiFi:

W (31720) wifi: alloc eb len=24 type=3 fail, heap:4099796

This is a problem in IDF, waiting for a fix: espressif/esp-idf#3592 Apparently they are having issues with their CI/CD so it's not yet synced.

@squonk11
Contributor Author

Since I assume/hope that the crashes in AP mode are also related to the same issue, I suggest I now make a PR with the modifications to wifi.cpp for AP mode.
Basically, I just used the same logic as can be found in the esp-idf examples. To use AP mode, the call wifi.connect_to_ap(); must be replaced by wifi.start_softap();. By default only one client is allowed to connect to the AP. If more clients should be allowed, wifi.set_softap_max_connections(x); can be called before starting the AP.
Have a look at the sources and check if it might suit your requirements in terms of code quality.

@PerMalmberg
Owner

Sure, go ahead and make the PR and I'll have a look.

Just a quick thought: How about letting start_softap take an argument stating the number of allowed clients, with a default of 1?

@squonk11
Contributor Author

squonk11 commented Aug 20, 2019

yes, you are right. I also thought about this possibility. But now I read your comment too late. Maybe you can do this while reviewing?
Edit: but I can do it also as soon as I am back home this evening.

@squonk11
Contributor Author

Now I also included your suggestion to have start_softap() with a parameter for the maximum number of connections. Additionally I made some cosmetic changes.
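The shape of that interface can be sketched with a small mock. WifiSketch is a hypothetical class for illustration only; the parameter type and the accessor names are assumptions, and the real signature in Smooth may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mock of the interface discussed above; not Smooth's Wifi class.
class WifiSketch
{
public:
    // Start the access point. A single client is allowed unless the
    // caller asks for more.
    void start_softap(std::uint8_t max_connections = 1)
    {
        assert(max_connections > 0);
        max_conn = max_connections;
        ap_started = true;
    }

    bool is_ap_started() const { return ap_started; }
    std::uint8_t max_connections() const { return max_conn; }

private:
    std::uint8_t max_conn = 0;
    bool ap_started = false;
};
```

Callers that are happy with one client write start_softap(), and callers that need more write for example start_softap(4), so a separate setter call before starting the AP is no longer needed.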

@PerMalmberg
Owner

Cool. I just tried using the latest IDF master; unfortunately the problem with W (78383) wifi: alloc eb len=24 type=3 fail, heap:4090072 still persists.

I'll have a look at the PR now.

@PerMalmberg
Owner

Closing this issue since #65 is merged.
