RTOS-SDK, ESP32 and the way forward #1319
@jmattsson I will up the priority on the net over LwIP. I really wanted to go that way anyway, but this was the push that I needed. About to hop on a Ferry and travelling for the rest of the day, but will post back at the weekend. 😄 |
Is this improvement also applicable to our current firmware? |
It could probably be moved over to the regular SDK with some care, but I haven't looked at that. |
Okay, I'm at a Lua prompt now!
NodeMCU 1.5.1 build unspecified powered by Lua 5.1.4 on RTOS-SDK 1.4.0(c599790)
lua: cannot open init.lua
> =node.heap()
39240
The [...] Is this the point where I suggest we get this branch into the official repo and let everyone loose on it to try to bang it into shape? After we make sure it's hidden in the cloud builder, of course @marcelstoer . I'm now eagerly looking forward to getting my hands on the ESP32 dev board from @nodemcu! :) |
Hurray Johny, sounds exciting!
Definitely! However, I suggest keeping track of the challenges you encounter in a separate issue on GitHub. Otherwise, this one will soon become confusing and hard to follow. It would help if we could define new labels ourselves @nodemcu so we could create
I used to pull active branches from the GitHub API (really cool API) and then blacklisted some. However, a few months ago I switched to statically define |
On 31/05/2016 00:12, Johny Mattsson wrote:
|
Changing This is currently a show stopper for my dev environment with ESPlorer (and potentially other similar tools). |
Thanks for diagnosing that!
The old-style redirect we were using won't be possible with the pre-empting RTOS, as we'd at "best" end up running Lua code reentrantly from whichever RTOS tasks are printing. I suspect what we'll need is to install a proper handler via |
@marcelstoer Thanks, I hadn't realised you switched from black-list to white-list! I've pushed a @devsaurus I believe I've taken care of the newline madness now. I've yet to address the output redirect. |
Thanks a bunch, Johny! That'll allow me to have some test drive with the new rtos flavour. Output redirection is not that important atm, as long as there's serial comm 😃 As a side note: |
The RTOS SDK is a git submodule so a simple |
Also
|
Both of those look like they're due to stack overflow (again). This seems like it's going to be our biggest pain point in this transition - we have stuff that uses way more than the 2k stack RTOS wants to let us use. If we say each function call uses ~100 bytes of stack on average, the 2k stack limit should still allow us to go roughly 20 functions deep. I suspect something is placing overly large arrays or objects on the stack, in a frequently used code path. Help tracking this down would be much appreciated. Edit: you can increase the nodemcu RTOS task stack size in user_main.c, but that obviously directly impacts free heap, and I don't yet know why the SDK docs state that 2k is the upper limit since we're already running past that. |
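The back-of-envelope estimate above can be written down as a tiny helper. This is only a sketch: the function name and the 100-byte average frame size are illustrative assumptions, not measurements from the firmware.

```c
#include <assert.h>
#include <stddef.h>

/* Rough estimate of how many nested calls fit into a task stack.
 * avg_frame_bytes is an assumed average, not a measured value. */
size_t estimate_call_depth(size_t stack_bytes, size_t avg_frame_bytes)
{
    assert(avg_frame_bytes > 0);
    return stack_bytes / avg_frame_bytes;
}
```

With these assumptions, estimate_call_depth(2048, 100) gives 20, and a single 1 KiB buffer placed on the stack leaves estimate_call_depth(2048 - 1024, 100), i.e. 10 levels, which is why one oversized local in a hot code path is a plausible suspect.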
@devsaurus Output redirection should now work again. |
I increased the nodemcu task stack size to 6k, see if that helps for now? |
I've been pondering this and the issue that we have with Lua is that the execution engine is intrinsically single threaded. Yes, coroutining is supported but this is cooperative and non-preemptive. Yes, on a real OS you can have multiple Lua environments running but these can't interact except through OS mechanisms, and you just don't want to go there on an ESP-class processor. All of this wasn't an issue with the non-OS SDK since this was also non-preemptive (at least in terms of non-ISR code), but as Johny has pointed out callbacks in RTOS are (or can be) invoked asynchronously in a separate C stack space, and there is a legion of bear traps here. I think that we should be thinking about extending our model for ISRs and adopt an asymmetric Lua-land / other-land approach. We can't use a symmetric mutex approach because of the Lua process's heavy stack use; we can't (at least on the ESP8266) allow multiple tasks to demand a large stack. I feel that we should think in terms of a 1+N structure where the Lua task is "special" and all callback tasks do a task post which queues a request to the Lua task when they need to transfer control into Lua-land. Anyway just musings whilst I build my house. 😃 |
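The 1+N idea above (callbacks only post a request, and the single Lua task does the actual work) can be modelled with a small queue. This is a single-threaded sketch for illustration only: the names post_event, drain_events and example_handler are hypothetical and are not the NodeMCU task API, and a real port would replace the plain ring buffer with an RTOS queue so that posting from another task is safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EVENT_QUEUE_LEN 8

typedef void (*event_handler_t)(uint32_t param);

typedef struct {
    event_handler_t handler;  /* runs later, in the Lua task */
    uint32_t param;           /* small payload copied out of the callback */
} event_t;

static event_t event_queue[EVENT_QUEUE_LEN];
static size_t queue_head, queue_tail, queue_count;

/* Called from "other-land" (an SDK callback): record the request and return.
 * Returns false if the queue is full, so the caller can drop or retry. */
bool post_event(event_handler_t handler, uint32_t param)
{
    assert(handler != NULL);
    if (queue_count == EVENT_QUEUE_LEN)
        return false;
    event_queue[queue_tail] = (event_t){ handler, param };
    queue_tail = (queue_tail + 1) % EVENT_QUEUE_LEN;
    queue_count++;
    return true;
}

/* Called only from the Lua task: run every pending handler in FIFO order.
 * Returns the number of events handled. */
size_t drain_events(void)
{
    size_t handled = 0;
    while (queue_count > 0) {
        event_t ev = event_queue[queue_head];
        queue_head = (queue_head + 1) % EVENT_QUEUE_LEN;
        queue_count--;
        ev.handler(ev.param);
        handled++;
    }
    return handled;
}

/* Example handler: stands in for code that would call into the Lua VM. */
uint32_t example_sum;
void example_handler(uint32_t param) { example_sum += param; }
```

The point of the design is that handlers never run on the poster's stack: only the one task with the large stack ever enters Lua-land, so the callback tasks can stay small.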
Of course, While compiling the fw up and down I got the impression that the Makefile processing is a bit odd. The |
@devsaurus Thanks, I've cleaned up the Makefile now. @TerryE I agree 100%. Thinking of callbacks as being executed in interrupt context is the best (if not only) workable approach. I was briefly entertaining the notion of implementing the |
I've started a wiki page to track overall progress. Feel free to expand it. |
Thanks for that - I didn't get the intention with lwip in the first place, but it's definitely calmer now. Though I do want to clean lwip for the time being in order to recompile and check stack usage there as well 😉 In this respect, please find a ((very) clunky) approach to obtain info from |
Ooooh! Building new toolchain now! |
@devsaurus Not looking quite right yet:
whereas the .su says:
The stack pointer must always be 16-byte aligned, btw (ref Xtensa ISA reference, p587), and since the return address almost always needs to be stored I'd expect every function to have at least 16 bytes of stack usage. Edit: rounding up to the nearest 16-byte boundary makes it look pretty good though, so this should be fairly representative:
Further edit: And now I see that we're missing the |
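The rounding described above, taking a reported per-function stack figure up to the next 16-byte boundary with a 16-byte floor, could be sketched as follows. The function names are hypothetical, and the 16-byte minimum reflects the expectation stated in the comment rather than a documented guarantee.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_ALIGN 16u

/* Round n up to the next multiple of align (align must be a power of two). */
size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Adjust a reported frame size: 16-byte aligned, and never below 16 bytes. */
size_t effective_frame_size(size_t reported_bytes)
{
    size_t rounded = align_up(reported_bytes, STACK_ALIGN);
    return rounded < STACK_ALIGN ? STACK_ALIGN : rounded;
}
```

For example, a reported 40 bytes becomes 48, and a reported 0 bytes becomes 16.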
That's very valuable input for a sanity check, thanks. I'll investigate later this day to find out how to report gross stack usage. |
@jmattsson Johny, apart from the performance and code size implications, stay well away from
|
Looks better now:
Although it's still more guesswork than code-fu 😊 Seems that the 16 bytes penalty is still not considered, but the change fixed a severe miscalculation of the net stack use. The leaderboard changed significantly... You will now also get
|
That's a lot of |
@jmattsson Johnny, I've just been going through a review of what I'd need to do to the net library to port it in a nonOS SDK + ESP8266 / ESP32 RTOS SDK way, and to that end I've been comparing the documentation for the two RTOS APIs. In short, the ESP32 has a few additions:
It also has some big omissions:
OK, the additions partially reflect new H/W capability, but what isn't clear to me is whether any specific omission is a permanent removal or simply a temporary gap because the ESP32 SDK is still in beta, in which case Espressif will add it back before a V1 production release. I really don't want to spend a lot of effort effectively reimplementing something that gets added back before I am done. I'll have a trawl around the ESP32 forum to see what I can find here, for example: Network Espconn APIs. |
There's a fair bit more undocumented support on the ESP32, especially in the RTC co-processor area (can't wait to find out how to build code to run that core too!). It's not yet clear which model(s) Espressif will support with the two main cores. I've seen references to both running the entire chip as an SMP RTOS system, but also a "split" version where the WiFi stack runs on the "pro" core and the application has the "app" core to itself. Currently there is only support for running a single-core shared RTOS on the "pro" core. Regarding omissions, yes, there are certainly some, but they were honestly surprisingly few to me. I don't know if you've been keeping an eye on the dev-rtos branch, but I'm at the point where that branch can now build and link for the ESP32 as well (can't/haven't run it yet though; my ESP32 seems to have trouble with its flash chip - we've ordered replacements for next week to try a swap). The things I ended up ifdef'ing out for now were only espconn, RTC, RF modes and some bus drivers (SPI, I2C) due to hardware differences. I didn't have to change any of the hardware timer (FRC) stuff, other than provide suitable compatibility macros. Not sure which timer API you are referring to here? Also worth noting that in RTOS the In terms of upgrade APIs, I'm still not sold on the Espressif way. We already have two competing, working implementations for the non-OS SDK version. I'd rather try to make those two compatible with each other and port that over to the ESP32. For meshing, I'm guessing this will come later, just as it did for the 8266. Considering we haven't used it yet, this should be fine. Besides, the whole ESP-NOW protocol needs to evolve and stabilise a bit further first imo - it's got some serious drawbacks which makes any real life deployment challenging. With the espconn bugbear I did see that mention of a compat layer for ESP32 RTOS, but never for the secure version (there is none available even for ESP8266 RTOS). 
The TLS library also looks very different. Even if they were to provide both espconns, we'd still have the issue that it's not possible to shut down a TCP server safely. Also, the lwIP native API is a much better fit in the RTOS model, since |
My first reaction was NO, keeping ESP8266 support is much more important! To be honest, I am disappointed with the ESP32. (It offers very little of what I need.) On second thought, maybe the current NodeMCU is stable enough, with many features. Why not focus only on the ESP32? But my concern is that the ESP32 does not attract as much interest as the ESP8266 did when it was launched. We can run a simple poll. |
Johny, given that the ESP32 part seems to be shipping for ~ $5 and this will probably fall, I can't see the ESP8266 lasting long. At best it will be shipped as a low cost "sustain" component to support existing production uses. Speaking purely personally, the $1-2 price point isn't important for me. The complete WiFi integrated SoC module is, and the ESP32 seems to address all of the annoyances and constraints of the 8266, so I would personally vote for a switch to the 32 for future development. The RTOS / IDF stack of the 32 vs. the non-OS SDK of the 8266 will make it very difficult to maintain a common code base going forward, I think, so we've got some hard calls to make ahead. PS. the stone skin of my new house will be finished in a few weeks and we are waiting for the plastering team to board out, so the silly 6-7 days a week should slacken off soon. I am suffering Lua withdrawal symptoms, so it's getting time to get back on-board and catch up, I think 😄 |
I doubt the ESP32 will be as popular as the ESP8266. And the ESP32 is still at least 1-2 years away from being mass production ready. This thread has only had about 5 people in it since May; that says something. |
As some of you might've noticed, I've just established the As of writing, that branch has got the console UART functioning (except auto-baud), and I've just finished getting SPIFFS to work today. NodeMCU now uses an explicit partition for the filesystem, rather than magically deducing free space and dropping the fs there (an approach which wasn't a good mix with partitions!). I'm sure @pjsg could polish it further though, and over time we'll need to consider support for embedding a readonly fs within the app, but that's for later. The next things on my list are to grab some more of the node module functionality, and some of the basic WiFi functions. Once the WiFi is up I'll grab the native-lwip Of the code in the esp32 branch, the one feature I can't yet enable is the FATFS option in the build since I haven't got the sdcard/spi support ported yet. Someone else is most welcome to look at that. I've started making developer notes in the extension dev FAQ but it's rather light-weight so far. Somewhere I guess I should document the following:
which is the TL;DR for this branch. There may need to be something about @luismfonseca Other than porting modules over from the latest Oh, and of course, testing the functionality that has been ported so far is always welcome, but I appreciate ESP32s are still a bit rare. @TerryE Yeah I expect the 8266 will hang around for years to come, but for new designs the 32 is certainly a tempting option. It's not that long ago that the 8266 was priced in a similar range, and the 32 is a ridiculously powerful chip for the price-point! Good to hear the house building is progressing well; looking forward to having you back on board and butting heads with me over technical details :) @mikewen Years away from production ready? Nah, chip production is already (finally!) rolling and module production is ramping up. By xmas I imagine dev boards will be freely available. And back when the ESP8266 was launched, few took notice of it. It took quite a while for it to really break into the hacker/maker community largely due to lack of docs and tools. Espressif is really working with the community this time, and I expect the 32 will get an overall quicker uptake, tbh. @devsaurus I've been hitting some stack overflows in the Lua thread even with a significantly larger stack, so if you're feeling adventurous you could look at getting the |
@jmattsson Johny, I've been brooding about the Lua architecture. As you know NodeMCU is built on eLua, which was built and tested on the assumption that the Lua interpreter is non-reentrant, and hence the VM can only execute a single Lua thread and any multi-threading must be cooperative. IMO, we should stick with this on the ESP32 because moving to a thread-enabled VM is going to require a lot of regression testing and fixing some subtle and unknown dependencies on the single thread assumption in the RTS libraries. Given the asymmetric nature of the two processors (though did I notice references to making the latest RTOS versions SMP?) I don't see this as a major impediment. However we need to have some clear guidelines for library writers interacting with Lua-land.
More thought needed :) |
@TerryE It's almost fully SMP now. There are a handful of things which can only be done from one core or the other, but pinning a driver task/thread to that particular core is trivial (if we ever need to use those things - off the top of my head I can't remember what they are). And yes yes yes we're sticking with a single-threaded LVM thank-you-very-much! I'm not debugging the monstrosity that would otherwise appear! :D I already ported the NodeMCU task API when I did the original RTOS work, so that side is covered. Getting everyone to remember to post/queue things from within SDK callbacks rather than calling directly into the LVM will be the challenging part. I've tried to cover this in the dev docs I've been writing so far, but it's certainly a point that bears hammering in. I wonder if I could convince Espressif that their APIs should take a "results-posted to-this-task-please" approach over the current direct callback way... |
It will rather be "Results posted to this queue please". We are, indeed, going to move away from callbacks. So I apologize for breakage caused before we reach 1.0. |
@jmattsson Johnny, rereading this whole thread I realise that it must seem that I am going senile because I keep repeating myself. 😆 The problem is one of bandwidth: I need to allocate ~30min every day just to keep up with what is happening on the list, and I just haven't had that time so have got out of sync. I need to do some deep reading to catch up, and avoid stating the already stated. Sorry. I'll drop you an email separately. |
@igrr That's great news, Ivan! Thanks for letting us know. That approach will certainly make everyone's life easier. Any rough idea on time frames for this to start appearing in the IDF? And which areas might get it first? I'm just trying to get an idea on how I might best plan my work. @TerryE Hahaha, you're excused! I remember how hazy I was back when I was doing the multi-year reno/build for my house, so I'm not going to judge. |
This change needs to land before 1.0 release, which should happen around Oct 1st. If you have some proposals about the way you would like to see this API, i encourage you to open an issue at https://github.com/espressif/esp-idf. I will be refactoring the startup procedure on Monday, my plan is to remove |
Sure, what repo/branch are you building from? |
I'm using this build script. |
The stack-usage patch is already in some fork out there. I'll cherry-pick it to mine and would check porting the callgraph info thing as well. |
Cool, you could even raise a PR against the Espressif fork when you're ready, I'm sure others would love this functionality too! |
|
@devsaurus cool! Dare I ask for the callgraph-info too? :D I'll see if I can get some time next week to update the prebuilt toolchains. |
Thanks, I can incorporate those changes into pre-built toolchains provided by Espressif. Oh and by the way, we do have an image in the docker hub which is made specifically for CI use: espressif/esp32-ci-env. |
@igrr I like the refactoring work you did - it made it quite easy to slip into the event queue handling neatly! General ESP32 progress update:
|
@jmattsson that was the plan 😃 but took some more time. It just landed at @igrr do you want me to place a PR for stack-usage and callgraph-info against the Espressif repo? |
All, this thread has evolved into a primary topic: the way forward for NodeMCU on the ESP32, which I support BTW. But with the new IDF and its API incompatibilities with the legacy RTOS SDK, plus the intractable resource issues of getting a functional NodeMCU implementation on the ESP8266 over RTOS, I suggest that the de facto position is that we are going to be left with two platforms going forward:
Given this dichotomy, what I wonder is: should we accept this and fork NodeMCU or should we at least attempt to unify these two diverse approaches at some level within the NodeMCU Lua:
None of this discussion should detract from or impede what you guys are achieving with the ESP32 port, but I feel that in principle these objectives are more doable than might initially appear. I also feel that achieving this goal will materially ease the migration path for our developers. The two issues are largely independent, but not entirely. In terms of the common code base, we would need two localisation hierarchies, ESP8266 and ESP32, together with separate platform abstractions with a unified abstraction API at some level. However the ESP8266 non-OS model is non-preemptive and event driven, and the ESP32 RTOS model is pre-emptive and procedural. My view is that at the low C level, these are fundamentally at odds. However, surely the approach here is to see if we can define a unified abstraction at the Lua level. I think that Lua coroutining might just be the magic bullet. I don't want to hijack this thread further, but what is the best way to have this debate? As a separate/new thread? A white-paper / RFC for discussion? |
Why not use mongoose as its network library? It supports many protocols including WebDAV, so it is easy to manage files in flash 💪💪 |
Time for another ESP32 progress update! Thanks to $boss at $work I've been getting some decent time to work on this, and the
All that said, this will be the last you see from me for a bit. Next week I'm going on annual leave and will be heading overseas, so no coding for me for a few weeks. I may be able to keep an eye on github discussions, but don't be surprised if I take a while to respond. Once I'm back, I'm hoping to do a first cut of a If others want to start working on the ESP32 branch (did any of you manage to snag an Adafruit board before they sold out again?), do feel free. We're still waiting on Espressif to start providing more drivers (i2c, spi, etc), so many things aren't ripe for adding yet, but hardware agnostic modules should be pretty easy by now. Kconfig ftw. We're also expecting to get an official interrupt-allocation API from Espressif, so if you're wiring up ISRs, I'd suggest hard coding for now (see console.c) rather than rolling our own allocator. |
I'm closing this mega-thread in favour of the individual "ESP32:" issues/todos set up yesterday. Marcel also set that up as a Project here, making it even easier to find. |
Should we delete the |
I don't feel I'm qualified to have an opinion about that but I thought exactly the same the other day when I noticed that branch. |
There's still a lot of |
Ok, I understand that. |
Edit: the below progress update refers to the dev-rtos branch which targeted the RTOS-SDK and the ESP31B. With the final release of the ESP32, Espressif abandoned the RTOS SDK in favour of their new IoT Development Framework (IDF). While the IDF is vastly superior to the previous SDKs, it does set our porting effort back a fair bit. Progress updates on the IDF/ESP32 port of NodeMCU can be found further down in this discussion.

With the ESP32 release coming up in a few months time, it's time to seriously start thinking about the way forward. I think it's a given that we'd all like to see NodeMCU run on the ESP32 as well. With the ESP32 there is only the RTOS SDK however, which means we really need to consider how to get ourselves switched from the non-OS SDK over to RTOS.

Since $work is rather interested in shifting some of our products over to the ESP32 I've had a bit of time to investigate the effort that will be required in terms of NodeMCU. I've been "spiking" over on the DiUS dev-rtos branch to see what I can get going. Here's the overview so far:

- [...] sdk-overrides/ directory which would need cleaning up, but overall this step wasn't too bad - the SDK functions are largely the same.
- [...] c_ prefixed functions (and a bunch of os_ prefixed ones) have been consolidated back to standard C library names.
- [...] flashchip variable from a pointer to a struct, so our use of it bombed completely...
- [...] printf() now doesn't work, but ets_printf() does.
- [...] printf now working, without bounce buffers.
- [...] tmr module, the whole thing appears to be easier to just change in each place where needed than attempt to wrap everything. Besides, having high-priority timer callbacks might be useful for some drivers.
- [...] rtT high priority timer task and the tiT TCP/IP task, not to mention the uiT task which is what user_init() runs in. It will be up to everyone who is taking an SDK callback to either deal with it fully within that callback without referencing data used by other tasks, or copy the necessary information from said callback and relay it back to the main nodemcu RTOS task. Appropriate locking must be done though. I've updated the sntp module as an exercise, and while it grew a little bit it was pretty straight forward. [cue everyone pointing out things now wrong with it...]
- [...] system_set_os_print() function in the RTOS-SDK and wants you to install a putc handler to suppress everything instead (which is useless, since then we'd have to put a mutex around each printf call if want system_set_os_printf() like functionality). I fixed this by placing all the SDK functions first in irom, and then in the wrapped printf() checking the return address - if it's in the SDK part of irom and we've flagged off SDK prints, then we suppress it. Rather sneaky, but works well and with almost no cost.
- [...] print() function takes peculiar arguments, but we now match those to the letter I believe.
- [...] net module (and others) on top of lwIP API, since espconn is only partially supported on the ESP8266 RTOS, and not at all on the ESP32 RTOS. Probably look at including mbedTLS for TLS support.

If we can get our current NodeMCU to run stable on the ESP8266 with the RTOS SDK, it should be quite easy to get ESP32 support in I believe. If/when I get my hands on ESP32 hardware, I'll have an even better idea.

Oh, and the dev-rtos branch is subject to force-pushing and other unpleasant things, and it is most assuredly not ready for public consumption, but if you want to track my progress you'll see it there.