-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LFS evaluation version -- second release #2301
Conversation
Linker warning for
|
app/Makefile
Outdated
@@ -20,93 +20,64 @@ FLAVOR = debug | |||
ifndef PDIR # { | |||
GEN_IMAGES= eagle.app.v6.out | |||
GEN_BINS= eagle.app.v6.bin | |||
SPECIAL_MKTARGETS=$(APP_MKTARGETS) | |||
OPT_MKTARGETS := coap crypto dht http mqtt pcm sjson sqlite3 tsl2561 u8glib ucglib websocket |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect that the u8g stuff doesn't work right -- see the current travis build failure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can confirm, linker fails with lots of undefined references to u8g functions when LUA_USE_MODULES_U8G
is defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devsaurus Arnim, re the warning: this has been there since eLua days, so this is a case of no change. The NodeMCU make only works under the Xtensa toolchain on Linux, so this is a global issue with our build. Maybe we should do an OS detect and abort on anything else, But this isn't an issue introduced by this patch.
@pjsg, Agreed and this needs fixing, so I'll have a look at the Travis build and work out what is going wrong, but the assumption in the normal make is that you enable a given module by setting the corresponding define in user_modulea.h
. This Travis make seems to be overriding this by a backdoor and forcing all modules into a link, despite the settings of these defines. This is a QA artiffact as such a biuld will never be capable of being downloaded onto a real ESP module. It isn't a real-life issue but one of the way the Travis build is configured. I just need to mirror this short circuit in the make and detect a Travis build and similarly force the build of all of the associated subdirs..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can confirm, linker fails with lots of undefined references to u8g functions when
LUA_USE_MODULES_U8G
is defined.
I'll take a look at this and fix. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My tweak to the magic works fine but
spiffsimg/spiffsimg -f ../bin/0x%x-4mb.img -S 4mb -U 0x7c2d0 -r ./spiffsimg/spiffs.lst -d
is dying with "Not enough space: fatal error" Why? swapping the -S 4mb for -c 0x8000 works fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I've bottomed this. The underlying issue is that most of these adjunct directories have the same root name as the module, e.g. coap
uses coap
, etc. The two u?g modules don't follow this pattern as u8g
uses u8glib
and ditto ucg
. I just had to add a couple of extra magic lines to handle this second case.
app/include/user_modules.h
Outdated
@@ -65,13 +57,13 @@ | |||
//#define LUA_USE_MODULES_SJSON | |||
//#define LUA_USE_MODULES_SNTP | |||
//#define LUA_USE_MODULES_SOMFY | |||
#define LUA_USE_MODULES_SPI | |||
//#define LUA_USE_MODULES_SPI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to change the defaults here? (SPI and TLS)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. I should back this out. but the set of settings in the default build needs to be revisited in a separate issue. E.g. I can't hink of any of the devices that we currently support in our modules
library that use SPI, so why do we include it by default? We include MQTT but not json. I'll make this change to minimize the out of scope side-effects, but this doesn't reduce the issue that that default set needs revisiting.
app/lua/Makefile
Outdated
@@ -24,7 +25,8 @@ STD_CFLAGS=-std=gnu11 -Wimplicit | |||
# makefile at its root level - these are then overridden | |||
# for a subtree within the makefile rooted therein | |||
# | |||
#DEFINES += | |||
#DEFINES += -DDEVELOPMENT_TOOLS -DDEVELOPMENT_USE_GDB -DNODE_DEBUG -DBREAK_ON_STARTUP_PIN=1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that these should be the defaults....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. These shouldn't be defaults, and that is why the char 1 on this line is a "#" char,, so the make ignores this line. However, the most common alternative to the default is going to be uncommenting this line by removing the # char, which is why I have included in in this form.
app/lua/lflash.c
Outdated
#define FLASH_PAGE_SIZE INTERNAL_FLASH_SECTOR_SIZE | ||
#define FLASH_PAGES (FLASH_SIZE/FLASH_PAGE_SIZE) | ||
|
||
#define BREAK_ON_STARTUP_PIN 1 // GPIO 5 or setting to 0 will disable pin startup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be settable from the user_config
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, Good suggestion, We need to add this to the sectino on debugging and leave it prefixed by a //
to comment it out by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whatever, this define doesn't belong in lflash.c
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This entry needs removing.
The macro also remained to DEVELOPMENT_BREAK_ON_STARTUP_PIN
to have a unfied prefix with related macro constants.
if (vfs_lseek(fd, -1, VFS_SEEK_END) != fh.flash_size-1 || | ||
vfs_lseek(fd, 0, VFS_SEEK_SET) != 0) | ||
return 0; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a check to make sure that integer images only get loaded into integer runtimes.... (and vice versa)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hummm. Excellent suggestion as (as we know 😜) the VM will barf if an Integer LFS is loaded into a Float RTL or v.v. We could set one bit in the Flash ID in word 0 to to 0/1 depending on whether the underlying build is integer. I'll do this in the tweaks commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that you could have two different magic header numbers -- one for float and one for integer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having one bit flipped gives two different magic numbers, so I think we are basically saying the same thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also note that this reload function simply returns nil
if there is an issue (as opposed to not returning if all is OK). Would is be better to return some error code depending on the error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably would be good to return an error code....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that the call returns at all indicates a fatal error, but maybe we should just return a error message if we do return
@@ -79,6 +79,7 @@ | |||
#define VALUEWEAKBIT 4 | |||
#define FIXEDBIT 5 | |||
#define SFIXEDBIT 6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is SFIXEDBIT? Does it matter that LFS uses the same bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you look at the other bits, then you will see that bit 3 is also similarly denormalised. We've only got eight bits in the marked byte, and they've already all been allocated. Just as with FIXEDSTACK
, FINALIZED and
KEYWEAKmarkings, this isn't an issue in practice -- in that only
Protoand
TString` types can be LFS GCobjects and neither are super fixed.
That being said, it took a lot of head scratching and extra lua_assert
s to make sue that this is true. 😄
app/lua/lua.c
Outdated
} else { | ||
load->done = 1; | ||
need_dojob = true; | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't figure out what is going on here....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original logic is -- how should i put it -- tangled. To be honest I can't remember why I removed the break and instead fell through the the following continue.
What the original code did was to break on receiving an EOL whereas this continue to receive another char. I'll try both variants out and see which works the best.
Whatever, we've got whole blocks of code #if 0
ed out since being hacked out for the first project commit. IMO, if they don't do anything and will never be reintroduced then they should be removed so we can see the wood for the trees.
docs/en/modules/node.md
Outdated
`imageName` The of name of a image file in the filesystem to be loaded into the LFS. | ||
|
||
#### Returns | ||
_Not applicable_. The ESP will load the LFS image and immediately reboot. Control is not returned to the calling application. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a bunch of error conditions that cause this method to return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quite possibly. The documentation all needs tightening up as well as adding lua_examples
directory for sampe LFS init modules. However, my suuggestion is that we should first reach a consensus on (a) whether this patch has enough merit to be include in dev, and (b) at what point do we swap to it LFS enabled being the default.
These documentation issues (whilst important) are still secondary to these decisions.
|
Looks like |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a partial review only, I'll try to pick up again tomorrow. You weren't kidding when you said it was big! :)
app/platform/common.c
Outdated
static uint32_t allocated = 0; | ||
static uint32_t phys_flash_used_end = 0; //Phyiscal address of last byte in last flash used sector | ||
|
||
uint32_t platform_flash_reserve_section( uint32_t regsize, uint32_t *start ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't take this the wrong way Terry, but I don't think this is the right approach. We've kept bolting on various flash-allocation bits for a long time now. I think we should bite the bullet and get an actual partition table in place, like what the ESP32 has. That way we can get a definitive view of what's where, and no risk of race conditions. Here it would appear that if there's any conditional allocation the entire allocation will shift and we'll get Undefined Behaviour at best...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Johny, One step too far on this patch. Sorry.
app/lua/lflash.c
Outdated
* - the lu_int16 offset of the next address pointer. | ||
*/ | ||
|
||
static int rebuild_core (int fd, uint32_t size, lu_int32 *buf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would love to see this functionality more easily exposed (i.e. not static
for starters). The way we'll want to be using the LFS feature at $work is to pre-build everything on the outside and flash an image which doesn't even know how to do node.flash.reload()
.
I had initially anticipated just adding a flag to luac.cross
to produce a non-PIC image, but having looked at the format it'll probably be easier to write a separate utility which wraps rebuild_core
to post-process the flash.img
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment below. simpler to add -a
option to luac.cross
to produce an absolute version.
app/lua/lflash.c
Outdated
/* | ||
* Return a C closure pointing to the Flash Index function | ||
*/ | ||
LUAI_FUNC int luaN_index (lua_State *L) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now one needs to do node.flash.index()('somemodule')
. I don't see a good reason why this needs so much indirection? node.flash.index('somemodule')
would seem much more usable.
Having node.flash.index()
actually return an array or table rather than just do a printout (well, what node.flash.index()()
does), would open up for more programmatic options.
Am I missing something here which would prevent this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See answer to repeat of this below: tl;dr version: that's what Lua is for 😄
@@ -16,10 +17,22 @@ OBJDUMP = $(or $(shell which objdump),xtensa-lx106-elf-objdump) | |||
SPIFFSFILES ?= $(patsubst $(FSSOURCE)%,%,$(shell find $(FSSOURCE) -name '*' '!' -name .gitignore )) | |||
|
|||
################################################################# | |||
# Get the filesize of /bin/0x10000.bin | |||
# Get the filesize of /bin/0x10000.bin and SPIFFS sizing | |||
# | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a default build with LFS enabled (set to 64k), the spiffs image gets built as 0x70000-*mb.img
. Yet, the actual file system location is at 0xb0000
.
I think this just reinforces that a partition table would be a very good thing...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern with adding a partition table is that on the ESP8266, at least, it would be just that: a bolt-on addition, and not properly integrated into the flash loader. So how do you allocate partitions on the ESP8266. Do we try to adopt the same conventions as on the ESP32, and if so how do we integrate them?
IMO, it would be better have one of two apporaches:
- You can define the base and size of both SPIFFS and LFS as config defines. If you do then
- You can fix the absolute address of the LFS using the
-a
option, and ditto with spiffsimg user_main.c
should do some bounds checks to make sure that everything fits properly
- You can fix the absolute address of the LFS using the
- You don't in which case both fall back to using the platform flash allocator, and in this case there is nothing to stop the make process generate an internal table using the defines at (1) and the allocator uses these if set.
Incidentally you can set the LFS size to any multiple of 4Kb up to 256Kb.
The botch with SPIFFS is that the format is not PI, (though it could trivially be so). If it were then the SPIFFS images would also be PI. This would make life so much simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be happy enough to go with your approach, especially in the interest of getting LFS going. As long as the size/location gets set at compile time so we don't have any allocation races/ordering issues at runtime I think we're good.
As for spiffs, last I checked it's position independent - the phys/logic address translation takes place in the read/write accessor routines passed in to spiffs on initialisation. Not sure what makes you think it isn't position independent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what makes you think it isn't position independent?
spiffimg
requires you to specify an absolute location for the image. I'll meed to have a look at the source to see why it does this, but if SPIFFS is PI then spiffimg
should be as well. This would make life easier for everyone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the guy who wrote the initial spiffsimg
before @pjsg co-opted into NodeMCU: nah, that's not what that argument really says. That's just one way to do the max fs size calculation :)
I readily confess that it should've all been documented better though...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, the penny drops re the difference between the -c
and -S
, so yes, it should have been better documented 😁 In practice we have two use cases for a SPIFFS:
- The Lua app makes little use of SPIFFS, so it just needs to be "big enough" (because of the flashing and startup overhead). I typically use a 64Kb and work up from that in 64Kb multiples, simply because during development testing, it is often quicker to reflash the SPIFFS than use other ways to update it.
- The Lua developer isn't sure how much SPIFFS is going to used, so when in doubt then make it as big as possible.
I suspect that most developers start with (2) but if you reflash your firmware a lot then you end up doing (1).
app/platform/common.c
Outdated
static uint32_t allocated = 0; | ||
static uint32_t phys_flash_used_end = 0; //Phyiscal address of last byte in last flash used sector | ||
|
||
uint32_t platform_flash_reserve_section( uint32_t regsize, uint32_t *start ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't take this the wrong way Terry, but I really don't think this is the way to go. We've kept on bolting on various flash allocation bits over the years. I think it's time we bite the bullet and switch to a partition table like on the ESP32. That way we can easily tell where the spiffs image(s?) are, the LFS area, SDK reserved areas, etc.
With the approach here we're effectively introducing race/ordering problems where things may move around unexpectedly in flash. Having a partition table would keep things a lot cleaner and safer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment above. This is really an issue it it's own right and maybe worth tracking as such.
app/lua/lflash.c
Outdated
* - the lu_int16 offset of the next address pointer. | ||
*/ | ||
|
||
static int rebuild_core (int fd, uint32_t size, lu_int32 *buf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd love for this to be more easily exposed (so not static
for starters). The way we'd like to use the LFS feature at $work is to pre-build the entire image and flash (or OTA-upgrade) that, with the actual Lua code not even knowing how to node.flash.reload()
.
I'd originally envisaged just adding a flag to luac.cross
to spit out non-PIC addresses, but having seen the format now, I'm thinking it would be easier to write a separate utility that post-processes the flash.img
from PIC to relocated-ready-to-be-dumped-into-flash format. That will need rebuild_core
though (possibly with flashBlock
passed in as a function pointer for flexibility).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that I mentioned somewhere that luac.cross
is so fast that it is easier just to add a -a
switch to fix the LFS at a fixed address, and run the luac.cross
multiple times if necessary.
This still leaves the need to do address calculation in your make. The alternative is to embed this functionality in a smarter esptool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as provisioning goes, I would class DiUS and me as advanced users, each with our own sweet spot.
In my case I have my own provisioning system. I very rarely want to change the base firmware (for example my home automation system is still running a build of dev from last Aug). But I do use my own provisioning system a lot. This probes a flag byte in RTCmem and if set then the ESP end does a reprovision on reboot. I've integrated this into my ESP apps as a reprovision command, so each ESP config has its own master directory hierarchy on the host provisioning server. If I want to make a change then I just change the master file on the server and issue a preprovision command to the ESP app, and bang -- magic happens -- and a few seconds later the ESP is rebooted with the new app running. I am still testing the LFS changes but this is all pretty transparent to the app -- other than LFS aallows you to code your apps in a simpler cleaner way because you don't have to worry about the code footprint in RAM.
However, all this does rely on you having a network path from each ESP to some provisioning server. So this isn't a good solution for a lot of embedded IoT uses.
The way that I currently do the bootstrap is to have a init.lua
stub which either runs the node.flash.reload()
if LFS isn't loaded or it chains into the init module in the LFS. From there on in, all encapsulation is done by Lua features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you think a -a
switch is the better approach, sure, I don't mind.
app/lua/lflash.c
Outdated
/* | ||
* Return a C closure pointing to the Flash Index function | ||
*/ | ||
LUAI_FUNC int luaN_index (lua_State *L) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently one seems to need to do node.flash.index()('somemodule')
. Is there any real reason for that much indirection? It would seem much more "normal" to do node.flash.index('somemodule')
.
Also, I'd much prefer to have the index itself returned as a table ( [Edit: I'm an idiot, I was printing the return value]local t = node.flash.index()
) so it can be read programatically, compared to just printed out (what node.flash.index()()
currently does).
Am I missing something here that would prevent this simplification?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous fact that Lua ran from RAM and C ran from flash lead to the practice of "when in doubt, code it in C", even when a direct Lua approach is simpler and more transparent therefore easier to modify for the typical Lua developer. In this case it would be easier for the Lua application programmer to say lfs.someroutine(args)
to invoke someroutine
in the LFS. Yes, we could hard code this in C but this is only a couple of lines of Lua which would be in your unit routine in the LFS, so the only cludgy reference would be the first node.flash.index()'init'()
call and if we allowed the user to define LUA_INIT_STRING
in the config then with
#define LUA_INIT_STRING "pcall(function()node.flash.index()'init'()end)"
then they wouldn't even need to do that. It could even handle the special case of lfs.index()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the switch to LFS, I would also like to see a switch to the core dogma:
- Only use C if you can't do it in Lua or there are string performance reasons why it must be done in C.
Just think: change a line of C and you have to reflash the entire firmware. Change one line of Lua and you (or your provisioning subsystem) only needs to replace one SPIFFS file. A mega hassle vs. a few seconds.
For example we've got a C ds18b20 module which is slow and far too limited functionality. My Lua version is both richer and a lot faster. So why use C? The old answer was because C ran from flash so took up less RAM footprint. This argument no longer applies.
As to this LFS init routine, if you do a -p
option on the luac.cross
then you will see that this doesn't use a table which some C routine parses, the code is a directly emitted Lua function.
And as I have said elsewhere is really that hard to include something like the following in the LFS:init module?
do
local ndx = node.flash.index
local lfs_t = { __index = function(_, name)
local fn = ndx(name)
if type(fn) == 'function' then return fn end -- or return nil implied
end}
getfenv().lfs = setmetatable(lfs_t,lfs_t)
end
The RAM footprint is a table with a single entry in it and a LClosure with a single upval. The LFS footprint is a code fragment of 16 instructions and a function of 10 instructions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You earlier argued that both mine and your uses were "advanced" (probably rightly so), so I think I'll level that same comment back at you here :) Meta tables aren't exactly the most obvious things when one starts out with Lua.
I personally was looking at node.flash.index
as a drop-in replacement for dofile
(or even require
). It's something existing developers are familiar with both concept wise and syntax wise, and it matches how the files/modules get interned into the LFS. Where before you would've mymod = dofile('mymod.lc')
you could (assuming suggested change) mymod = node.flash.index('mymod')
. To me that would be a nice path into the LFS.
To address the dogma shift, yes, I agree in principle, but we'd need to provide an easy way for users to build the flash.img
. Right now Marcel's cloudbuilder is saving everyone from needing their own build environment, which is fabulous from a usability perspective. It does make it hard to start using LFS with this though. Say @marcelstoer, do you think it would be feasible to also have an LFS builder service? A user uploads their .lua
files and we spit out a either a PIC flash.img
to be node.flash.reload()
ed or an lfs.bin
to be esptool
'd straight onto the device? If I had time and somewhere to host it I could probably put something together (insert bemoaning of lack of time here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is my init.lua
if file.rename('flash.img', 'loaded.img') then
node.flash.reload('loaded.img')
end
node.flash.index()('init')()
Then my init.lua
in the ROM starts with
local index = node.flash.index and node.flash.index()
local function loader_flash(module)
local r = index(module)
return type(r) == 'function' and r -- or nil otherwise
end
if index then package.loaders[3] = loader_flash end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pjsq Maybe you can explain what I'm missing here? Why do we need to do
local index = node.flash.index and node.flash.index()
rather than
local index = node.flash.index
What feature is gained by requiring that function call? Why can't that be hidden inside the node.flash.index
function itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmattsson this and
style is because
- if the LFS isn't compiled in then
node.flash
returnsnil
and so evaluatingnode.flash.index
will throw an exception, causing a panic. - if the LFS is compiled in, but the LFS isn't loaded, then
node.flash.index
returnsnil
and so evaluatingnode.flash.index()
will throw an exception, causing a panic. In the case of running from LFS, thenode.flash.index and
is redundant because this code can assume that LFS is both compiled and loaded, and this makes life a lot simpler.
My preferred init.lua
approach is:
pcall(function() local f=node.flash; return f.index and f.index()('init')() or f.reload('.img') end)
which is guaranteed not to panic. and I do the update reload using my previsioning system.
I chose the Lua route because (a) it is simple; and (b) it leaves it to the individual developer / project to implement their own Lua encapsulation that's right for them. This got the patch out there to evaluate. I don't think that using meta tables is that much of an issue as we can include a default in Lua examples and refer to it in the API reference. Novice Lua programmers can use it as-is and more advanced ones can use it as a starting point to modify.
I will give an analogy: Enduser_setup is written in C, by I regularly see Lua developers post issues "I don't know how to write C modules, but how do I modify it to do XYZ?" If it was in Lua, then they could just tweak the code.
This all being said, changing the API is pretty straight forward, so long as we first agree what we want to change it to. Let's pick up this general point on #2292
Something's not right here. Default build of this PR, with LFS enabled (still 64k).
Build, flash,
Why isn't the first call to my |
@@ -538,7 +555,7 @@ static void atomic (lua_State *L) { | |||
size_t udsize; /* total size of userdata to be finalized */ | |||
/* remark occasional upvalues of (maybe) dead threads */ | |||
remarkupvals(g); | |||
/* traverse objects cautch by write barrier and by 'remarkupvals' */ | |||
/* traverse objects caucht by write barrier and by 'remarkupvals' */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you mean "caught" 🤣
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit naff replacing one spelling error by another !!
And I'll take a look at your previous test case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmattsson, re your previous example, this is a mindfart on your part. 😀 It does exactly what I would expect it to do, and would have the same result as wtf=loadfile('wtf.lua')
is this was in the SPIFFS. Do a luac.cross -l -o /dev/null local/lus/wtf.lua
to see what this generates:
- The first execution executes the following with replaces the global
wtf
with the closure:
1 [4] CLOSURE 0 0 ; 0x258b2d0
2 [1] SETGLOBAL 0 -1 ; wtf
3 [4] RETURN 0 1
wtf
now points to the closed function so the second call executes the following to give you the response you expected.
1 [2] GETGLOBAL 1 -1 ; print
2 [2] MOVE 2 0
3 [2] CALL 1 2 1
4 [3] LOADK 1 -2 ; 1
5 [3] RETURN 1 2
Try replacing wtf.lua
by the following and it will do what you expect:
-- function(...)
local arg = ....
print(arg)
return 1
-- end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Umm.. what? I'm not reassigning wtf between the invocations. I'm assigning the global wtf
exactly once, then calling it twice. I sure as sundown don't expect my wtf
variable/function to be reassigned just because I invoked it (given I don't reassign it in the function). Kindly explain my supposed mindfart further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, you are getting caucht (joke) is the subtleties of Lua. If you've got 15 mins, then just Skupe me or ping me an email and I'll Skype you.
As I said, if you put the same file in SPIFFS and did a wtf=loadfile('wtf.lua')
then you'd see the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As Terry promised me: ./facepalm
Okay, the penny dropped:function wtf() end
is of course the same as wtf = function() end
, and the initial wtf
was the file/module closure itself.
/* we could define it this way */ | ||
#define setttype(obj, _tt) ( ttype_sig(obj) = add_sig(_tt) ) | ||
#endif // #ifndef LUA_PACK_VALUE | ||
#define setttype(obj, stt) ((void) (obj)->value, (obj)->tt = (stt)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm? Are you trying to silence some compiler warning here or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two tt
fields, one in the TValue and (in the case of GCObjects) one in the GCO header, and the code should not confuse the two. (Lua 5.3 does this by renaming the latter to tt_
.) The ttype(o)
and setttype(o,t)
macros should only be used on the TValue type.
The (void) (obj)->value
guard is compiled but discarded during code optimisation, but it still throws a compile error if these are used on the GCOs, and this allowed me to fix a few cases that had been missed in the eLua changes.
The reason that this is important for LFS is that the TValue form is a 32 bit integer and so testing this doesn't throw a l8ui
exception. What I intend to do is to replace the relatively small number of GCO tt
tests by a macro form which generates the extra extui
instruction and avoids the unaligned exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aaaah, that makes sense. Cool hack. Maybe worth leaving a comment along those lines there for future reference? E.g. // guard against confusion between GCO's and TValue's "tt" field by referencing the "value" member too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact I just copied the same hack from somewhere else. I should really back it out or make it a //#define
variant because I really only need this when i need to do type checking in macros and I suspect that it does leave the code in in -O0
.
app/lua/lua.c
Outdated
@@ -468,11 +324,8 @@ int lua_main (int argc, char **argv) { | |||
|
|||
void lua_handle_input (bool force) | |||
{ | |||
while (gLoad.L && (force || readline (&gLoad))) | |||
{ | |||
if (gLoad.L && (force || readline (&gLoad))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoa, this change makes me nervous. I explicitly changed it to a while
to fix the issue of lost commands on the serial. If more than one line buffers up the LVM would fall behind in its input processing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I can't remember why I did this. I will take a look at the history. My first pass through merged in the changes from a vanilla Lua-5.1 variant and I may just have missed this in the merge.
But the whole interactive interface is very flaky anyway. It is really designed for a more POSIX-like environment. The way it determines compilation units is to stack the lies and try:
- line 1
- line 1 + line 2
- line 1 + line 2 + line 3
- ...
Until the sequence compiles. It then executes the brick and treats the next line as line 1. In the meantime there is no flow control (hardware or XON/XOFF) on the input uart so it is very easy to overflow the 128 byte input FIFO on the UART, with or without this change.
Also the whole of the interactive interface including the input / output redirector is extremely flaky.
However all of this is separate to this change so I will back this one out here.
app/platform/platform.c
Outdated
@@ -879,7 +879,7 @@ uint32_t platform_s_flash_write( const void *from, uint32_t toaddr, uint32_t siz | |||
if(SPI_FLASH_RESULT_OK == r) | |||
return size; | |||
else{ | |||
NODE_ERR( "ERROR in flash_write: r=%d at %08X\n", ( int )r, ( unsigned )toaddr); | |||
NODE_ERR( "ERROR in flash_write: r=%d at %p\n", ( int )r, ( unsigned )toaddr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We actually have %p support in the printf? I thought we didn't (but I could be getting myself confused with one of the other platforms I work on).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fairly recent, but still already there: #2062
app/platform/platform.c
Outdated
uint32_t platform_flash_mapped2phys (uint32_t mapped_addr) | ||
{ | ||
uint32_t meg = flash_map_meg_offset(); | ||
return (meg&1) ? -1 : mapped_addr - INTERNAL_FLASH_MAPPED_ADDRESS + meg ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(meg <= 0)
a bit clearer as to what the checks here (and a few lines further down) do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The platform API already had the mapping one way. I just added the inverse map using the same checks.
} | ||
asm ("break 1, 1"); | ||
asm ("break 1, 1"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surplus whitespace, or intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surplus
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd lose the while(1) {} and replace it by a while around the break. But a job for another day.
The break will instantly reboot the ESP if the debugger exception handler is not installed If is then you will enter the debugger.
/* | ||
* Flash memory is a fixed memory addressable block that is serially allocated by the | ||
* luac build process and the out image can be downloaded into SPIFSS and loaded into | ||
* flash with a node.flash.load() command. See luac_cross/lflashimg.c for the build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick, the command is currently node.flash.reload()
, and subject to iteration :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, but as we discussed these two might be better stripped back. Whatever ... will pick this up on the sweep.
Here's one way of solving the build-fails-if-no-local/lua-files-exist issue:
|
Espressif claims they squeezed 17KB extra memory out of their RTOS SDK 2.0...niice. Regardless of what we do here it'd be nice if they gave the NON-OS SDK just as much love. |
Yup, but when you go through the details in their documentation, then you see that the way that they do this is to drop the instruction cache from 32K to 16K and use the extra 16K freed by doing this as heap. |
So I have been working through the correction list for updates to this patch. The update list is far larger than the majority of other PRs !! So what I am proposing to do is to rebasline the patch so that incorporates two commits:
This might force me to close and reopen this PR, but let's see. Note that this will also implement the eLua bugifix and the rest of #2333 so that as well as being able to move out all of the program and string data into LFS, the remaining RAM-base TValue-related resources will be 75% of the footprint (if you define @jmattsson, I am surprised that you haven't picked up and commented on this little gem! 😄 |
@TerryE Why? I iz veri bizi... :( As for rebaselining the patch, I'd recommend not doing it. Github does a good job of remembering which version comments were applied to, so I think you might end up complicating things while trying to simplify them. Just push more commits here as-is; we can always tidy up if needed. (And geez, stop taking all the fruit at once - surely we'll want to be able to find these juicy bits in the future too and feel all good about further freeing up memory! ;-P ) |
Did you look into that already? |
Is the Pope catholic? 😉 |
@@ -1,37 +1,206 @@ | |||
#ifndef __USER_CONFIG_H__ | |||
#define __USER_CONFIG_H__ | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With all the relevant documentation now inside this file could you maybe add a note with a link to it to the first paragraph at https://nodemcu.readthedocs.io/en/latest/en/build/#build-options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do 😄
I've just found a bug when porting my latest telnet version to LFS. (It works fine on the latest master.) This LFS version fails on node.task.post(function()node.input "='hello'"; end) Sorry guys. |
I've just found a bug when porting my latest telnet version to LFS. (It works fine on the latest master.) This LFS version fails on node.task.post(function()node.input "='hello'"; end) Sorry guys. PS. Now fixed. |
As some of you might have noticed, I may seem to have been side-tracked working on getting LFS-optimised versions of my provisioning system, together with the telnet and ftp servers. But there was some rationale to this: I wanted to hammer LFS myself in a real development context before pushing the "final" PR to Another observation: @jmattsson was absolutely spot-on in requesting an "absolute" option for LFS images. I use a script file to rebuild my LFS and reimage it directly to a test ESP module. The whole cycle takes less than 10 seconds, so I find it easier just to develop directly in LFS and not even bother with moving in-test modules onto SPIFFS, because it is simpler and faster to work directly with all modules in LFS. I do use the relocatable version but only for production luaOTA applications. |
Is there a rough ETA for LFS to land in DEV? |
I've got about a day's work and it's next on my TODO list. 😚 |
Sorry for the delay guys. I am trying to track down a weird bug to do with the new method of SPIFFS and LFS inter-operating though linker-based static allocation.
Lot's of head scratching. Just waiting for that "oh shit, that's it" moment. |
@TerryE forget to move the Try running a delta against the branch I linked above some time ago. I had both LFS & SPIFFS going happily with static allocation. |
I will have my own palm slap when I find it. Nothing so obvious. It's a sod. If I write then read verify the flash, then it's OK. But if I run the firmware and read it gets stomped. I've added a temp #ifdef code so that I can set the LFS to 0 len and that works but 64K and it runs the same exec path (basically a no-op if the flash signature isn't correct) and the stomp occurs. And very early. I'll find it today. |
OK, definitely a head scratch here.
PS. This appears to be @devsaurus Arnim's patch #1968 to the |
...if you don't, then the SDK uses whatever is in there to look for its config data, and if there's a bad checksum writes new config blocks there (last three sectors, as determined by the flash header). We used to catch an incorrect flash size byte before we got into the SDK to avoid just such a problem, so maybe that's gotten broken somehow? You haven't accidentally stopped the early init stuff from being called, have you? (e.g. by dropping the custom entry point setting) |
@jmattsson Johny, I've just noticed that we cross posted, having come to the same conclusion. See my PS to my previous comment. But thanks for the input :) |
Right, so we need to move the fixing-up of the size-byte into the trampoline too to ensure we have everything in order before the SDK can run? Not happy about the extra code in IRAM in that case, but maybe it's necessary. |
OK, take it apart guys. Please note #2407. The main changes are the incorporation on all of the review feedback, the merge with dev to resolve parallel changes to
Also note that because the LFS region is inside the 0x10000.bin file, you can't flash the bin file in the same # Pick the LFS region address from the ELF image
export LFS_BASE=$(xtensa-lx106-elf-objdump -t ../app/.output/eagle/debug/image/eagle.app.v6.out |grep flash_region_base| cut -b 4-8)
# Compile the LFS sources into the correct absolute bin LFS image
luac.cross -a 0x402$LFS_BASE -o $BIN/lfs-0x$LFS_BASE.img $LFS_SOURCE/*.lua
# Flash the firmware if needed. Note the no_reset flag
esptool --port /dev/ttyUSB0 --baud 460800 --after no_reset write_flash -fm dio 0x00000 $BIN/0x00000.bin 0x10000 $BIN/0x10000.bin
# Now flash the LFS (and optionally a new SPIFFS)
esptool --port /dev/ttyUSB0 --baud 460800 write_flash -fm dio 0x$LFS_BASE $BIN/lfs-0x$LFS_BASE.img This takes about 7sec to reimage the ESP firmware and LFS with the new Espressif |
@cwrseck @devsaurus @djphoenix @dnc40085 @dtran123 @drawkula @georeb You have all commented on / reviewed this LFS development in the past. I would very much like to merge this PR into We still need to add cloud-builder support, but even so it is going to be a lot more useful having the base PR in |
Aside from the error |
I was hoping to have had a chance to test out the latest at work this week, but it's looking increasingly unlikely that I'll find time. I was pretty happy overall with it at the last pass, and I've had a quick look through the latest commits and don't see anything that makes be go "eep!" :) |
I'll try to test drive the latest commits tonight. |
We've got other bits to add once it's in dev. For example:
|
Looks good 👍 |
OK, unless anyone shouts, I will do the merge in ~12 hrs time. I also need to do the script changes so that Marcel can add support for LFS in cloud builder. |
See #2292.
This is a BIG patch
dev
branch rather than formaster
.docs/en/*
.See the discussion in #2292. This PR:
user_config.h
).luac.cross
into the main make and removes the need for a Lua install on the host.I and @pjsg have build and tested this in light use for both float and integer builds. It probably needs a lot more testing before we switch on LFS as default, and as of this request, it can be configured as a build option.
I'll be moving my own home automation system over to this in the next few weeks. The days of 1,000 line+ Lua applications on the ESP8266 have arrived.
I don't recommend merging until the reviews and other contributors have evaluated this patched functionality and performance.