LFS evaluation version -- second release #2301

TerryE · 2018-03-17T01:08:01Z

This is a BIG patch

This PR is for the dev branch rather than for master.
This PR is compliant with the other contributing guidelines as well (if not, please describe why).
I have tested my contribution (see qualification below).
The code changes are reflected in the documentation at docs/en/*.

See the discussion in #2292. This PR:

Implements LFS for both integer and float builds. (Though note that this is disabled with the default user_config.h).
Moves luac.cross into the main make and removes the need for a Lua install on the host.
Optimises the make so that optional libraries such as sqlite3 are only compiled if the corresponding module is enabled.
Fixes the remote GDB function and makes it usable.

@pjsg and I have built and tested this in light use for both float and integer builds. It probably needs a lot more testing before we switch on LFS as default, and as of this request, it can be configured as a build option.

I'll be moving my own home automation system over to this in the next few weeks. The days of 1,000 line+ Lua applications on the ESP8266 have arrived.

I don't recommend merging until the reviewers and other contributors have evaluated the functionality and performance of this patch.

devsaurus · 2018-03-17T16:47:46Z

Linker warning for app/lua/luac_cross:

.output/host/release/obj/loslib.o: In function `os_tmpname':
loslib.c:(.text+0x28c): warning: the use of `tmpnam' is dangerous, better use `mkstemp'

tmpnam originates from luaconf.h and it appears that LUA_USE_POSIX is not defined. The latter would be set by LUA_USE_LINUX which is also undef. Is there a host OS detection missing in the make flow?

pjsg · 2018-03-17T14:36:58Z

app/Makefile

@@ -20,93 +20,64 @@ FLAVOR = debug
 ifndef PDIR # {
 GEN_IMAGES= eagle.app.v6.out
 GEN_BINS= eagle.app.v6.bin
-SPECIAL_MKTARGETS=$(APP_MKTARGETS)
+OPT_MKTARGETS      := coap crypto dht http mqtt pcm sjson sqlite3 tsl2561 u8glib ucglib websocket


I suspect that the u8g stuff doesn't work right -- see the current travis build failure

Can confirm, linker fails with lots of undefined references to u8g functions when LUA_USE_MODULES_U8G is defined.

@devsaurus Arnim, re the warning: this has been there since eLua days, so this is a case of no change. The NodeMCU make only works under the Xtensa toolchain on Linux, so this is a global issue with our build. Maybe we should do an OS detect and abort on anything else. But this isn't an issue introduced by this patch.

@pjsg, Agreed and this needs fixing, so I'll have a look at the Travis build and work out what is going wrong, but the assumption in the normal make is that you enable a given module by setting the corresponding define in user_modules.h. This Travis make seems to be overriding this by a backdoor and forcing all modules into a link, despite the settings of these defines. This is a QA artifact, as such a build will never be capable of being downloaded onto a real ESP module. It isn't a real-life issue but one of the way the Travis build is configured. I just need to mirror this short circuit in the make and detect a Travis build and similarly force the build of all of the associated subdirs.

Can confirm, linker fails with lots of undefined references to u8g functions when LUA_USE_MODULES_U8G is defined.

I'll take a look at this and fix. :)

My tweak to the magic works fine but

spiffsimg/spiffsimg -f ../bin/0x%x-4mb.img -S 4mb -U 0x7c2d0 -r ./spiffsimg/spiffs.lst -d

is dying with "Not enough space: fatal error". Why? Swapping the -S 4mb for -c 0x8000 works fine.

OK, I've bottomed this. The underlying issue is that most of these adjunct directories have the same root name as the module, e.g. coap uses coap, etc. The two u?g modules don't follow this pattern as u8g uses u8glib and ditto ucg. I just had to add a couple of extra magic lines to handle this second case.

pjsg · 2018-03-17T14:39:44Z

app/include/user_modules.h

@@ -65,13 +57,13 @@
 //#define LUA_USE_MODULES_SJSON
 //#define LUA_USE_MODULES_SNTP
 //#define LUA_USE_MODULES_SOMFY
-#define LUA_USE_MODULES_SPI
+//#define LUA_USE_MODULES_SPI


Did you mean to change the defaults here? (SPI and TLS)

Correct. I should back this out, but the set of settings in the default build needs to be revisited in a separate issue. E.g. I can't think of any of the devices that we currently support in our modules library that use SPI, so why do we include it by default? We include MQTT but not json. I'll make this change to minimize the out-of-scope side-effects, but this doesn't reduce the issue that the default set needs revisiting.

pjsg · 2018-03-17T14:40:23Z

app/lua/Makefile

@@ -24,7 +25,8 @@ STD_CFLAGS=-std=gnu11 -Wimplicit
 #   makefile at its root level - these are then overridden
 #   for a subtree within the makefile rooted therein
 #
-#DEFINES += 
+#DEFINES += -DDEVELOPMENT_TOOLS -DDEVELOPMENT_USE_GDB -DNODE_DEBUG -DBREAK_ON_STARTUP_PIN=1


I don't think that these should be the defaults....

Correct. These shouldn't be defaults, and that is why the first char on this line is a "#", so the make ignores this line. However, the most common alternative to the default is going to be uncommenting this line by removing the # char, which is why I have included it in this form.

pjsg · 2018-03-17T14:46:05Z

app/lua/lflash.c

+#define FLASH_PAGE_SIZE INTERNAL_FLASH_SECTOR_SIZE
+#define FLASH_PAGES  (FLASH_SIZE/FLASH_PAGE_SIZE)
+
+#define BREAK_ON_STARTUP_PIN  1  // GPIO 5 or setting to 0 will disable pin startup 


This should be settable from the user_config

Yes, good suggestion. We need to add this to the section on debugging and leave it prefixed by a // to comment it out by default.

Whatever, this define doesn't belong in lflash.c.

This entry needs removing.

The macro has also been renamed to DEVELOPMENT_BREAK_ON_STARTUP_PIN to have a unified prefix with related macro constants.

pjsg · 2018-03-17T14:51:24Z

app/lua/lflash.c

+  if (vfs_lseek(fd, -1, VFS_SEEK_END) != fh.flash_size-1 ||
+      vfs_lseek(fd, 0, VFS_SEEK_SET) != 0)
+    return 0;
+


Can you add a check to make sure that integer images only get loaded into integer runtimes.... (and vice versa)

Hummm. Excellent suggestion as (as we know 😜) the VM will barf if an Integer LFS is loaded into a Float RTL or v.v. We could set one bit in the Flash ID in word 0 to 0/1 depending on whether the underlying build is integer. I'll do this in the tweaks commit.

I was thinking that you could have two different magic header numbers -- one for float and one for integer

Having one bit flipped gives two different magic numbers, so I think we are basically saying the same thing.
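
As a rough illustration of the "one flipped bit" idea, here is a minimal C sketch. The base signature value, the bit position and the names FLASH_SIG_BASE, FLASH_SIG_INTEGER, flash_sig_for_build and flash_sig_matches are all hypothetical and are not taken from the patch; only the principle (one bit of word 0 distinguishes integer from float images) comes from the discussion above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values for illustration; the real header layout may differ. */
#define FLASH_SIG_BASE     0xfafaaf00u  /* common part of the magic word      */
#define FLASH_SIG_INTEGER  0x00000001u  /* set when the image is integer-only */

/* Signature that a runtime of the given flavour expects in word 0. */
static uint32_t flash_sig_for_build(bool integer_build) {
  return integer_build ? (FLASH_SIG_BASE | FLASH_SIG_INTEGER) : FLASH_SIG_BASE;
}

/* True only if the image header was produced for the same number type. */
static bool flash_sig_matches(uint32_t image_word0, bool integer_build) {
  return image_word0 == flash_sig_for_build(integer_build);
}
```

With this shape the loader can reject a mismatched image with a single comparison, and the two signatures are, in effect, two different magic numbers.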

I also note that this reload function simply returns nil if there is an issue (as opposed to not returning if all is OK). Would it be better to return some error code depending on the error?

Probably would be good to return an error code....

The fact that the call returns at all indicates a fatal error, but maybe we should just return an error message if we do return.

pjsg · 2018-03-17T14:54:17Z

app/lua/lgc.h

@@ -79,6 +79,7 @@
 #define VALUEWEAKBIT	4
 #define FIXEDBIT	5
 #define SFIXEDBIT	6


What is SFIXEDBIT? Does it matter that LFS uses the same bit?

If you look at the other bits, then you will see that bit 3 is also similarly denormalised. We've only got eight bits in the marked byte, and they've already all been allocated. Just as with the FIXEDSTACK, FINALIZED and KEYWEAK markings, this isn't an issue in practice -- in that only Proto and TString types can be LFS GCObjects and neither is super fixed.

That being said, it took a lot of head scratching and extra lua_asserts to make sure that this is true. 😄

pjsg · 2018-03-17T18:47:55Z

app/lua/lua.c

        } else {
          load->done = 1;
          need_dojob = true;
-          break;


I can't figure out what is going on here....

The original logic is -- how should I put it -- tangled. To be honest I can't remember why I removed the break and instead fell through to the following continue.

What the original code did was to break on receiving an EOL, whereas this continues to receive another char. I'll try both variants out and see which works the best.

Whatever, we've got whole blocks of code #if 0ed out since being hacked out for the first project commit. IMO, if they don't do anything and will never be reintroduced then they should be removed so we can see the wood for the trees.

pjsg · 2018-03-17T18:58:55Z

docs/en/modules/node.md

+`imageName` The of name of a image file in the filesystem to be loaded into the LFS.
+
+#### Returns
+_Not applicable_.  The ESP will load the LFS image and immediately reboot.  Control is not returned to the calling application.


There are a bunch of error conditions that cause this method to return.

Quite possibly. The documentation all needs tightening up, as well as adding a lua_examples directory for sample LFS init modules. However, my suggestion is that we should first reach a consensus on (a) whether this patch has enough merit to be included in dev, and (b) at what point we swap to LFS-enabled being the default.

These documentation issues (whilst important) are still secondary to these decisions.

jmattsson · 2018-03-20T03:59:06Z

make clean appears to do some dependency generation for luac_cross? I don't see any actual files, but a directory structure springs into existence under app/lua/luac_cross/.output

jmattsson · 2018-03-20T04:07:47Z

make[2]: Entering directory '/home/johny/src/DiUS/nodemcu-firmware/tools/spiffsimg'
g -Wall -Wextra -Wno-unused-parameter -Wno-unused-function -I. -I../../app/spiffs -I../../app/include -DNODEMCU_SPIFFS_NO_INCLUDE --include spiffs_typedefs.h -Ddbg_printf=printf main.c ../../app/spiffs/spiffs_cache.c ../../app/spiffs/spiffs_check.c ../../app/spiffs/spiffs_gc.c ../../app/spiffs/spiffs_hydrogen.c ../../app/spiffs/spiffs_nucleus.c  -o spiffsimg
make[2]: g: Command not found
Makefile:8: recipe for target 'spiffsimg' failed
make[2]: [spiffsimg] Error 127 (ignored)

Looks like $(CC) has been unset by the time spiffsimg is being built

jmattsson · 2018-03-20T04:13:25Z

# echo 'print("Hello, Lua!")' > local/fs/init.lua
# make
....
make[1]: *** No rule to make target '../local/lua/*.lua', needed by 'LFSimage'.  Stop.

jmattsson

This is a partial review only, I'll try to pick up again tomorrow. You weren't kidding when you said it was big! :)

jmattsson · 2018-03-20T05:11:10Z

app/platform/common.c

+static uint32_t allocated = 0;
+static uint32_t phys_flash_used_end = 0;  //Phyiscal address of last byte in last flash used sector 
+
+uint32_t platform_flash_reserve_section( uint32_t regsize, uint32_t *start )


Don't take this the wrong way Terry, but I don't think this is the right approach. We've kept bolting on various flash-allocation bits for a long time now. I think we should bite the bullet and get an actual partition table in place, like what the ESP32 has. That way we can get a definitive view of what's where, and no risk of race conditions. Here it would appear that if there's any conditional allocation the entire allocation will shift and we'll get Undefined Behaviour at best...

Johny, One step too far on this patch. Sorry.

jmattsson · 2018-03-20T05:19:50Z

app/lua/lflash.c

+ *  -  the lu_int16 offset of the next address pointer. 
+ */
+
+static int rebuild_core (int fd, uint32_t size, lu_int32 *buf) {


I would love to see this functionality more easily exposed (i.e. not static for starters). The way we'll want to be using the LFS feature at $work is to pre-build everything on the outside and flash an image which doesn't even know how to do node.flash.reload().

I had initially anticipated just adding a flag to luac.cross to produce a non-PIC image, but having looked at the format it'll probably be easier to write a separate utility which wraps rebuild_core to post-process the flash.img.

See comment below. It is simpler to add a -a option to luac.cross to produce an absolute version.

jmattsson · 2018-03-20T06:59:51Z

app/lua/lflash.c

+/*
+ * Return a C closure pointing to the Flash Index function
+ */
+LUAI_FUNC int luaN_index (lua_State *L) {


Right now one needs to do node.flash.index()('somemodule'). I don't see a good reason why this needs so much indirection? node.flash.index('somemodule') would seem much more usable.

Having node.flash.index() actually return an array or table rather than just do a printout (well, what node.flash.index()() does), would open up for more programmatic options.

Am I missing something here which would prevent this?

See answer to repeat of this below: tl;dr version: that's what Lua is for 😄

jmattsson · 2018-03-20T07:09:50Z

tools/Makefile

@@ -16,10 +17,22 @@ OBJDUMP = $(or $(shell which objdump),xtensa-lx106-elf-objdump)
 SPIFFSFILES ?= $(patsubst $(FSSOURCE)%,%,$(shell find $(FSSOURCE) -name '*' '!' -name .gitignore ))

 #################################################################
-# Get the filesize of /bin/0x10000.bin
+# Get the filesize of /bin/0x10000.bin and SPIFFS sizing
 #



On a default build with LFS enabled (set to 64k), the spiffs image gets built as 0x70000-*mb.img. Yet, the actual file system location is at 0xb0000.

I think this just reinforces that a partition table would be a very good thing...

My concern with adding a partition table is that on the ESP8266, at least, it would be just that: a bolt-on addition, and not properly integrated into the flash loader. So how do you allocate partitions on the ESP8266? Do we try to adopt the same conventions as on the ESP32, and if so how do we integrate them?

IMO, it would be better to have one of two approaches:

1. You can define the base and size of both SPIFFS and LFS as config defines. If you do then:

   - You can fix the absolute address of the LFS using the -a option, and ditto with spiffsimg.

   - user_main.c should do some bounds checks to make sure that everything fits properly.

2. You don't, in which case both fall back to using the platform flash allocator, and in this case there is nothing to stop the make process generating an internal table using the defines at (1), with the allocator using these if set.

Incidentally you can set the LFS size to any multiple of 4Kb up to 256Kb.
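
The bounds checks suggested above could look something like the following C sketch. It is only an illustration under stated assumptions: the type flash_region, the helper names and the firmware_end / flash_size parameters are hypothetical, and the 4 KB sector size is assumed; the "multiple of 4Kb up to 256Kb" rule for the LFS is the one quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE   0x1000u    /* assumed 4 KB flash sector            */
#define LFS_MAX_SIZE  0x40000u   /* 256 KB ceiling mentioned in the text */

/* A flash region covers the half-open range [base, base + size). */
typedef struct { uint32_t base, size; } flash_region;

static bool region_aligned(flash_region r) {
  return r.size != 0 && r.base % SECTOR_SIZE == 0 && r.size % SECTOR_SIZE == 0;
}

/* Written to avoid overflow: size is checked before it is subtracted. */
static bool region_fits(flash_region r, uint32_t lo, uint32_t hi) {
  return r.base >= lo && r.size <= hi && r.base <= hi - r.size;
}

static bool regions_overlap(flash_region a, flash_region b) {
  return a.base < b.base + b.size && b.base < a.base + a.size;
}

/* Both regions must be sector aligned, sit between the end of the firmware
 * and the end of the flash chip, and must not overlap each other. */
static bool flash_layout_ok(uint32_t firmware_end, uint32_t flash_size,
                            flash_region lfs, flash_region spiffs) {
  if (!region_aligned(lfs) || !region_aligned(spiffs)) return false;
  if (lfs.size > LFS_MAX_SIZE) return false;
  if (!region_fits(lfs, firmware_end, flash_size)) return false;
  if (!region_fits(spiffs, firmware_end, flash_size)) return false;
  return !regions_overlap(lfs, spiffs);
}
```

Because every input here is a compile-time define in approach (1), the same conditions could equally be expressed as static assertions so that a bad layout fails the build rather than the boot.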

The botch with SPIFFS is that the format is not PI, (though it could trivially be so). If it were then the SPIFFS images would also be PI. This would make life so much simpler.

I'd be happy enough to go with your approach, especially in the interest of getting LFS going. As long as the size/location gets set at compile time so we don't have any allocation races/ordering issues at runtime I think we're good.

As for spiffs, last I checked it's position independent - the phys/logic address translation takes place in the read/write accessor routines passed in to spiffs on initialisation. Not sure what makes you think it isn't position independent?

Not sure what makes you think it isn't position independent?

spiffsimg requires you to specify an absolute location for the image. I'll need to have a look at the source to see why it does this, but if SPIFFS is PI then spiffsimg should be as well. This would make life easier for everyone.

As the guy who wrote the initial spiffsimg before @pjsg co-opted into NodeMCU: nah, that's not what that argument really says. That's just one way to do the max fs size calculation :)
I readily confess that it should've all been documented better though...

Ah, the penny drops re the difference between the -c and -S, so yes, it should have been better documented 😁 In practice we have two use cases for a SPIFFS:

1. The Lua app makes little use of SPIFFS, so it just needs to be "big enough" (because of the flashing and startup overhead). I typically use a 64Kb and work up from that in 64Kb multiples, simply because during development testing, it is often quicker to reflash the SPIFFS than use other ways to update it.

2. The Lua developer isn't sure how much SPIFFS is going to be used, so when in doubt then make it as big as possible.

I suspect that most developers start with (2) but if you reflash your firmware a lot then you end up doing (1).

jmattsson · 2018-03-20T07:13:51Z

app/platform/common.c

+static uint32_t allocated = 0;
+static uint32_t phys_flash_used_end = 0;  //Phyiscal address of last byte in last flash used sector 
+
+uint32_t platform_flash_reserve_section( uint32_t regsize, uint32_t *start )


Don't take this the wrong way Terry, but I really don't think this is the way to go. We've kept on bolting on various flash allocation bits over the years. I think it's time we bite the bullet and switch to a partition table like on the ESP32. That way we can easily tell where the spiffs image(s?) are, the LFS area, SDK reserved areas, etc.

With the approach here we're effectively introducing race/ordering problems where things may move around unexpectedly in flash. Having a partition table would keep things a lot cleaner and safer.

See comment above. This is really an issue in its own right and maybe worth tracking as such.

jmattsson · 2018-03-20T07:21:26Z

app/lua/lflash.c

+ *  -  the lu_int16 offset of the next address pointer. 
+ */
+
+static int rebuild_core (int fd, uint32_t size, lu_int32 *buf) {


I'd love for this to be more easily exposed (so not static for starters). The way we'd like to use the LFS feature at $work is to pre-build the entire image and flash (or OTA-upgrade) that, with the actual Lua code not even knowing how to node.flash.reload().

I'd originally envisaged just adding a flag to luac.cross to spit out non-PIC addresses, but having seen the format now, I'm thinking it would be easier to write a separate utility that post-processes the flash.img from PIC to relocated-ready-to-be-dumped-into-flash format. That will need rebuild_core though (possibly with flashBlock passed in as a function pointer for flexibility).

I think that I mentioned somewhere that luac.cross is so fast that it is easier just to add a -a switch to fix the LFS at a fixed address, and run the luac.cross multiple times if necessary.

This still leaves the need to do address calculation in your make. The alternative is to embed this functionality in a smarter esptool.

As far as provisioning goes, I would class DiUS and me as advanced users, each with our own sweet spot.

In my case I have my own provisioning system. I very rarely want to change the base firmware (for example my home automation system is still running a build of dev from last Aug). But I do use my own provisioning system a lot. This probes a flag byte in RTCmem and if set then the ESP end does a reprovision on reboot. I've integrated this into my ESP apps as a reprovision command, so each ESP config has its own master directory hierarchy on the host provisioning server. If I want to make a change then I just change the master file on the server and issue a reprovision command to the ESP app, and bang -- magic happens -- and a few seconds later the ESP is rebooted with the new app running. I am still testing the LFS changes but this is all pretty transparent to the app -- other than that LFS allows you to code your apps in a simpler, cleaner way because you don't have to worry about the code footprint in RAM.

However, all this does rely on you having a network path from each ESP to some provisioning server. So this isn't a good solution for a lot of embedded IoT uses.

The way that I currently do the bootstrap is to have an init.lua stub which either runs the node.flash.reload() if LFS isn't loaded or chains into the init module in the LFS. From there on in, all encapsulation is done by Lua features.

If you think a -a switch is the better approach, sure, I don't mind.

jmattsson · 2018-03-20T07:24:18Z

app/lua/lflash.c

+/*
+ * Return a C closure pointing to the Flash Index function
+ */
+LUAI_FUNC int luaN_index (lua_State *L) {


Currently one seems to need to do node.flash.index()('somemodule'). Is there any real reason for that much indirection? It would seem much more "normal" to do node.flash.index('somemodule').

Also, I'd much prefer to have the index itself returned as a table (local t = node.flash.index()) so it can be read programmatically, compared to just printed out (what node.flash.index()() currently does). [Edit: I'm an idiot, I was printing the return value]

Am I missing something here that would prevent this simplification?

The previous fact that Lua ran from RAM and C ran from flash led to the practice of "when in doubt, code it in C", even when a direct Lua approach is simpler and more transparent, and therefore easier to modify for the typical Lua developer. In this case it would be easier for the Lua application programmer to say lfs.someroutine(args) to invoke someroutine in the LFS. Yes, we could hard code this in C but this is only a couple of lines of Lua which would be in your init routine in the LFS, so the only kludgy reference would be the first node.flash.index()'init'() call and if we allowed the user to define LUA_INIT_STRING in the config then with

#define LUA_INIT_STRING "pcall(function()node.flash.index()'init'()end)"

then they wouldn't even need to do that. It could even handle the special case of lfs.index().

With the switch to LFS, I would also like to see a switch to the core dogma:

Only use C if you can't do it in Lua or there are strong performance reasons why it must be done in C.

Just think: change a line of C and you have to reflash the entire firmware. Change one line of Lua and you (or your provisioning subsystem) only needs to replace one SPIFFS file. A mega hassle vs. a few seconds.

For example we've got a C ds18b20 module which is slow and far too limited functionality. My Lua version is both richer and a lot faster. So why use C? The old answer was because C ran from flash so took up less RAM footprint. This argument no longer applies.

As to this LFS init routine, if you do a -p option on the luac.cross then you will see that this doesn't use a table which some C routine parses, the code is a directly emitted Lua function.

And as I have said elsewhere, is it really that hard to include something like the following in the LFS init module?

do
  local ndx = node.flash.index
  local lfs_t = { __index = function(_, name)
      local fn = ndx(name)
      if type(fn) == 'function' then return fn end
      -- or return nil implied
    end}
  getfenv().lfs = setmetatable(lfs_t,lfs_t)
end

The RAM footprint is a table with a single entry in it and a LClosure with a single upval. The LFS footprint is a code fragment of 16 instructions and a function of 10 instructions.

You earlier argued that both mine and your uses were "advanced" (probably rightly so), so I think I'll level that same comment back at you here :) Meta tables aren't exactly the most obvious things when one starts out with Lua.

I personally was looking at node.flash.index as a drop-in replacement for dofile (or even require). It's something existing developers are familiar with both concept wise and syntax wise, and it matches how the files/modules get interned into the LFS. Where before you would've mymod = dofile('mymod.lc') you could (assuming suggested change) mymod = node.flash.index('mymod'). To me that would be a nice path into the LFS.

To address the dogma shift, yes, I agree in principle, but we'd need to provide an easy way for users to build the flash.img. Right now Marcel's cloudbuilder is saving everyone from needing their own build environment, which is fabulous from a usability perspective. It does make it hard to start using LFS with this though. Say @marcelstoer, do you think it would be feasible to also have an LFS builder service? A user uploads their .lua files and we spit out either a PIC flash.img to be node.flash.reload()ed or an lfs.bin to be esptool'd straight onto the device? If I had time and somewhere to host it I could probably put something together (insert bemoaning of lack of time here).

This is my init.lua

if file.rename('flash.img', 'loaded.img') then
  node.flash.reload('loaded.img')
end
node.flash.index()('init')()

Then my init.lua in the ROM starts with

local index = node.flash.index and node.flash.index()
local function loader_flash(module)
  local r = index(module)
  return type(r) == 'function' and r -- or nil otherwise
end
if index then
  package.loaders[3] = loader_flash
end

@pjsg Maybe you can explain what I'm missing here? Why do we need to do

local index = node.flash.index and node.flash.index()

rather than

local index = node.flash.index

What feature is gained by requiring that function call? Why can't that be hidden inside the node.flash.index function itself?

@jmattsson this "and" style is because:

if the LFS isn't compiled in then node.flash returns nil and so evaluating node.flash.index will throw an exception, causing a panic.

if the LFS is compiled in, but the LFS isn't loaded, then node.flash.index returns nil and so evaluating node.flash.index() will throw an exception, causing a panic. In the case of running from LFS, the "node.flash.index and" guard is redundant because this code can assume that LFS is both compiled and loaded, and this makes life a lot simpler.

My preferred init.lua approach is:

pcall(function() local f=node.flash; return f.index and f.index()('init')() or f.reload('.img') end)

which is guaranteed not to panic, and I do the update reload using my provisioning system.

I chose the Lua route because (a) it is simple; and (b) it leaves it to the individual developer / project to implement their own Lua encapsulation that's right for them. This got the patch out there to evaluate. I don't think that using meta tables is that much of an issue as we can include a default in Lua examples and refer to it in the API reference. Novice Lua programmers can use it as-is and more advanced ones can use it as a starting point to modify.

I will give an analogy: Enduser_setup is written in C, but I regularly see Lua developers post issues "I don't know how to write C modules, but how do I modify it to do XYZ?" If it was in Lua, then they could just tweak the code.

This all being said, changing the API is pretty straight forward, so long as we first agree what we want to change it to. Let's pick up this general point on #2292

jmattsson · 2018-03-21T02:40:07Z

Something's not right here. Default build of this PR, with LFS enabled (still 64k).

$ cat local/lua/wtf.lua 
function wtf(arg)
  print(arg)
  return 1
end

Build, flash, node.flash.reload() the flash.img file, reboot the esp.

NodeMCU 2.1.0 build unspecified powered by Lua 5.1.4 on SDK 2.1.0(116b762)
> wtf=node.flash.index()('wtf')
> =wtf('hi')
> =wtf('hi')
hi
1
>

Why isn't the first call to my wtf function giving me any output or returning a value?

jmattsson · 2018-03-21T03:26:10Z

app/lua/lgc.c

@@ -538,7 +555,7 @@ static void atomic (lua_State *L) {
  size_t udsize;  /* total size of userdata to be finalized */
  /* remark occasional upvalues of (maybe) dead threads */
  remarkupvals(g);
-  /* traverse objects cautch by write barrier and by 'remarkupvals' */
+  /* traverse objects caucht by write barrier and by 'remarkupvals' */


I think you mean "caught" 🤣

A bit naff replacing one spelling error by another !!

And I'll take a look at your previous test case.

@jmattsson, re your previous example, this is a mindfart on your part. 😀 It does exactly what I would expect it to do, and would have the same result as wtf=loadfile('wtf.lua') if this was in the SPIFFS. Do a luac.cross -l -o /dev/null local/lua/wtf.lua to see what this generates:

The first execution executes the following, which replaces the global wtf with the closure:

1 [4] CLOSURE 0 0 ; 0x258b2d0
2 [1] SETGLOBAL 0 -1 ; wtf
3 [4] RETURN 0 1

wtf now points to the closed function so the second call executes the following to give you the response you expected.

1 [2] GETGLOBAL 1 -1 ; print
2 [2] MOVE 2 0
3 [2] CALL 1 2 1
4 [3] LOADK 1 -2 ; 1
5 [3] RETURN 1 2

Try replacing wtf.lua by the following and it will do what you expect:

-- function(...)
local arg = ...
print(arg)
return 1
-- end

Umm.. what? I'm not reassigning wtf between the invocations. I'm assigning the global wtf exactly once, then calling it twice. I sure as sundown don't expect my wtf variable/function to be reassigned just because I invoked it (given I don't reassign it in the function). Kindly explain my supposed mindfart further.

No, you are getting caucht (joke) in the subtleties of Lua. If you've got 15 mins, then just Skype me or ping me an email and I'll Skype you.

As I said, if you put the same file in SPIFFS and did a wtf=loadfile('wtf.lua') then you'd see the same.

As Terry promised me: ./facepalm
Okay, the penny dropped: function wtf() end is of course the same as wtf = function() end, and the initial wtf was the file/module closure itself.

jmattsson · 2018-03-21T03:33:08Z

app/lua/lobject.h

-/* we could define it this way */
-#define setttype(obj, _tt) ( ttype_sig(obj) = add_sig(_tt) )
-#endif // #ifndef LUA_PACK_VALUE
+#define setttype(obj, stt) ((void) (obj)->value, (obj)->tt = (stt))


Hmm? Are you trying to silence some compiler warning here or something?

There are two tt fields, one in the TValue and (in the case of GCObjects) one in the GCO header, and the code should not confuse the two. (Lua 5.3 does this by renaming the latter to tt_.) The ttype(o) and setttype(o,t) macros should only be used on the TValue type.

The (void) (obj)->value guard is compiled but discarded during code optimisation, but it still throws a compile error if these are used on the GCOs, and this allowed me to fix a few cases that had been missed in the eLua changes.
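
To make the guard concrete, here is a small standalone C sketch. The two struct definitions are simplified stand-ins rather than the real Lua declarations, and set_and_read is a hypothetical helper; only the setttype macro body is taken from the diff above.

```c
#include <stdint.h>

/* Simplified stand-ins for illustration; the real Lua types carry more fields. */
typedef struct { uint8_t tt; uint8_t marked; } GCheader;   /* has no 'value' member */
typedef struct { union { void *p; double n; } value; int tt; } TValue;

/* The left operand of the comma expression touches obj->value, so the macro
 * only compiles for types that actually have a 'value' member (a TValue,
 * not a GC object header). The read has no side effects, so an optimising
 * compiler is free to discard it. */
#define setttype(obj, stt) ((void) (obj)->value, (obj)->tt = (stt))

/* Hypothetical helper showing the macro applied to a TValue. */
static int set_and_read(TValue *o, int t) {
  setttype(o, t);
  return o->tt;
}

/* By contrast, this would be rejected at compile time ("no member named
 * 'value'"), which is the whole point of the guard:
 *
 *   GCheader h;
 *   setttype(&h, 4);
 */
```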

The reason that this is important for LFS is that the TValue form is a 32 bit integer and so testing this doesn't throw a l8ui exception. What I intend to do is to replace the relatively small number of GCO tt tests by a macro form which generates the extra extui instruction and avoids the unaligned exception.

Aaaah, that makes sense. Cool hack. Maybe worth leaving a comment along those lines there for future reference? E.g. // guard against confusion between GCO's and TValue's "tt" field by referencing the "value" member too

In fact I just copied the same hack from somewhere else. I should really back it out or make it a //#define variant because I really only need this when I need to do type checking in macros and I suspect that it does leave the code in at -O0.

jmattsson · 2018-03-21T03:39:43Z

app/lua/lua.c

@@ -468,11 +324,8 @@ int lua_main (int argc, char **argv) {

 void lua_handle_input (bool force)
 {
-  while (gLoad.L && (force || readline (&gLoad)))
-  {
+  if (gLoad.L && (force || readline (&gLoad)))


Whoa, this change makes me nervous. I explicitly changed it to a while to fix the issue of lost commands on the serial. If more than one line buffers up the LVM would fall behind in its input processing.

To be honest, I can't remember why I did this. I will take a look at the history. My first pass through merged in the changes from a vanilla Lua-5.1 variant and I may just have missed this in the merge.

But the whole interactive interface is very flaky anyway. It is really designed for a more POSIX-like environment. The way it determines compilation units is to stack the lines and try:

line 1

line 1 + line 2

line 1 + line 2 + line 3

...

Until the sequence compiles. It then executes the chunk and treats the next line as line 1. In the meantime there is no flow control (hardware or XON/XOFF) on the input uart so it is very easy to overflow the 128 byte input FIFO on the UART, with or without this change.
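
The line-stacking scheme described above can be sketched in C roughly as follows. This is not the code in lua.c: chunk_acc, feed_line, reset_chunk and the compiles_fn callback are hypothetical names, and toy_compiles (which just checks that parentheses balance) is a deliberately trivial stand-in for asking the Lua compiler whether the accumulated text is a complete chunk.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_MAX 256

/* Hypothetical probe: returns true if the text compiles as a whole chunk. */
typedef bool (*compiles_fn)(const char *src);

typedef struct {
  char   buf[CHUNK_MAX];  /* line 1 + line 2 + ... joined by newlines */
  size_t len;
} chunk_acc;

/* Toy stand-in for the compiler: "complete" when parentheses balance. */
static bool toy_compiles(const char *src) {
  int depth = 0;
  for (; *src; src++) {
    if (*src == '(') depth++;
    else if (*src == ')') depth--;
  }
  return depth == 0;
}

static void reset_chunk(chunk_acc *acc) {
  acc->len = 0;
  acc->buf[0] = '\0';
}

/* Stack one more input line and retry the compile. Returns true once the
 * stacked lines compile; the caller then runs acc->buf and resets. */
static bool feed_line(chunk_acc *acc, const char *line, compiles_fn compiles) {
  size_t n = strlen(line);
  if (acc->len + n + 2 > CHUNK_MAX) {  /* need room for '\n' and NUL */
    reset_chunk(acc);                  /* overflow: drop the partial chunk */
    return false;
  }
  memcpy(acc->buf + acc->len, line, n);
  acc->len += n;
  acc->buf[acc->len++] = '\n';
  acc->buf[acc->len] = '\0';
  return compiles(acc->buf);
}
```

Note that every new line triggers a recompile of everything stacked so far, which is part of why the interpreter can fall behind a fast sender when there is no flow control.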

Also the whole of the interactive interface including the input / output redirector is extremely flaky.

However all of this is separate to this change so I will back this one out here.

jmattsson · 2018-03-21T03:47:29Z

app/platform/platform.c

@@ -879,7 +879,7 @@ uint32_t platform_s_flash_write( const void *from, uint32_t toaddr, uint32_t siz
  if(SPI_FLASH_RESULT_OK == r)
    return size;
  else{
-    NODE_ERR( "ERROR in flash_write: r=%d at %08X\n", ( int )r, ( unsigned )toaddr);
+    NODE_ERR( "ERROR in flash_write: r=%d at %p\n", ( int )r, ( unsigned )toaddr);


We actually have %p support in the printf? I thought we didn't (but I could be getting myself confused with one of the other platforms I work on).

It's fairly recent, but still already there: #2062

jmattsson · 2018-03-21T03:49:52Z

app/platform/platform.c

+uint32_t platform_flash_mapped2phys (uint32_t mapped_addr)
+{
+ uint32_t meg = flash_map_meg_offset();
+ return (meg&1) ? -1 : mapped_addr - INTERNAL_FLASH_MAPPED_ADDRESS + meg ;


(meg <= 0) a bit clearer as to what the checks here (and a few lines further down) do?

The platform API already had the mapping one way. I just added the inverse map using the same checks.
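For reference, a host-side sketch of the two mappings as a matched pair. MAPPED_BASE is assumed to equal INTERNAL_FLASH_MAPPED_ADDRESS (0x40200000 on the ESP8266), and stub_meg_offset is a hypothetical stand-in for flash_map_meg_offset(). The sketch also assumes the failure value is all ones, so its low bit is set, while valid offsets are multiples of 1 MB; that would be what the (meg&1) test relies on.

```c
#include <assert.h>
#include <stdint.h>

#define MAPPED_BASE 0x40200000u   /* assumed INTERNAL_FLASH_MAPPED_ADDRESS */
#define MAP_INVALID 0xFFFFFFFFu   /* what -1 becomes in a uint32_t return */

/* Hypothetical stand-in for flash_map_meg_offset(): the physical offset of
 * the megabyte currently mapped, or 0xFFFFFFFF if it cannot be determined. */
uint32_t stub_meg_offset = 0;

/* Mapped (CPU-visible) address -> physical flash offset. */
uint32_t mapped2phys(uint32_t mapped_addr) {
  uint32_t meg = stub_meg_offset;
  return (meg & 1u) ? MAP_INVALID : mapped_addr - MAPPED_BASE + meg;
}

/* Physical flash offset -> mapped address: same check, opposite signs. */
uint32_t phys2mapped(uint32_t phys_addr) {
  uint32_t meg = stub_meg_offset;
  return (meg & 1u) ? MAP_INVALID : phys_addr + MAPPED_BASE - meg;
}
```

With stub_meg_offset at 0, mapped address 0x402A0000 corresponds to flash offset 0xA0000 and back again; with the stub set to 0xFFFFFFFF both functions return MAP_INVALID.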

jmattsson · 2018-03-21T03:50:40Z

app/user/user_exceptions.c

    }
-    asm ("break 1, 1");
+     asm ("break 1, 1");


Surplus whitespace, or intentional?

I'd lose the while(1) {} and replace it by a while around the break. But a job for another day.

The break will instantly reboot the ESP if the debugger exception handler is not installed. If it is, then you will enter the debugger.

jmattsson · 2018-03-22T03:54:34Z

app/lua/lflash.c

+/*
+ * Flash memory is a fixed memory addressable block that is serially allocated by the
+ * luac build process and the out image can be downloaded into SPIFSS and loaded into
+ * flash with a node.flash.load() command. See luac_cross/lflashimg.c for the build


Nitpick, the command is currently node.flash.reload(), and subject to iteration :)

Yup, but as we discussed these two might be better stripped back. Whatever ... will pick this up on the sweep.

jmattsson · 2018-03-22T04:36:52Z

Here's one way of solving the build-fails-if-no-local/lua-files-exist issue:

diff --git i/tools/Makefile w/tools/Makefile
index 608e985..745bda5 100644
--- i/tools/Makefile
+++ w/tools/Makefile
@@ -69,8 +69,13 @@ spiffsscript: remove-image LFSimage spiffsimg/spiffsimg
        $(foreach sz, $(FLASHSIZE), spiffsimg/spiffsimg -f ../bin/0x%x-$(sz).img  $(FLASH_SW) $(sz) -U $(FLASH_FS_LOC) -r ./spiffsimg/spiffs.lst -d; )
        @$(foreach sz, $(FLASHSIZE), if [ -r ../bin/spiffs-$(sz).dat ]; then echo Built $$(cat ../bin/spiffs-$(sz).dat)-$(sz).bin; fi; )
        
-LFSimage: $(LUASOURCE)*.lua
-       ../luac.cross -f -o $(FSSOURCE)flash.img $(LUASOURCE)*.lua
+LFSSOURCES:=$(wildcard $(LUASOURCE)/*.lua)
+ifneq ($(LFSSOURCES),)
+LFSimage: $(LFSSOURCES)
+       ../luac.cross -f -o $(FSSOURCE)flash.img $(LFSSOURCES)
+else
+LFSimage: ;
+endif
 
 remove-image:
        $(foreach sz, $(FLASHSIZE), if [ -r ../bin/spiffs-$(sz).dat ]; then rm -f ../bin/$$(cat ../bin/spiffs-$(sz).dat)-$(sz).bin; fi; )

marcelstoer · 2018-04-02T11:26:22Z

Espressif claims they squeezed 17KB extra memory out of their RTOS SDK 2.0...niice. Regardless of what we do here it'd be nice if they gave the NON-OS SDK just as much love.

TerryE · 2018-04-02T11:54:41Z

Yup, but when you go through the details in their documentation, then you see that the way that they do this is to drop the instruction cache from 32K to 16K and use the extra 16K freed by doing this as heap.
Given that most of our code execution is from flash and this relies on the instruction cache to get reasonable runtime, this isn't a patch that I'd want on my ESPs. I suspect that it's pretty much a desperation measure to make the ESP8266 RTOS version usable.

TerryE · 2018-04-03T06:21:37Z

So I have been working through the correction list for updates to this patch. The update list is far larger than the majority of other PRs! So what I am proposing to do is to rebaseline the patch so that it incorporates two commits:

The first of these is a roll-up of the four initial commits on the Feb PR.
The second includes all of the review additions, so that reviewers can focus on this second commit.

This might force me to close and reopen this PR, but let's see.

Note that this will also implement the eLua bugfix and the rest of #2333, so that as well as being able to move all of the program and string data out into LFS, the remaining RAM-based TValue-related resources will be 75% of the footprint (if you define LUA_PACK_TVALUES_ENABLE in your build).

@jmattsson, I am surprised that you haven't picked up and commented on this little gem! 😄

jmattsson · 2018-04-03T06:35:27Z

@TerryE Why? I iz veri bizi... :(
Looks like an easy win (provided there are no gremlins in the corner cases), so go for it.

As for rebaselining the patch, I'd recommend not doing it. Github does a good job of remembering which version comments were applied to, so I think you might end up complicating things while trying to simplify them. Just push more commits here as-is; we can always tidy up if needed.

(And geez, stop taking all the fruit at once - surely we'll want to be able to find these juicy bits in the future too and feel all good about further freeing up memory! ;-P )

marcelstoer · 2018-04-12T15:25:47Z

The Travis CI build failed

Did you look into that already?

TerryE · 2018-04-12T19:39:53Z

Did you look into that already?

Is the Pope catholic? 😉

marcelstoer · 2018-05-22T15:34:20Z

app/include/user_config.h

@@ -1,37 +1,206 @@
 #ifndef __USER_CONFIG_H__
 #define __USER_CONFIG_H__



With all the relevant documentation now inside this file could you maybe add a note with a link to it to the first paragraph at https://nodemcu.readthedocs.io/en/latest/en/build/#build-options?

Will do 😄

TerryE · 2018-05-27T16:36:55Z

I've just found a bug when porting my latest telnet version to LFS. (It works fine on the latest master.) This LFS version fails on node.input() handling. Any node.input() causes the readline processor to go into a loop; only terminating on WD timeout. Entering this minimal command at the interactive prompt shows this:

node.task.post(function()node.input "='hello'"; end)

Sorry guys.

TerryE · 2018-05-29T11:41:45Z

PS. The node.input() readline loop reported in my previous comment is now fixed.

TerryE · 2018-06-12T22:19:19Z

As some of you might have noticed, I may seem to have been side-tracked working on getting LFS-optimised versions of my provisioning system, together with the telnet and ftp servers. But there was some rationale to this: I wanted to hammer LFS myself in a real development context before pushing the "final" PR to dev. It has been 100% solid. Ditto the use of 12-byte TValues, so I suggest we also move to 12-byte TValues on this PR as the default for floating point builds.

Another observation: @jmattsson was absolutely spot-on in requesting an "absolute" option for LFS images. I use a script file to rebuild my LFS and reimage it directly to a test ESP module. The whole cycle takes less than 10 seconds, so I find it easier just to develop directly in LFS and not even bother with moving in-test modules onto SPIFFS, because it is simpler and faster to work directly with all modules in LFS. I do use the relocatable version but only for production luaOTA applications.

jmd13391 · 2018-06-17T10:20:16Z

Is there a rough ETA for LFS to land in DEV?

TerryE · 2018-06-17T11:18:23Z

I've got about a day's work and it's next on my TODO list. 😚

TerryE · 2018-06-20T18:00:42Z

Sorry for the delay guys. I am trying to track down a weird bug to do with the new method of SPIFFS and LFS inter-operating through linker-based static allocation.

If LFS is disabled, then everything is OK.
If LFS is enabled but with a zero sized LFS store then ditto.
If LFS has any store allocated then something else (?SPIFFS?) is stomping over the top end of flash (the LFS code is never called).

Lots of head scratching. Just waiting for that "oh shit, that's it" moment.

jmattsson · 2018-06-21T05:35:35Z

@TerryE forgot to move the flash_used_end symbol in the linker file perhaps?

Try running a delta against the branch I linked above some time ago. I had both LFS & SPIFFS going happily with static allocation.

TerryE · 2018-06-21T06:00:53Z

I will have my own palm slap when I find it. Nothing so obvious. It's a sod. If I write then read-verify the flash, then it's OK. But if I run the firmware and then read, it gets stomped. I've added some temporary #ifdef code so that I can set the LFS to zero length, and that works. With 64K it runs the same exec path (basically a no-op if the flash signature isn't correct) and the stomp occurs. And very early. I'll find it today.

TerryE · 2018-06-21T18:07:35Z

OK, definitely a head scratch here.

I am using 32Mbit (4MB) Wemos parts for my test harness. If I explicitly initialise the flash header to 4MB size using the -fs 32m option in my esptool.py flash download, then everything seems to work fine.
If I don't then a flash sector near the end of my image gets erased, stomping some code. I've put in some diagnostics and this occurs before nodemcu_init() is posted. More investigation needed. Sorry guys.

PS. This appears to be @devsaurus Arnim's patch #1968 to the user_start_trampoline() to support 2.2.0. If the size byte isn't set, then the flash_end_addr is in the application binary. "If the size byte is wrong, then we'll end up fixing up the init data again on the next boot" doesn't work if you've already stomped the binary. Impossible to debug since this is trampoline code running before the irom has been mapped. Uaarrhh.

jmattsson · 2018-06-22T00:04:42Z

...if you don't, then the SDK uses whatever is in there to look for its config data, and if there's a bad checksum writes new config blocks there (last three sectors, as determined by the flash header). We used to catch an incorrect flash size byte before we got into the SDK to avoid just such a problem, so maybe that's gotten broken somehow? You haven't accidentally stopped the early init stuff from being called, have you? (e.g. by dropping the custom entry point setting)
This is the user_start_trampoline(), normally set as the entrypoint by ld/nodemcu.ld.
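To make the failure mode concrete, here is a small host-side sketch of the arithmetic. It takes the "last three sectors" figure from the comment above and assumes the usual 4 KB SPI flash sector. The function names and the image extents are hypothetical illustration values, not taken from the SDK or from a real build.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE        0x1000u  /* 4 KB SPI flash sector */
#define SDK_CONFIG_SECTORS 3u       /* "last three sectors", per the thread */

/* First byte of the SDK config area for the flash size that the header claims. */
uint32_t sdk_config_start(uint32_t header_flash_size) {
  assert(header_flash_size >= SDK_CONFIG_SECTORS * SECTOR_SIZE);
  return header_flash_size - SDK_CONFIG_SECTORS * SECTOR_SIZE;
}

/* True if the config area implied by the header falls inside the half-open
 * flash range [image_start, image_end) occupied by the firmware image. */
bool sdk_config_overlaps(uint32_t header_flash_size,
                         uint32_t image_start, uint32_t image_end) {
  uint32_t cfg_start = sdk_config_start(header_flash_size);
  uint32_t cfg_end   = header_flash_size;   /* exclusive */
  return cfg_start < image_end && image_start < cfg_end;
}
```

If the header claims 512 KB on a 4 MB part, the config area starts at 0x7D000. A hypothetical image occupying 0x10000 to 0x90000 covers that area, so rewriting the config blocks erases part of the image. With the header set to 4 MB the area starts at 0x3FD000 and the same image is untouched.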

TerryE · 2018-06-22T00:54:07Z

@jmattsson Johny, I've just noticed that we cross posted, having come to the same conclusion. See my PS to my previous comment. But thanks for the input :)

jmattsson · 2018-06-22T02:13:20Z

Right, so we need to move the fixing-up of the size-byte into the trampoline too to ensure we have everything in order before the SDK can run? Not happy about the extra code in IRAM in that case, but maybe it's necessary.

TerryE · 2018-06-22T15:18:42Z

OK, having thought about this one, this isn't an LFS issue, but more one that larger images such as LFS builds can excite. Therefore I have raised #2407 to keep track of this but I am assuming that any LFS users will avoid this issue by the steps listed in #2407.

TerryE · 2018-06-22T22:08:51Z

OK, take it apart guys. Please note #2407. The main changes are the incorporation of all of the review feedback and the merge with dev to resolve parallel changes to node.c, including:

Johny's static allocation of the LFS region.
LUA_INIT_STRING now works but be careful to avoid an infinite error loop, so start by tweaking the example supplied in user_config.h
lua.c exposed the internals of the lua_Load structure, and this was messily manipulated in node.c and coap.c. This is now hidden, and an access routine lua_put_line() gives encapsulated access.

Also note that because the LFS region is inside the 0x10000.bin file, you can't flash the bin file in the same esptool.py command as this has region overlap detection. The workaround is a variant of this:

# Pick the LFS region address from the ELF image
export LFS_BASE=$(xtensa-lx106-elf-objdump -t ../app/.output/eagle/debug/image/eagle.app.v6.out |grep flash_region_base| cut -b 4-8)
# Compile the LFS sources into the correct absolute  bin LFS image 
luac.cross -a 0x402$LFS_BASE -o $BIN/lfs-0x$LFS_BASE.img $LFS_SOURCE/*.lua
# Flash the firmware if needed.  Note the no_reset flag
esptool --port /dev/ttyUSB0 --baud 460800 --after no_reset write_flash -fm dio  0x00000 $BIN/0x00000.bin 0x10000 $BIN/0x10000.bin
# Now flash the LFS (and optionally a new SPIFFS)
esptool --port /dev/ttyUSB0 --baud 460800 write_flash -fm dio 0x$LFS_BASE  $BIN/lfs-0x$LFS_BASE.img

This takes about 7sec to reimage the ESP firmware and LFS with the new Espressif esptool.py
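The address handling in the script above can be read as follows: cut -b 4-8 keeps the low five hex digits of the 8-digit symbol address, which serve both as the flash offset passed to esptool and, with the 0x402 prefix restored, as the absolute address given to luac.cross -a. A small C sketch of that slicing; lfs_base_from_symbol() and the sample address used below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Mirror `cut -b 4-8` on an 8-digit hex symbol address such as "402a0000":
 * characters 4..8 are the low 20 bits, i.e. the offset into flash.
 * Returns 0 on success, -1 if the text is too short or not hex. */
int lfs_base_from_symbol(const char *sym_hex,
                         uint32_t *flash_offset, uint32_t *mapped_addr) {
  assert(sym_hex != NULL && flash_offset != NULL && mapped_addr != NULL);
  if (strlen(sym_hex) < 8)
    return -1;

  char low[6];
  memcpy(low, sym_hex + 3, 5);      /* bytes 4..8, 1-based as in cut */
  low[5] = '\0';

  char *end = NULL;
  unsigned long v = strtoul(low, &end, 16);
  if (end == low || *end != '\0')
    return -1;

  *flash_offset = (uint32_t)v;                /* used as 0x$LFS_BASE */
  *mapped_addr  = 0x40200000u + (uint32_t)v;  /* used as 0x402$LFS_BASE */
  return 0;
}
```

For a symbol at 402a0000 this yields flash offset 0xA0000 and mapped address 0x402A0000. Because the 0x402 prefix is hard-wired, the approach only holds while the region's mapped address starts with those three digits.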

TerryE · 2018-06-25T22:16:35Z

@cwrseck @devsaurus @djphoenix @dnc40085 @dtran123 @drawkula @georeb
@HHHartmann @jmattsson @joysfera @marcelstoer @NicolSpies @nwf @pjsg

You have all commented on / reviewed this LFS development in the past. I would very much like to merge this PR into dev, but I need at least one independent reviewer to take it for a ride to make sure that I haven't made any serious cockups in this latest set of changes following the previous feedback. So can I request that if you can find the time, then can you please try it out and post back here. Thanks.

We still need to add cloud-builder support, but even so it is going to be a lot more useful having the base PR in dev rather than my fork.

dnc40085 · 2018-06-26T03:11:11Z

Aside from the error "Makefile:75: *** missing separator. Stop.", the current version builds and runs my application without issue. 👍

jmattsson · 2018-06-28T08:12:48Z

I was hoping to have had a chance to test out the latest at work this week, but it's looking increasingly unlikely that I'll find time. I was pretty happy overall with it at the last pass, and I've had a quick look through the latest commits and don't see anything that makes me go "eep!" :)
Don't let my absence hold this back any longer, my feeling is that it's good enough to get into dev by now. We can always continue to fix & tweak as necessary.

devsaurus · 2018-06-28T08:47:47Z

I'll try to test drive the latest commits tonight.

TerryE · 2018-06-28T09:33:20Z

We've got other bits to add once it's in dev.

For example:

I am very inclined to add the compression support, and a lua_example of how to wrap up a GET request to pull it from a server.
You do need to add active flow control as per my FTP example, as the server can overrun an ESP without this on a fast link.
I probably also need a lua_example for adding fall-back code in SPIFFS, so that the ESP can recover if there is a powerfail during recovery. Incidentally you don't need to download any files to do this: you can just write out string.dump(LFS.someCode) to a SPIFFS LC file.
I also want to add the ftpserver and telnet examples.

devsaurus · 2018-06-28T21:45:46Z

Looks good 👍

TerryE · 2018-06-28T22:26:29Z

OK, unless anyone shouts, I will do the merge in ~12 hrs time.

I also need to do the script changes so that Marcel can add support for LFS in cloud builder.

TerryE added 4 commits March 17, 2018 00:43

Turn of x bit on some non-executable source files

3d3eebf

Move luac.cross build into standard make hierarchy

4141e69

Tweaks to the Remote GDB interface to make it usable

e00d927

Alpha working wersion for third party evaluation

4ae52c2

TerryE requested review from jmattsson, pjsg and devsaurus March 17, 2018 01:08

pjsg reviewed Mar 17, 2018

jmattsson reviewed Mar 20, 2018

jmattsson reviewed Mar 21, 2018

jmattsson reviewed Mar 22, 2018

TerryE and others added 2 commits April 19, 2018 16:27

LFS patch updates following review

88bd9e0

Merge branch 'dev' into dev-LFS

6db7414

marcelstoer reviewed May 22, 2018

TerryE added 2 commits June 22, 2018 22:29

LFS patch updates following review II and testing

4f21224

merge current dev to resolve update conflcts in node.c

2ab061f

Fix Travis build failure

17ae6e2

Fix makefile whitespace as per dnc40085 review

a3c2f48

TerryE merged commit 27a83fe into nodemcu:dev Jun 29, 2018

TerryE deleted the dev-LFS branch June 29, 2018 11:21

marcelstoer added this to the next-release milestone Jun 29, 2018

devsaurus mentioned this pull request Nov 16, 2018

Building with Bloom module fails due to undefined SHA256_xxx functions #2556

Closed
