Upgrade to SDK 3.0 -- update 1 #2689

Closed
TerryE opened this issue Mar 4, 2019 · 63 comments

@TerryE
Collaborator

TerryE commented Mar 4, 2019

Key points from previous discussions

  • Espressif currently state that Espressif SDK 3.0 will be the final release of the non-OS variant of the SDK, and therefore further planned NodeMCU releases will be based on this.
  • As well as various bugfixes and enhancements, SDK 3.0 has 2 major changes that impact the NodeMCU firmware.
    • It now integrates a version of @jmattsson Johny's unaligned exception handler, which enables a lot of the SDK constant data to be moved into flash-mapped memory, thus freeing up some 12Kb extra RAM for application use. Because of this, I have now decommissioned our original version.
    • Note that 3.0 also allows the ICACHE size to be reduced from 32Kb to 16Kb, in theory making another 16Kb RAM available for application use. However, in practice such a small cache only works effectively for small-footprint firmware applications, and NodeMCU performs terribly with this option enabled, so we will not support builds with a reduced 16Kb cache.
    • It exposes a partition table (PT) interface similar to that implemented for the ESP32, and this release of the firmware will make full use of this partition table.
  • We will have dedicated LFS and SPIFFS partitions that are configurable within the PT. NodeMCU keeps a static copy of the PT at a fixed flash offset 0x10000, enabling it to be easily accessed and updated during module provisioning.
  • The firmware addresses these partitions through the PT.
  • Our standard PT configuration moves the 3 small SDK System Partitions (RF_CAL, PHY_DATA and SYSTEM_PARAMETER) into unused flash between the 0x0000 segments and the 0x10000 segment.
  • The associated PR covers the above scope. The following are out of scope but will be covered by subsequent PRs, so discussion of these is currently outside the scope of this issue.
    • Move the TLS certificate store into its own PT partition.
    • Implement Firmware OTA upgrade.
  • The SDK 3.0 version will continue to support the GDB stub. This now seems to work fine but there might be some interoperability issues because our version of the non-aligned exception handler did do exception handler chaining.
  • The SDK 3.0 user_pre_init() entry-point has the ICACHE_FLASH_ATTR qualifier -- that is, unlike the SDK 2.0 user_trampoline code, ICACHE has already been enabled.
  • The SDK 3.0 firmware drops support for flash_detect_size_byte() as this can now be easily set during configuration using esptool.py or any wrapper.

Superseded issues

  • Upgrade to SDK 3.0 #2467. (Previous SDK 3.0 issue) I've agreed with @nwf Nathaniel that it will probably be easier for committers and members to follow discussions if we wrap up the old issue and restart with a clean baseline. Notwithstanding this, the old issue is still worth a browse for background discussion.
@TerryE TerryE self-assigned this Mar 4, 2019
@TerryE
Collaborator Author

TerryE commented Mar 6, 2019

One of the challenges of this patch is that it touches so much of the project -- for example we will need to tweak our builds and change documentation as well as the core functional changes. I will raise a PR which focuses on the core components so that other committers and testers can evaluate this. If any of these contributors want to add supplemental commits, then just ask and I will give you push rights to my fork.

@dtran123

dtran123 commented Mar 6, 2019

I am ready to offer some of my time to test this long awaited feature as I believe it will be a game changer as far as freeing the RAM needed for some important usecases. In my case, it will be easier when the changes are submitted in dev branch for me to test it.

@TerryE
Collaborator Author

TerryE commented Mar 6, 2019

In my case, it will be easier when the changes are submitted in dev branch for me to test it.

Yup, but quite a few members do their builds from dev, so we need to make sure that any merge into dev is fairly complete and consistent.

@HHHartmann
Member

I and maybe others could do a build or two.

@galjonsfigur
Member

galjonsfigur commented Mar 13, 2019

I was trying to test this patch on ESP8285 but I couldn't get it to work. I changed FLASH_4M to FLASH_1M and tried to adjust SPIFFS parameters. After some attempts and error messages about wrong partition sizes or flash read errors (when using wrong flash modes) from ESP8285 I set:

#define FLASH_1M
...
#define LUA_FLASH_STORE 0x0A000
...
#define SPIFFS_FIXED_LOCATION        0x0B3000
#define SPIFFS_MAX_FILESYSTEM_SIZE    0x20000

But the only thing that comes from UART when I reset the board (Wemos D1 mini) is:

boot mode:(3,7)

ets_main.c

I tested various flash modes and I don't think that's the reason for it. I'm quite sure that those parameters are wrong.

Because I enabled SSL support firmware is quite big

   text	   data	    bss	    dec	    hex	filename
 649728	   2740	  30768	 683236	  a6ce4	eagle.app.v6.out

but should fit on 1MB flash. On other ESP8266 with 4MB of flash everything works. If anyone has 1MB ESP8266 or ESP8285 and tested new SDK on it I would be really glad to see how.

@TerryE
Collaborator Author

TerryE commented Mar 13, 2019

I'll do the proper PR today or tomorrow. Sorry guys, but I've been a bit poorly, and this has hit my ability to do useful work. The committers know the backstory. 😒

What we are doing is to remove the SPIFFS and LFS defines in user_config.h as these can now be set directly in the PT. This being said, if you don't update the PT from the compiled in defaults, the start-up assumes sensible defaults.

@TerryE TerryE mentioned this issue Mar 13, 2019
4 tasks
@TerryE
Collaborator Author

TerryE commented Apr 5, 2019

This 1st cut of the SDK 3.0 release has now been released into dev. Contributors please raise any issues if identified. Thank-you.

@dtran123

dtran123 commented Apr 7, 2019

Thanks @TerryE so much! Looking forward to playing with the latest dev build! So excited to test it out and see how it solves some of my current pain points & usecases. Will report any issues (if any)...

@TerryE
Collaborator Author

TerryE commented Apr 7, 2019

Remember to use the -fs switch on esptool.py to set the flash size when you flash the firmware, then you should use the tools/nodemcu-partition.py utility to configure the PT and download any LFS and SPIFFS images.

@dtran123

dtran123 commented Apr 7, 2019

I am unable to build from dev branch. Tried both the docker and the online https://nodemcu-build.com/ methods. Am I doing something wrong ?

The docker method resulted in:

  inflating: ESP8266_NONOS_SDK-3.0/lib/libwps.a
  inflating: ESP8266_NONOS_SDK-3.0/lib/readme.md
  inflating: ESP8266_NONOS_SDK-3.0/lib/strip_libc_funcs.txt
  inflating: ESP8266_NONOS_SDK-3.0/lib/strip_libgcc_funcs.txt
PRUNE libmain.a libc.a
make: xtensa-lx106-elf-ar: Command not found
make: *** [/opt/nodemcu-firmware/sdk/.pruned-3.0] Error 127
make: Leaving directory `/opt/nodemcu-firmware'

The online build method resulted in:

Sorry, your NodeMCU custom build failed. The maintainer of this site was notified but you may be able to fix it yourself.
Take a look at the FAQ https://nodemcu-build.com/faq.php#build-failure and https://nodemcu-build.com/faq.php#boot-failure.

@TerryE
Collaborator Author

TerryE commented Apr 7, 2019

@dtran123 quite possibly not. The make works in a linux environment, but adding the PT has changed what is configured through user_config.h and also added an updated toolchain. It looks like these changes have broken the end-to-end integration with the cloud builder variant builds. I will look into these.

This sort of thing can happen after such a change. That's why we do the push immediately after a dev->master drop so the fall back is to use the master version whilst we fix this.

@TerryE
Collaborator Author

TerryE commented Apr 7, 2019

Tried the online https://nodemcu-build.com/ method

@dtran123 what was the Travis build number for the failed build (something like #25nnnn) so I can see detailed log and see why it failed? Thanks

@dtran123

dtran123 commented Apr 8, 2019

How do I check the Travis build number ? The "failed build" email doesn't share any build number.

@TerryE
Collaborator Author

TerryE commented Apr 8, 2019

The date timestamp of the email would help. Thanks

@dtran123

dtran123 commented Apr 8, 2019

I tried 3 times before abandoning...
The 3 builds failed at
Apr. 7 at 12:48 a.m.
Apr. 7 at 12:59 a.m.
Apr. 7 at 1:05 a.m.
As per my yahoo account. Note that I am in EST timezone.

I guess the nodemcu automated build system should probably add the Travis build number to help future troubleshooting.

@marcelstoer
Member

The cloud builder is now handling this correctly. Sorry the changes were merged before the dev-friendly tools were ready for them.

@dtran123

dtran123 commented Apr 9, 2019

Thanks. It works now. Will be testing this new build in the next few days...
Nice to see 58K instead of the usual 43K of heap.

NodeMCU custom build by frightanic.com
branch: dev
commit: 5a6992c
SSL: true
modules: adc,file,gpio,net,node,rtcmem,sntp,tmr,uart,websocket,wifi,tls
build created on 2019-04-09 22:45
powered by Lua 5.1.4 on SDK 3.0.0(d49923c)
lua: cannot open init.lua

print(uart.setup(0, 115200, 8, 0, 1, 1 ))
115200

Communication with MCU...
Got answer! AutoDetect firmware...

NodeMCU firmware detected.
=node.heap()
58288

@TerryE
Collaborator Author

TerryE commented Apr 10, 2019

It is even more with LFS including dummy_strings.lua.

@TerryE
Collaborator Author

TerryE commented Apr 10, 2019

I've gone through this with Marcel offline on a 1-1 and we have decided that we should not mandate the use of the partition tool. To this end, I will be raising a follow up PR to add the functions node.getpartitiontable() and node.setpartitiontable() so the (primarily Windows) developers have the option of imaging the ESP with a standard cloud builder image and then interactively calling node.setpartitiontable() on first boot to set up the partition table. Note that initially only entries for lfs and spiffs will be supported.

@dtran123

I welcome this idea very much. This would reduce the barrier to entry. Also, in my opinion, LFS should be enabled by default with a default size that covers the majority of cases (e.g. 64K or 128K). This way, most people can start right away using it...increasing adoption rate. And if the size changes, maybe a node.lfs_size() something like that could be available to change upon reboot or on the fly. Just a thought. Unless this could be part of node.setpartitiontable().

@HHHartmann
Member

HHHartmann commented Apr 11, 2019

I would also welcome this.
What would the bootstrap look like? There should be at least a SPIFFS partition to store files to run the configuration. Or should node.setpartitiontable() be called via serial?
I also would prefer to have an LFS partition initially.

After node.setpartitiontable() I would have to reformat SPIFFS and/or reflash LFS. So in these cases I would need serial communication anyways.

Sorry for asking, but I couldn't find it in the code or documentation: Is there a way (already) to configure the PT at build time (in docker) and get it flashed with existing tools (which are not part of the 3.0 approach)?

@TerryE
Collaborator Author

TerryE commented Apr 11, 2019

Is there a way (already) to configure the PT on build time (in docker) and get it flashed with existing tools (which are not part of the 3.0 approach)?

The PT only exists on SDK 3.0 builds so no existing tool has full support for PTs. You can use esptool.py to do the actual writing to flash, but this doesn't understand the flash format, so you'd need to write a wrapper around esptool to do this (or alternatively modify the 0x00000.bin + 0x10000.bin to combined.bin conversion). If you look at the source of the partition tool, all it is is an esptool wrapper:

./tools/nodemcu-partition.py -h
nodemcu-partition.py V1.0
usage: nodemcu-partition.py  [-h] [--port PORT] [--baud BAUD] [--lfs-addr LA] [--lfs-size LS]
              [--lfs-file LF] [--spiffs-addr SA] [--spiffs-size SS]
              [--spiffs-file SF]

nodemcu-partition.py V1.0 - ESP8266 NodeMCU Loader Utility

optional arguments:
  -h, --help               show this help message and exit
  --port PORT, -p PORT     Serial port device
  --baud BAUD, -b BAUD     Serial port baud rate used when flashing/reading
  --lfs-addr LA, -la LA    (Overwrite) start address of LFS partition
  --lfs-size LS, -ls LS    (Overwrite) length of LFS partition
  --lfs-file LF, -lf LF    LFS image file
  --spiffs-addr SA, -sa SA (Overwrite) start address of SPIFFS partition
  --spiffs-size SS, -ss SS (Overwrite) length of SPIFFS partition
  --spiffs-file SF, -sf SF SPIFFS image file

There is no reason why we couldn't move its use into a make flash target.

I also would prefer to have an LFS partition initially.

If you look at user_main.c:setup_partition_table(), then you will see that if the partition table is at the build default, then by default a 64Kb LFS partition is allocated immediately following the irom0 segment, and the remainder of the flash is a SPIFFS partition. Hence the chip will boot with some sensible defaults.
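
The default allocation described above can be sketched in a few lines of Python. This is only an illustration, not the code in user_main.c: the function name and the `align` parameter are hypothetical, and rounding the LFS start up to a 64Kb boundary is an assumption inferred from a boot log posted later in this thread (IROM0 at 0x10000 with length 0x7c000, LFS base 0x90000, SPIFFS base 0xA0000 on a 4MB flash).

```python
def default_partitions(irom0_base, irom0_size, flash_size,
                       lfs_size=0x10000, align=0x10000):
    """Return (lfs_base, lfs_size, spiffs_base, spiffs_size) for the defaults.

    Hypothetical helper: a 64Kb LFS partition after irom0, then SPIFFS
    taking whatever flash is left.
    """
    irom0_end = irom0_base + irom0_size
    # Assumption: the LFS start is rounded up to the next `align` boundary.
    lfs_base = (irom0_end + align - 1) // align * align
    spiffs_base = lfs_base + lfs_size
    spiffs_size = flash_size - spiffs_base
    if spiffs_size <= 0:
        raise ValueError("no room left for a SPIFFS partition")
    return lfs_base, lfs_size, spiffs_base, spiffs_size


# Figures taken from the 4MB boot log later in the thread.
layout = default_partitions(0x10000, 0x7C000, 4 * 1024 * 1024)
print([hex(v) for v in layout])
```

With those inputs the sketch reproduces the LFS and SPIFFS base and size values printed in that log, which suggests the alignment assumption is at least plausible for that build.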

Should node.setpartitiontable() be called via serial?

Either you will be using a provisioning tool such as esptool.py and nodemcu-partition.py (and this last really makes imaging LFS and SPIFFS partitions easy), or you won't, in which case you pretty much have to work initially at the interactive prompt since the LFS and SPIFFS partitions will be uninitialised. In this second case, pretty much your first step interactively should just be to enter a set partition table call manually if you want a different default, and let the routine reboot the firmware to load the new firmware before trying to initialise the LFS and SPIFFS.

The 4Kb page starting at 0x10000 is special in that it is treated almost as a separate partition in its own right. The build and nodemcu-partition create the PT at its head, but the rest is set to 0xFF and is effectively blank from a flash NAND perspective.

  • We can adopt the erase-and-overwrite model which is simple but has a 10 mSec or so vulnerability to powerfail which would require the flash to be reimaged via the serial port.
  • Or we could treat this as a write-once message pad for boot-to-boot system configuration. Each message would be word aligned and have a 2 byte header: a size byte and a type byte:
    • Type 0x00. Deleted
    • Type 0xff. Unused (acts as EOF)
    • Type 0x01. Partition table
    • Type 0x02-0xfe. TBD, but could include commands like firmware reflash.

With this second format the node.setpartitiontable() function would not erase the PT page at all but rather simply add a new version of the PT and then mark the old one as deleted before rebooting the ESP. Both of these operations would work with a simple flash write because of the NAND flash rules. OK, we would only have room for some 60-100 reconfiguration messages before we have to erase the page for GC, but most live systems will never need this number, so the risk is minimal.
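
To make the write-once idea concrete, here is a small Python simulation. It is a sketch under stated assumptions rather than a proposed implementation: the header byte order (size first, then type), the meaning of the size byte (payload length only) and all function names are hypothetical. The only rule taken from the discussion is that a write without erase can clear bits but never set them, which is modelled as a bitwise AND.

```python
PAGE_SIZE = 4096
T_DELETED, T_PT, T_UNUSED = 0x00, 0x01, 0xFF


def flash_write(page, offset, data):
    """Simulate a write without erase: bits can only go from 1 to 0."""
    for i, b in enumerate(data):
        page[offset + i] &= b


def padded(size):
    """Length of a 2-byte header plus payload, rounded up to a 4-byte word."""
    return (2 + size + 3) & ~3


def records(page):
    """Yield (offset, size, type) for each message up to the 0xFF end marker."""
    off = 0
    while off + 2 <= len(page) and page[off + 1] != T_UNUSED:
        size = page[off]
        yield off, size, page[off + 1]
        off += padded(size)


def append_pt(page, payload):
    """Append a new PT message, then mark older PT messages as deleted."""
    existing = list(records(page))
    end = existing[-1][0] + padded(existing[-1][1]) if existing else 0
    if len(payload) > 0xFF or end + padded(len(payload)) > len(page):
        raise RuntimeError("no room: the page must be erased first")
    # Write the new version first so a power failure never leaves zero PTs.
    flash_write(page, end, bytes([len(payload), T_PT]) + payload)
    for off, _, typ in existing:
        if typ == T_PT:
            flash_write(page, off + 1, bytes([T_DELETED]))


def current_pt(page):
    """Return the payload of the last live PT message, or None."""
    live = [(off, size) for off, size, typ in records(page) if typ == T_PT]
    if not live:
        return None
    off, size = live[-1]
    return bytes(page[off + 2:off + 2 + size])


page = bytearray(b"\xff" * PAGE_SIZE)      # a freshly erased page
append_pt(page, b"\x01" * 10)
append_pt(page, b"\x02" * 10)
print([typ for _, _, typ in records(page)], current_pt(page))
```

Because the new record is written before the old one is marked deleted, an interrupted update leaves two live PT records, and a reader that takes the last live one still ends up with a valid table.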

@joysfera
Contributor

FYI, testing the dev branch with SDK 3.0: My application running on the new firmware is having difficulties connecting to my WiFi AP. Never happened on previous SDKs.

When it manages to connect (restarting my AP helped) then I can see about 10 kB of extra free RAM and that's absolutely awesome because my app tends to crash on current master SDK due to lack of RAM rather frequently.

Yet I experience a lot of frequent unexpected restarts even here on SDK 3.0. Not sure what causes them, as there's no message at all before it restarts.

@HHHartmann
Member

If I get it right I could modify partition_init_table in user_main.c to get a PT as I need it.
Then combine 0x00000.bin and 0x10000.bin as it is already done in the docker build (and online build service) and then use the nodemcu py flasher GUI tool to flash it (as I do now too).

I remember having read an idea to use a configuration file to configure the PT at compile time.
I can't remember whether it was a csv or json file.

About the incremental partition table:

  • can the SDK handle this? does it need to read it?
  • can we use two pages? one with current information and the other one then can be erased safely.
    The page would have a marker at the start. FF meaning just being erased, 01 - FE in Use and 00 completely filled-> use other page and erase.
    If you combine both approaches and have a blindspot of 10 mSec to erase the block after 60 reconfigurations, that should be quite fine.

@TerryE
Collaborator Author

TerryE commented Apr 11, 2019

@joysfera Petr, The SDK 3.0 has moved a lot of previous RAM const data into Flash which is where the extra RAM comes from. I am going to have to hammer this a bit. If you can isolate failures and create any test cases then post them as separate issues. Thanks.

@TerryE
Collaborator Author

TerryE commented Apr 11, 2019

@HHHartmann Gregor, don't modify user_main.c. If you want to modify it on a case by case basis, then do it when you combine the 0x00000.bin and 0x10000.bin. Use a Python script for this and use my partition tool as a starting point. If you are manipulating files then you don't even need to call esptool at this stage.
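
As a rough illustration of the file-level approach, the combine step itself needs nothing more than padding. The sketch below is an assumption about how such a script could look, not the docker build's actual code: the function name is hypothetical and padding the gap with 0xFF (the erased-flash value) is assumed.

```python
def combine_images(first_image: bytes, second_image: bytes,
                   second_offset: int = 0x10000) -> bytes:
    """Join the 0x00000 and 0x10000 images into one flat image.

    Hypothetical helper: the gap between the end of the first image and
    second_offset is filled with 0xFF, the value of erased flash.
    """
    if len(first_image) > second_offset:
        raise ValueError("first image overlaps the second image's offset")
    padding = b"\xff" * (second_offset - len(first_image))
    return first_image + padding + second_image


# Dummy data standing in for the contents of 0x00000.bin and 0x10000.bin.
combined = combine_images(b"\x01" * 100, b"\x02" * 50)
print(hex(len(combined)))
```

Any per-device PT changes would then be applied to the bytes at the head of the second image before joining; that step depends on the PT record format and is deliberately left out here.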

You don't need two pages. The NAND logic rules are quite straightforward. Have a look at the user_main.c code. It scans the flash copy and builds one in RAM. It is this RAM version that is registered with the SDK.

@NicolSpies

@TerryE, like previous game changers, the SDK 3.0 Upgrade works great out of the box using the Docker build approach.
I made a small, temporary modification in the user_main.c to set the LFS default size to 128K for testing. With LFS and dummy_strings.lua a heap of 59456 is reported with the lua application running exactly as before the upgrade. 🚀 🥇

@devsaurus
Member

neither Arduino core nor NodeMCU is based on the 3.0 release from last August but on a later revision.

Not quite. Current Nodemcu dev pulls the SDK release zip which dates back to August.
Such method can't catch up with Espressif's concept of pushing fixes to a release branch :(

@marcelstoer
Member

Oops, sorry. I better keep my mouth shut then. I really start losing track of things around here. Sorry again.

Such method can't catch up with Espressif's concept of pushing fixes to a release branch

Not good.

@TerryE
Collaborator Author

TerryE commented Apr 17, 2019

We've had 4 years of tweaking runtime performance to minimise the runtime overheads of the exception handler (e.g. Phillip's -Os -> -O2 switch and my peephole asm() macros). I suspect that Espressif might not have done this yet. All part of the learning curve.


Not quite. Current Nodemcu dev pulls the SDK release zip which dates back to August. Such method can't catch up with Espressif's concept of pushing fixes to a release branch

Yes and No. Our master Makefile includes a hardwired SDK_FILE_SHA1 and SDK_FILE_VER which by convention are obtained from the Release page URIs, so 029fc23fe87e03c9852de636490b2d7b9e07f01a is the SHA1 of this release

However, GitHub makes these ZIP files available for any intermediate commit; for example, 4925f83a524342e954a778e3fe9014bc129cc943 is the corresponding SHA1 of the 4925f83 tree, which was the latest commit on Dec 27th. Note that the ZIP's root directory for a commit is ESP8266_NONOS_SDK-<SHA1> instead of ESP8266_NONOS_SDK-<Release> as in the case of a release.

There is nothing to stop us evaluating any given (e.g. the latest) SDK 3.0 commit with minor tweaks to our Makefile. The build would still be well determined, all thanks to GitHub. 😄
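
A pinned hash is what keeps such a build well determined. The sketch below shows the general idea only and makes two assumptions: that SDK_FILE_SHA1 is compared against a SHA1 digest of the downloaded ZIP, and that the root directory name follows the ESP8266_NONOS_SDK-<ref> pattern described above. Both function names are hypothetical and nothing here is taken from the Makefile itself.

```python
import hashlib


def zip_matches_pin(zip_bytes: bytes, expected_sha1: str) -> bool:
    """True if the downloaded archive hashes to the pinned SHA1 (assumed check)."""
    return hashlib.sha1(zip_bytes).hexdigest() == expected_sha1.lower()


def sdk_root_dir(ref: str) -> str:
    """Root directory inside the ZIP: ref is a release name or a commit SHA1."""
    return "ESP8266_NONOS_SDK-" + ref


# Demonstration with a stand-in payload rather than a real SDK archive.
payload = b"abc"
pin = hashlib.sha1(payload).hexdigest()
print(zip_matches_pin(payload, pin), sdk_root_dir("3.0"))
```

Switching between the release and a later commit would then amount to changing the pinned values and the expected root directory together.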

@TerryE
Collaborator Author

TerryE commented Apr 17, 2019

Thinking about this, perhaps we might include a make switch so that the builder can decide whether to select the Release SDK or the latest (as at the Makefile). That way @joysfera Petr et al. can compare and evaluate the performance of both.

I also note that Espressif use their master a bit like we use dev; there have been over 100 extra commits post the last release/v3.0.0 branch commit. I have no idea how stable picking one of these would be, but this might also be worth adding as make option.

All of this work on the non-OS SDK sort of goes against the Espressif statement that 3.0.0 is the last SDK release. Doesn't it?

@marcelstoer
Member

I just created #2729 to discuss that aspect separately.

@pjsg
Member

pjsg commented Apr 19, 2019

I'm trying to get a dev build to actually work. I'm using the docker image to build and it spits out a combined image. I'm following the instructions at https://nodemcu.readthedocs.io/en/dev/flash/ to flash the combined image. I get the following:

system SPI FI size:4, Flash size: 4194304
LFS base: 00090000
LFS size: 00010000
SPIFFS base: 000A0000
SPIFFS size: 00360000
Writing Init Data to 0x0000c000
 0: 00000065 00000000 0000b000
 1: 00000004 0000b000 00001000
 2: 00000005 0000c000 00001000
 3: 00000006 0000d000 00003000
 4: 00000066 00010000 0007c000
 5: 00000067 00090000 00010000
 6: 0000006a 000a0000 00360000
boot not set
ota1 not set
ota2 not set
>> 0xc,0xc000rf_cal[0] !=0x05,is 0x00

I've tried erasing the flash but it doesn't help. I've tried adjusting the

#define SPIFFS_MAX_FILESYSTEM_SIZE        0x80000

but it doesn't seem to have any effect on what the boot prints out. Also, you need to remove the parens around the value as otherwise you get errors in the tools/Makefile -- but that doesn't seem to be relevant.

However, when I use the cloud builder, then I get a dev image that boots. Is the cloud builder using the current version of the docker image marcelstoer/nodemcu-build? @marcelstoer

Puzzled....

@TerryE
Collaborator Author

TerryE commented Apr 19, 2019

 0: Eagle ROM 00000065 00000000 0000b000
 1: RF Cal    00000004 0000b000 00001000
 2: Phys Data 00000005 0000c000 00001000  (Init.data has been written here)
 3: Sys Param 00000006 0000d000 00003000
 4: IROM0     00000066 00010000 0007c000
 5: LFS       00000067 00090000 00010000   (Default 64K)
 6: SPIFFS    0000006a 000a0000 00360000  (Use rest of Flash)

All looks normal, yet you are getting the 0xc000 rf_cal[0] !=0x05,is 0x00 error, which is what happens when the init.data hasn't been written. Hummmn. Would need to add a few os_printf statements into user_main to work out what is going on. I've got an updated version coming out tomorrow.
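
For anyone wanting to sanity-check a boot log like this one mechanically, a short Python sketch follows. It assumes the three hex columns are type, start address and length (which is how the annotated listing above reads), takes the type names from that listing, and uses hypothetical function names.

```python
import re

# Type IDs and names as annotated in the listing above.
TYPE_NAMES = {0x65: "Eagle ROM", 0x04: "RF Cal", 0x05: "Phys Data",
              0x06: "Sys Param", 0x66: "IROM0", 0x67: "LFS", 0x6A: "SPIFFS"}

PT_LINE = re.compile(
    r"^\s*\d+:\s+([0-9a-f]{8})\s+([0-9a-f]{8})\s+([0-9a-f]{8})\s*$", re.I)

BOOT_LOG = """\
 0: 00000065 00000000 0000b000
 1: 00000004 0000b000 00001000
 2: 00000005 0000c000 00001000
 3: 00000006 0000d000 00003000
 4: 00000066 00010000 0007c000
 5: 00000067 00090000 00010000
 6: 0000006a 000a0000 00360000
"""


def parse_pt(log):
    """Return a list of (name, start, length) tuples from the boot output."""
    entries = []
    for line in log.splitlines():
        m = PT_LINE.match(line)
        if m:
            typ, addr, size = (int(g, 16) for g in m.groups())
            entries.append((TYPE_NAMES.get(typ, hex(typ)), addr, size))
    return entries


def overlaps(entries):
    """Return pairs of partition names whose address ranges overlap."""
    ordered = sorted(entries, key=lambda e: e[1])
    return [(a[0], b[0]) for a, b in zip(ordered, ordered[1:])
            if a[1] + a[2] > b[1]]


entries = parse_pt(BOOT_LOG)
print(overlaps(entries), hex(entries[-1][1] + entries[-1][2]))
```

On this log the check finds no overlapping partitions and the SPIFFS partition ends at 0x400000, the end of a 4MB flash, which agrees with the "all looks normal" reading.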

The whole SPIFFS image make needs reworking.

@pjsg
Member

pjsg commented Apr 19, 2019

I figured it out. For some reason I was setting DEBUG=1 in my environment before running the build. I was a bit suspicious when I realized that my build was significantly bigger than the build that the cloud builder produced.

Why this is a problem is left as an exercise for another day.

@NicolSpies

NicolSpies commented Apr 20, 2019

@TerryE , @marcelstoer , under SDK 3.0, the espconn_secure_send bug where it is only possible to send once in a secure connection, is back.

This bug is detailed in espressif/ESP8266_NONOS_SDK#10.

This bug manifests when http.get is used to send messages to the Telegram API where only the first message is successfully delivered and subsequent message delivery fails.

This was verified by rolling back to the last commit b6cd2c3e dated 5 April before the SDK3.0 commit 9a471079 of 5 April. Using the pre SDK3.0 commit, the first and subsequent messages are delivered without failure.

@TerryE
Collaborator Author

TerryE commented Apr 22, 2019

I am having fun with this SDK!! I've got a recent test build where the boot sequence (with some diagnostics) goes:

 ets Jan  8 2013,rst cause:2, boot mode:(3,7)

load 0x40100000, len 30932, room 16 
...
boot not set
ota1 not set
ota2 not set
Partitions successfully registered
>> 0xc,0xc  <crap due to a baudrate change><then back to readable at 74880 baud) 

NodeMCU 2.2.0.0 build unspecified powered by Lua 5.1.4 on SDK 3.0.0(d49923c)
lua: cannot open init.lua
>

but the Lua RTS is set to output at 115200 not 74880. WTF??? Then I come across an issue on the Arduino forum where the poster claims that the new SDK seems to be dropping the clock frequency by ⅔ under some circumstances. So I try setting the miniterm baudrate to 49800 (⅔ × 115200) and reboot. Now the <crap due to a baudrate change> is readable and everything else scrambled from baudrate mismatch:

rf cal sector: 11
freq trace enable 0
rf[112] : 00
rf[113] : ff
rf[114] : ff

SDK ver: 3.0.0(d49923c) compiled @ Aug 22 2018 13:50:05
phy ver: 1136_0, pp ver: 10.2

So somewhere in the SDK call_user_start_local() code it is indeterminately resetting the clock frequency 😒

PS

That's on a Wemos D1 mini pro where the serial-to-usb chip supports this clock speed. The same code on the Wemos D1 mini drops the USB serial causing miniterm to crash out because its (older) serial-to-usb chip doesn't.

@TerryE
Collaborator Author

TerryE commented Apr 22, 2019

Uuuaaaarrrh! Found it. Amongst the rewrite of the user_pre_init() I made a small tidy-up of how I declared the init_data (moving it out of IRAM and into flash):

extern const uint32_t init_data[], init_data_end[];
#define INIT_DATA_SIZE (init_data_end - init_data)
__asm__(
  ".align 4\n"
  "init_data: .incbin \"" ESP_INIT_DATA_DEFAULT "\"\n"
  "init_data_end:\n"
);

The error is that INIT_DATA_SIZE is in units of sizeof(uint32_t) and so the code was only initialising the first 32 bytes of the init_data and not the full 128 bytes, leaving the last 96 bytes as 0xFF. For some reason internal to the processing of the init data, these incorrect initialisation parameters caused the SDK to drop the CPU frequency silently. Sigh. Add the extra *sizeof(uint32_t) and problem solved. What a bastard.
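
The units mix-up is easy to reproduce away from the hardware. The Python sketch below only models the arithmetic: subtracting two uint32_t pointers yields a count of 4-byte words, so passing that count to a routine that expects bytes writes a quarter of the data. The buffer contents and the helper name are stand-ins, not the firmware's code.

```python
WORD = 4  # sizeof(uint32_t) on the ESP8266


def write_init_data(flash: bytearray, init_data: bytes, nbytes: int) -> None:
    """Copy the first nbytes of init_data into an erased flash sector."""
    flash[:nbytes] = init_data[:nbytes]


init_data = bytes(range(128))             # stand-in for the 128-byte blob
size_in_words = len(init_data) // WORD    # what the pointer subtraction yields

buggy = bytearray(b"\xff" * 128)          # erased sector
write_init_data(buggy, init_data, size_in_words)          # count used as bytes

fixed = bytearray(b"\xff" * 128)
write_init_data(fixed, init_data, size_in_words * WORD)   # converted to bytes

print(size_in_words,
      bytes(buggy[32:]) == b"\xff" * 96,
      bytes(fixed) == init_data)
```

In the buggy case only the first 32 bytes are written and the remaining 96 stay at the erased value 0xFF, which matches the symptom described above.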

@TerryE
Collaborator Author

TerryE commented Apr 23, 2019

I am really looking for feedback from users such as @NicolSpies @joysfera @HHHartmann @dtran123 etc. and the committers. I mentioned the nodemcu-partition utility above and I find this syntax really easy to use as I just flash the firmware then use the utility to set the LFS and SPIFFS partitions and preload them if needed, for example:

tools/nodemcu-partition.py -ls 64K -ss 128K -lf bin/lfs.img

So what I want to do is to allow users the option to list and to update the PT through a couple of node library functions to get/set the partition table. To be honest, I will never use this since I have python on my laptop and it's trivial to run the above script, but it has been suggested that Windows users might want to play around with some partition placing/size through the node API.

  • Do I give a function to get the full partition table or just the LFS & SPIFFS partition info?
  • Do we limit the write ability to just the LFS and SPIFFS sizes or do we allow a more general rewrite?

My instinct is to keep this API as simple as possible: read or write the LFS or SPIFFS partition size and that's it. My rationale is that you need to know what you are doing to do more, and it would be really easy to crater your firmware image. Anyone who needs to do more complex stuff should use the python tool.

Feedback please.

@jmattsson
Member

Limited partition access through a node api would seem a sensible, safe, first step. If there's demand we can always make it more powerful (but maybe keep that option in mind when designing the api to make it easy to do so).

@NicolSpies

I agree with the limited access using the node API for basic functionality. I like the approach of different ways of access dependent on the level of expertise. As a Windows user I have not used python yet but, as with Docker, I am willing to venture out and learn to have more control if required. Scalability of the API with a "designed for but not fitted with" approach makes sense.

@HHHartmann
Member

First I thought it would be nice to also size the firmware partition. But then there is no way to write it without pyloader if I am not mistaken. So that makes no sense.
I'm curious and would be interested to read the complete PT. But that's not essential and might leave more questions than answers.
Maybe also keep in mind SPIFFS2 and LFS2 partitions in the api design.

@joysfera
Contributor

joysfera commented Apr 23, 2019

My feedback: we need to be able to upgrade the NodeMCU firmware from within running Lua application. If it requires access to the full PT (so that one can decide which firmware will boot at the next reboot, or say can resize a partition to upgrade to a larger firmware than originally anticipated) then full PT API would be needed. If, however, there was a complete NodeMCU API support for handling all firmware upgrade needs then a simplified PT API just for FS and LFS might suffice.

@TerryE
Collaborator Author

TerryE commented Apr 23, 2019

Thanks for the feedback. Specific comments:

  • I think that the target user group is Windows developers that use the cloud builder to get their images, and they don't want to get into docker or Linux subsystems / VMs. Note that if I restore the ability to choose a preferred default LFS and/or SPIFFS size in the cloud builder options, then rerunning the build and reimaging will be the simplest approach for them. Using the node API interactively would be next best.
  • @HHHartmann, I assume that most of these users use Pyloader.
  • @joysfera, the whole issue of adding full OTA upgrade is a new ball game. A new firmware will typically have a different size and so crater the LFS and SPIFFS partitions unless headroom has been added above the IROM0 partition to allow for expansion, which is of course why you have the option to specify their start addresses.

@joysfera
Contributor

@TerryE it is a new ball game but still related to PT (or rather PT is related to the firmware upgrade) so that's why I mentioned it. If in the future the new ball comes with its own support for PT related tasks then use the simplest approach for now.

@dtran123

dtran123 commented Apr 23, 2019

We should take baby steps and, as per @TerryE's suggestion, reading or writing the LFS or SPIFFS partition size is good enough for now in my opinion. Arguably that will cover 90% of usecases and lower the barrier of entry for most "casual" users. For anything more involved, the user has other options. For OTA upgrades, as long as the current approach/architecture is in line with future considerations, it should be fine.

BTW, I have been testing this new dev branch with SDK3.0...and happy to say that regression testing is going well so far. I have also been testing TLS connection scenarios with certificates ...so far it is working except for the scenarios involving CA certificate verification (tls.cert.auth(true) works fine, but with tls.cert.verify(true) as well, it fails...maybe due to demanding RSA, but I am not sure at this point if tls.cert supports elliptic ciphers, as the code appears to only support RSA; I will raise this in another thread/PR). My main concern with the new SDK3.0 was the latency causing possible new timeouts on TLS handshakes and cert verification.

@NicolSpies

@dtran123, It will be great if you can also confirm the http.get bug under SDK3.0 that I mentioned 3 days ago in this thread.

@dtran123

I usually use mainly MQTT. For HTTP REST API, I have been using basic TCP connections because so far it has served me well due to the memory hungry HTTP implementation but also the HTTP library doesn't handle large tokens very well (tokens > 1.5K in size). Now with LFS, I might try HTTP library again.

@TerryE
Collaborator Author

TerryE commented Jul 21, 2019

I have decided to close this since the SDK as built seems to be stable and we've got to the point where we can't practically revert to a 2.2.1 version. Any of the commenters here are welcome to open specific issues going forward.

@TerryE TerryE closed this as completed Jul 21, 2019
@MaBecker

It now integrates a version of @jmattsson Johny's unaligned exception handler which enables a lot of the SDK constant data to be moved in flash mapped memory and thus freeing up some 12Kb extra RAM for application use.

Hi TerryE,

can you please share where you found this.

@TerryE
Collaborator Author

TerryE commented Jul 22, 2019

@MaBecker, on another thread I posted that this was a phantom saving. The version introduces an extra memory allocation mode (which doesn't work with our type of build) and this mis-reports free heap. The free memory is pretty much the same as 2.2.1.

@MaBecker

MaBecker commented Jul 22, 2019

Thanks for your quick response! I tried to implement Johny's unaligned exception handler for Espruino, but have not been successful.

And if this was part it would have been extremely cool.

@TerryE
Collaborator Author

TerryE commented Jul 22, 2019

Getting Espruino working on an esp8266 will be a bit of hard work. Maybe the esp32 would be a better starting point, but that's outside the scope of this list, eh?

@MaBecker

Sorry @TerryE, it is running very stable on ESP8266.

You should give it a try ;-)
