Lua 5.3 Compatible ROTables #2726
Comments
In essence this change involves changing every ROTable declaration in our source base, because at the moment the ROTable is just a Whilst PR #2505 introduced a cached lookaside that removed the need for the VM to do a linear search of the ROTable in Flash for over 95% of table hits, a miss still requires a full table scan. In the case of metamethods like Doing this bulk conversion is something that I want to automate as much as is practical, and we need to have confidence that we haven't screwed up anything in the process. So we will do this change in two steps, plus an optional one:
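The hit/miss behaviour described above (hits served from a small lookaside cache, misses falling back to a full scan) can be sketched roughly as follows. This is not the code from PR #2505: the names `rot_entry`, `cache_slot`, `cached_find`, `CACHE_SLOTS` and the hash function are all hypothetical, and the table is assumed to live in ordinary memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical read-only table entry: a string key and a value. */
typedef struct { const char *key; int value; } rot_entry;

#define CACHE_SLOTS 32u            /* assumed cache size, not a NodeMCU value */

/* One cache slot remembers where a key was last found in a given table. */
typedef struct { const rot_entry *table; size_t pos; } cache_slot;

static cache_slot cache[CACHE_SLOTS];
static unsigned full_scans;        /* counts lookups that needed a table scan */

/* Mix the table address and the key text into a slot index. */
static unsigned slot_for(const rot_entry *table, const char *key) {
  unsigned h = (unsigned)((uintptr_t)table >> 2);
  while (*key) h = h * 31u + (unsigned char)*key++;
  return h % CACHE_SLOTS;
}

const rot_entry *cached_find(const rot_entry *table, size_t n, const char *key) {
  assert(table != NULL && key != NULL);
  cache_slot *s = &cache[slot_for(table, key)];

  /* Cache hit: one probe confirms that the remembered position still matches. */
  if (s->table == table && s->pos < n && strcmp(table[s->pos].key, key) == 0)
    return &table[s->pos];

  /* Otherwise scan the table; a successful scan refreshes the slot. */
  full_scans++;
  for (size_t i = 0; i < n; i++) {
    if (strcmp(table[i].key, key) == 0) {
      s->table = table;
      s->pos = i;
      return &table[i];
    }
  }
  return NULL;   /* table miss: the whole table was read and nothing is cached */
}
```

In this sketch a repeated lookup of a present key costs a single probe, whereas a repeated lookup of an absent key (such as a metamethod name the table does not define) pays for a full scan every time, which is the case the ordered layout is meant to improve.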
I have done the bulk of the edits as a WIP. There are a few funnies such as the UCG declarations which require handcrafted edits.
I suspect that @jmattsson is the only other committer tracking this but just another update dump. The issue of using the eLua The eLua build supported 9 various configuration variants through two macros One piece of smarts that Johny introduced was the compile / link time method of declaring libraries and ROTables. For historic (eLua) reasons these use different declaration mechanisms. I have changed this so that everything now uses a global ROM ROTable. I've got the code morphed and compiling clean. I just need to fill in some of the low-level changes and do the testing now.
This sounds really good!
Yup, I've got both the host and target compiling clean, and I need to start thorough testing on Mon (doing a family visit this long w/e). The idea is that we can switch in the Lua 5.3 engine with little or no further changes to the modules. One simplification that I have made is that we only use char[] keyed ROTables, so there is no point in supporting both numeric and string keys, since this doubles the word probes into Flash that you need to do to find a given key. This is a trivial change to lrotables.h plus a change of about 10 lines in lrotables.c
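To illustrate why dropping mixed keys saves probes, here is a minimal model that counts reads of the key fields during a scan. The struct layouts, the `KEY_*` tags and both function names are hypothetical and only loosely modelled on a tagged-key entry; they are not copied from lrotables.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { KEY_NIL = 0, KEY_STRING, KEY_NUMBER };   /* hypothetical key tags */

/* Mixed-key layout: a tag word says how to interpret the key that follows. */
typedef struct {
  int type;
  union { const char *str; int num; } id;
} tagged_key;
typedef struct { tagged_key key; int value; } tagged_entry;

/* String-only layout: the key pointer is all there is to read. */
typedef struct { const char *key; int value; } string_entry;

/* Scan a mixed-key table, counting key-field reads in *reads. */
static int find_tagged(const tagged_entry *t, size_t n, const char *name,
                       unsigned *reads) {
  assert(reads != NULL);
  for (size_t i = 0; i < n; i++) {
    *reads += 1;                               /* read the type tag */
    if (t[i].key.type != KEY_STRING) continue;
    *reads += 1;                               /* read the key pointer */
    if (strcmp(t[i].key.id.str, name) == 0) return t[i].value;
  }
  return -1;
}

/* Scan a string-only table, counting key-field reads in *reads. */
static int find_string(const string_entry *t, size_t n, const char *name,
                       unsigned *reads) {
  assert(reads != NULL);
  for (size_t i = 0; i < n; i++) {
    *reads += 1;                               /* read the key pointer */
    if (strcmp(t[i].key, name) == 0) return t[i].value;
  }
  return -1;
}
```

For a miss on a table of N string keys, the tagged scan in this model reads 2N key fields and the string-only scan reads N.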
I like the incremental ease-in of this. Should make the eventual 5.3 switch much less risky.
@jmattsson, I've just added a PR. Have a scan and give me your thoughts. Same goes for anyone else interested. I have given this a good exercise both with
@jmattsson Johny has done a code review, which is very much appreciated. In terms of the number of lines changed, this was a biggy, but most of these were batch global edits which, for the current macros, didn't change the actual compiled code in the modules. There were some peephole optimisations to the ROTable scan algo which halved the number of word reads from flash per entry in a ROTable scan. Johny has picked up some points but none are material, especially given the size of the change. I have an SDK 3.0 PR (#2732) that I want to merge first. I will then rebaseline this against the current dev and include the points that Johny picked up. What I would like to do is to get this one merged ASAC, because this makes source changes to every module and will therefore block most other PRs. As I said in the intro post, once we have stabilised this change in dev, the next step is to implement the macro changes to add
@jmattsson I've continued to look at the performance tuning here. What I find is quite counter-intuitive. The linear search based on the first-word comparison is usually faster than the binary chop. The reason is that the linear search is a tight loop doing a word fetch every 4th word in the ROTable to do the scan, and since this is an equality match, little-endian vs big-endian issues are irrelevant. This runs fast. The binary search must do a ROM strcmp() per test and this has a far higher unit cost. So the fastest option seems to be binary chop until the search window is ~10 entries or less and then do a linear scan on this -- not that we have many ROTables with more than 10 entries. The lookaside cache is fast and effective but the major performance hit is actually on cache misses such as if the VM checks for a The bottom line is that ROTable access is now about 10× faster than the 1.5.x versions. I'll push an update soon, and I'll use the same routines for Lua 5.3
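The hybrid strategy described above (binary chop down to a small window, then a linear scan that rejects most entries on a first-word comparison) can be sketched as follows. This is an illustration under stated assumptions, not the NodeMCU routine: `rot_entry`, `first_word`, `rot_find` and `LINEAR_THRESHOLD` are hypothetical names, the table is assumed to be sorted in `strcmp()` order, and it lives in ordinary memory rather than Flash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct { const char *key; int value; } rot_entry;   /* hypothetical */

#define LINEAR_THRESHOLD 10   /* switch to the linear scan at this window size */

/* Pack up to the first 4 bytes of a string into one word, zero padded.
 * Only equality is tested on the result, so byte order does not matter.
 * With word-aligned, padded keys this could be a single 32-bit read
 * (an assumption about the storage layout, not something shown here). */
static uint32_t first_word(const char *s) {
  unsigned char b[4] = {0, 0, 0, 0};
  for (int i = 0; i < 4 && s[i] != '\0'; i++) b[i] = (unsigned char)s[i];
  uint32_t w;
  memcpy(&w, b, sizeof w);
  return w;
}

const rot_entry *rot_find(const rot_entry *t, size_t n, const char *key) {
  assert(t != NULL && key != NULL);
  size_t lo = 0, hi = n;                     /* half-open window [lo, hi) */

  /* Binary chop: one strcmp() per step, so only worth it on large windows. */
  while (hi - lo > LINEAR_THRESHOLD) {
    size_t mid = lo + (hi - lo) / 2;
    int c = strcmp(key, t[mid].key);
    if (c == 0) return &t[mid];
    if (c < 0) hi = mid; else lo = mid + 1;
  }

  /* Linear scan: a cheap word comparison filters out almost every entry,
   * and strcmp() runs only when the first word already matches. */
  uint32_t want = first_word(key);
  for (size_t i = lo; i < hi; i++) {
    if (first_word(t[i].key) == want && strcmp(t[i].key, key) == 0)
      return &t[i];
  }
  return NULL;
}
```

For tables of 10 entries or fewer the chop loop never runs, so the lookup is a pure first-word scan, which matches the observation that few ROTables are larger than that.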
I've done another review cycle because my Lua 5.3 port is in progress. There's another tranche of minor updates here because some of the fields which are functional no-ops in Lua 5.1 are incorrect for Lua 5.3. One minor issue is that perhaps we should rename See my Lua 5.3 port to NodeMCU Firmware for more details.
This is now implemented in our PR #2912, so I am closing this issue.
New feature
The current NodeMCU modules use a ROTable implementation taken directly from the eLua implementation. This is a design compromise that does allow a form of RO Table to be implemented in a Flash memory compatible data structure, but in a way that gives poor runtime performance and uses an address-range test to determine whether a table pointer points to a Table record in RAM or a vector of ROTable entries in Flash. The serial search algorithm for ROTables gives poor access performance, especially for table-miss accesses.
It is going to be very difficult to implement the same approach for Lua 5.3 which has already established an implementation architecture of using record variants for such subtypes.
Justification
So it makes sense to move to a Lua 5.3 compatible ROTable implementation, as this will give performance benefits and this is the main aspect where the Lua C API currently used in our modules will need reworking to be compatible with Lua 5.3. We can therefore prepare all of the modules as a separate PR step ahead of the Lua 5.3 migration itself. There are also performance benefits in doing this now.
Scope and impacts
In our app/modules we have the following numbers of ROTables in our C files: 45 × 1, 15 × 2, 4 × 3, 3 × 4, 1 × 5, 1 × 6 (e.g. one module, net.c, has 6 ROTable declarations). We have another 6 in app/lua and 1 in app/pm. These all need updating to the new format.
The way to do this is to use a special-purpose Python conversion script that will automate the source conversion. So for example this will convert:
into the Lua 5.3 format:
Implementation notes and wrinkles:
__index this involves a table scan which can fail and as a result is currently a full table scan. With this PR ROTables are now implemented as a variant Table record pointing to an ordered luaR_entry vector. The Table record includes a pointer to its metatable and if this is blank then no metatable search is required.
dev-LROTABLE branch to trial this one -- it's not one that I would want to push into dev without multiple testers checking it out.
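The metatable shortcut in the implementation notes can be sketched as follows: a read-only table record holds a pointer to its metatable, and a blank (NULL) pointer means a metamethod lookup returns without scanning anything. The `ro_table` and `rot_entry` layouts and the function names are hypothetical, and a plain linear scan stands in for whatever search the real record uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { const char *key; int value; } rot_entry;   /* hypothetical */

/* Hypothetical read-only table record: entry vector plus metatable pointer. */
typedef struct ro_table {
  const struct ro_table *metatable;   /* NULL when the table has none */
  const rot_entry *entries;
  size_t nentries;
} ro_table;

/* Scan one table's entries for a key, counting the scan in *scans. */
static const rot_entry *scan(const ro_table *t, const char *key,
                             unsigned *scans) {
  *scans += 1;
  for (size_t i = 0; i < t->nentries; i++)
    if (strcmp(t->entries[i].key, key) == 0) return &t->entries[i];
  return NULL;
}

/* Look up a metamethod such as "__index" for table t. */
const rot_entry *find_metamethod(const ro_table *t, const char *event,
                                 unsigned *scans) {
  assert(t != NULL && event != NULL && scans != NULL);
  if (t->metatable == NULL) return NULL;      /* blank: nothing to search */
  return scan(t->metatable, event, scans);
}
```

In this sketch a table without a metatable answers a metamethod query with zero table scans, whereas a layout that stores metamethods alongside ordinary keys has to scan the whole table to discover that the metamethod is absent.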