Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tie in the EGC with the SDK's heap knowledge. #2319

Merged
merged 1 commit into from
Apr 6, 2018

Conversation

jmattsson
Copy link
Member

Added node.egc.meminfo() to expose LVM usage (to make the regular
node.egc.ON_MEM_LIMIT option usable).

Extended the node.egc.ON_MEM_LIMIT option to also take negative limits,
in which case that's taken as a request to keep a certain amount of heap
available for non-Lua use.

  • This PR is for the dev branch rather than for master.
  • This PR is compliant with the other contributing guidelines as well (if not, please describe why).
  • I have thoroughly tested my contribution.
  • The code changes are reflected in the documentation at docs/en/*.

This came up in the discussion with Terry last week as something we want in mainline. Since we already have this here at $work it seemed worthwhile to submit it. I have no doubt that Terry with his knowledge of the GC might come up with a more elegant approach. Until such a time however, this patch provides a couple of useful features with minimal changes. Using the ON_MEM_LIMIT option with a negative limit avoids a lot of GC overhead and at worst degenerates to the same performance as ALWAYS.

Marcel, the docs phrasing could be improved - I wrote that for internal consumption here at $work, so probably not as clear & friendly as it could be.

Added `node.egc.meminfo()` to expose LVM usage (to make the regular
`node.egc.ON_MEM_LIMIT` option usable).

Extended the `node.egc.ON_MEM_LIMIT` option to also take negative limits,
in which case that's taken as a request to keep a certain amount of heap
available for non-Lua use.
@jmattsson jmattsson requested review from marcelstoer and TerryE March 26, 2018 02:24
Copy link
Member

@marcelstoer marcelstoer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing wrong with the documentation - clear & friendly in my opinion.

@TerryE
Copy link
Collaborator

TerryE commented Mar 26, 2018

Johnny this is one aspect of the tweak that we discussed and thanks for this. Of the three modes

  • ALWAYS is the most robust but causes your ESP to run like a dog.
  • ON_ALLOC_FAILURE is a leave it to the last minute option -- which would be OK if the libraries all used the Lua allocator or at least called the Lua GC before allocating memory.
  • ON_MEM_LIMIT has two failings, the first of which is addressed by this version of the patch, which is that the parameter refers to the memory used rather than the memory remaining, and you fix this here. Can I suggest that we also change the limit from a hard one to a soft one? That is at the moment, the GC only runs if the available heap crosses a threshold but then fails if it ca't free up across the threshold. I see that a more robust mode would be switch into ALWAYS mode until we cross the threshold.

So by way of example the GC doesn't bother to collect garbage until we only have 4kb remaining, say. Normally, this will free up lots of memory so the collector wil again back off for a period. However, if the free is still less than 4K it continues to collect on every allocation event until the free level falls to below 4K or we get a hard memory exhaustion.

There's another reason for the soft fail: because of the three-colour marking system, it takes two sweeps to guarantee recovering storage.


What isn't well documented is how the Lua GC works. it does so in one of two modes: full and conditional, initiated by luaC_fullgc () and luaC_checkgc () respectively.

  • luaC_fullgc() if initiated by the collectgarbage() function with a nil or "collect" argument. The other paths is view the eLua EGC patch which initiates full collection on specfic Emergency alloc events according to the EGC mode parameter as discussed above.
  • luaC_checkgc () events are scatters through the Lua RTS and pace the normal collection process.

My point is the Lua carries out GC anyway ,and these parameters really dictate what extra to do in emergency. As far as the standard functionality goes to quote the manual:

Lua performs automatic memory management. This means that you have to worry neither about allocating memory for new objects nor about freeing it when the objects are no longer needed. Lua manages memory automatically by running a garbage collector from time to time to collect all dead objects (that is, objects that are no longer accessible from Lua). All memory used by Lua is subject to automatic management: tables, userdata, functions, threads, strings, etc.

Lua implements an incremental mark-and-sweep collector. It uses two numbers to control its garbage-collection cycles: the garbage-collector pause and the garbage-collector step multiplier. Both use percentage points as units (so that a value of 100 means an internal value of 1).

The garbage-collector pause controls how long the collector waits before starting a new cycle. Larger values make the collector less aggressive. Values smaller than 100 mean the collector will not wait to start a new cycle. A value of 200 means that the collector waits for the total memory in use to double before starting a new cycle.

The step multiplier controls the relative speed of the collector relative to memory allocation. Larger values make the collector more aggressive but also increase the size of each incremental step. Values smaller than 100 make the collector too slow and can result in the collector never finishing a cycle. The default, 200, means that the collector runs at "twice" the speed of memory allocation.

You can change these numbers by calling lua_gc in C or collectgarbage in Lua. With these functions you can also control the collector directly (e.g., stop and restart it).

The default parameters for the incremental collection are:

  • Step size= 1024. Perhaps this should be smaller on a 44Kb heap.
  • Pause = 110%, that is pause until the memory grows by 10% before the next GC step.
  • Stepmul = 200%, that is there are roughly two steps per significant event

Significant events are include any call or return, lua stack operations on GCObjects, and a few less freequent functions.

So for me, on thinking about this, the tl:dr is that I feel that we should document how to configure the standard GC and leave the EGC for just that: emergency use, but this all being said, I think that having ability set the emergency memory limit relative to bytes remaining would be very valuable.

@jmattsson
Copy link
Member Author

@TerryE I honestly don't know how to do a "soft fail" here, that is beyond my ken. You're welcome to push amendments here, but I'll have to revoke my "thoroughly tested" guarantee on those parts =)

@@ -82,7 +82,7 @@ typedef struct global_State {
Mbuffer buff; /* temporary buffer for string concatentation */
lu_mem GCthreshold;
lu_mem totalbytes; /* number of bytes currently allocated */
lu_mem memlimit; /* maximum number of bytes that can be allocated, 0 = no limit. */
l_mem memlimit; /* maximum number of bytes that can be allocated, 0 = no limit. <0 used with EGC_ON_MEM_LIMIT when free heap falls below -memlimit */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Johny, you can't overload the g->memlimit this way without going through all of the places in the core which use it. For example, it is also used in lapi.c and lvm.c

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Through the wonders of int -> unsigned conversion, I can :)

In lapi.c the only use is through collectgarbage("setmemlimit", newlimit). I did not change it because I see no reason to. Using a negative memlimit is an extension available through node.egc.setmode() only. Setting a regular hard memlimit via collectgarbage() is still possible (and if you want, you could force negative values through it, but no sane value on the ESP8266 would interfere).

In lvm.c I left it alone because a negative memlimit expressed as an unsigned is effectively the same as MAX_SIZET on our tiny architecture.

Copy link
Collaborator

@TerryE TerryE Mar 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmattsson Johny, IMO, you aren't fully correct in your assertion. G(L)->memlimit is a single global uint variable that was introduced in the eLua patch but is used in more than just the EGC threshold calcs.

  • In lvm.c:luaV_concat(), the maximum string length allowed is the lesser of MAX_SIZET and G(L)->memlimit. this this is negative then we will be checking the string length against a negative value and always throw a "string length overflow" error. This logic needs fixing or dropping.
  • the GC setmemlimit options is an eLua/EGC addition and is an alternative API for setting this ECG variable. memlimitisn't used by the regular GC, only the EGC add-on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, the negative G(L)->memlimit gets assigned into an unsigned, at which point it becomes a very large positive number, which on the ESP8266 is way above the available memory that the result is effectively the same as MAX_SIZET. Try it for yourself if you don't believe me :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I agree. the size of a signed vs unsigned comparison or coerced to an unsigned comparison, but looking at this code I really have to wonder what the rationale for the EGC patch change was. Why include a bounds check to ensure that the size of a concat string is < MAXUINT - a tiny bit?

However, that is your problem, so I accept your comment.

@@ -791,6 +792,11 @@ static void *l_alloc (void *ud, void *ptr, size_t osize, size_t nsize) {
}
if (L != NULL && (mode & EGC_ALWAYS)) /* always collect memory if requested */
luaC_fullgc(L);
#ifndef LUA_CROSS_COMPILER
if (L != NULL && (mode & EGC_ON_MEM_LIMIT) && G(L)->memlimit < 0 &&
(system_get_free_heap_size() < (-G(L)->memlimit)))
Copy link
Collaborator

@TerryE TerryE Mar 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to ask expensive in terms of clock cycles is the system_get_free_heap_size() until I realised that I should answer this myself since I've got a remote debug session onto a running firmware. The call is a wrapper around a getter in libmain.a(mem_manager.o) so 7 instructions and the value itself is maintained by the memory manager in the 0th word of its .bss, so the tl;dr is that the call it dirt cheap.

@TerryE
Copy link
Collaborator

TerryE commented Mar 26, 2018

The reason for my diatribe

... Of the three modes ...

is that I've been thinking about this whole issue of the GC and how to configure it for optimum use. I know the core and have done some playing, and understand a lot of it. For most non-IoT Lua developers aggressive memory collection is just not an issue so the default trickle settings are fine.

However the default almost paranoid way that we run with might have been understandable in the 15Kb RAM avialable but is just crazy with current builds and even more so with LFS. a full GC is a pretty expensive operation, so its should only be done as a matter of last resort.

If you have an app that runs in say 35K RAM so you have ~10K freeboard, then the standard GC with sensible settings will do a pretty good job of keeping collection under control with minimal runtime cost. This all being said, I am happy with this patch so long as you cover off the uses of G(L)->memlimit in lvm.c and lapi.c.

@TerryE
Copy link
Collaborator

TerryE commented Mar 30, 2018

So this patch now has a clean bill of health from me, but I think that we want to leave merging it until after the 2.2 drop.

@marcelstoer marcelstoer added this to the 2.2-follow-up milestone Apr 6, 2018
@marcelstoer marcelstoer merged commit 085f35d into nodemcu:dev Apr 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants