Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MCP23017 module refactorings to save some heap #3317

Merged
merged 2 commits into from
Oct 29, 2020
Merged

Conversation

nwf
Copy link
Member

@nwf nwf commented Oct 25, 2020

  • This PR is for the dev branch rather than for master.
  • This PR is compliant with the other contributing guidelines as well (if not, please describe why).
  • I have thoroughly tested my contribution.
  • The code changes are reflected in the documentation at docs/*.

Experimentally, I made some changes I had suggested for #3197 and would like to report my findings for consideration and possible merge. I recognize the danger of violence applied to deceased equines, however I continue to feel that there is significant low-hanging technological debt within our tree and I would like new modules to be exemplars of not creating more.

Heap measurements were taken by quiescing the ESP as much as possible (no Lua tasks or timers, wifi.sta.disconnect()) and taking the maximum of five outputs of =node.heap() across several seconds after each line of input; at startup, there are 44488 bytes of heap available. The ESP is running Lua 5.3 with the garbage collector policy modified by collectgarbage("setpause", 0); collectgarbage("step", 100), which should have the effect of reclaiming garbage as aggressively as possible. The Lua module is given to the ESP as a file in SPIFFS; LFS is likely to have no significant impact on the conclusions of these experiments, as we are primarily measuring the impact of necessarily dynamic structures on the heap (local values and closures).

As merged

The module was modified to not need hardware available to it, using this hack. This slightly reduces the code size, but not in any way likely significant to the conclusions drawn from the experiment.

diff --git a/lua_modules/mcp23017/mcp23017.lua b/lua_modules/mcp23017/mcp23017.lua
index 47aedccf..ceff4b8e 100644
--- a/lua_modules/mcp23017/mcp23017.lua
+++ b/lua_modules/mcp23017/mcp23017.lua
@@ -62,10 +62,7 @@ mcp23017.__index = mcp23017
 
 -- check device is available on address
 local function checkDevice(address, i2cId)
-    i2c.start(i2cId)
-    local response = i2c.address(i2cId, address, i2c.TRANSMITTER)
-    i2c.stop(i2cId)
-    return response
+  return true
 end
 
 -- write byte
@@ -111,12 +108,14 @@ local function setup(address, i2cId)
         error("MCP23017 address is out of range")
     end
 
-    if (checkDevice(address, i2cId) ~= true) then
-        error("MCP23017 device on " .. string.format('0x%02X', address) .. " not found")
-    else
-        reset(address, i2cId)
-        return 1
-    end
+    return true
+
+    -- if (checkDevice(address, i2cId) ~= true) then
+    --     error("MCP23017 device on " .. string.format('0x%02X', address) .. " not found")
+    -- else
+    --     reset(address, i2cId)
+    --     return 1
+    -- end
 end
 
 return function(address, i2cId)
command heap (delta)
x = loadfile 37944 (6544)
y = x() 36408 (1536)
z1 = y(32, 0) 35504 (904)
z2 = y(33, 1) 34472 (1032)

Summary: there are roughly 6.5KB of code loaded and 1.5K of dynamic structure created for the module itself (e.g., the metatable, top-level locals, and any initial closures constructed). Each instance of the MCP23017 driver occupies around 1K of heap.

Commit 1 ("functions to metatable")

This commit mostly moves functions from instances to the module's metatable and avoids creating closures. Some locals holding constants are eliminated, which additionally reduces the size of the closures that do get created (now mostly at module instantiation time, not instance construction time).

Again the module was modified to not need hardware:

diff --git a/lua_modules/mcp23017/mcp23017.lua b/lua_modules/mcp23017/mcp23017.lua
index bde1a080..e5bfe959 100644
--- a/lua_modules/mcp23017/mcp23017.lua
+++ b/lua_modules/mcp23017/mcp23017.lua
@@ -169,11 +169,11 @@ return function(address, i2cId)
         error("MCP23017 address is out of range")
     end
 
-    if (checkDevice(address, i2cId) ~= true) then
-        error("MCP23017 device on " .. string.format('0x%02X', address) .. " not found")
-    end
+    -- if (checkDevice(address, i2cId) ~= true) then
+    --     error("MCP23017 device on " .. string.format('0x%02X', address) .. " not found")
+    -- end
 
-    reset(address, i2cId)
+    -- reset(address, i2cId)
     return self
 end
 
command heap (delta)
x = loadfile 38024 (6464)
y = x() 36408 (1616)
z1 = y(32, 0) 36224 (184)
z2 = y(33, 1) 35968 (256)

Summary: No significant changes to code size or initial module construction, but substantial reduction in per-instance costs by avoiding the creation of closures for each function. The difference between first and second is likely just sampling noise; nevertheless, this delta achieves a heap savings of approximately 748 bytes per instance on average (~ 75% reduction).

Commit 2 ("inline constants")

This is a more, and possibly unacceptably, aggressive inlining of constants instead of using locals. The savings are not overwhelming, but, IMHO, still worth accepting into the tree. The savings would be more substantial if the features corresponding to the other registers were also implemented, and so I view the use of inline constants as more scalable than locals.

The same no-hardware diff as above was applied.

command heap (delta)
x = loadfile 38280 (6208)
y = x() 36952 (1328)
z1 = y(32, 0) 36776 (176)
z2 = y(33, 1) 36496 (280)

Summary: We see that this gives a further saving of approximately 256 bytes of code (insignificant with LFS deployed) but also 288 bytes of heap at module construction time, or about 0.5% of the total heap space available at module startup. Per-instance overheads are not significantly changed (and are again probably just sampling noise).

Summary

The mcp23017 module as merged is likely fine, and at worst a tad overzealous in its heap usage, for the most common scenario of one MCP23017 in use. However, it has poor scaling linear factors in two dimensions: 1) if instantiated multiple times, the per-instance overhead is excessive, and 2) as more of the chip's features are elaborated into Lua, the use of on-heap locals to hold integer constants will bloat the dynamic heap rather than the code size (which can be borne by LFS).

Why do I care so much, anyway? As the proposed #2983 test environment and harness uses a MCP23017 for GPIO testing, it would be nice to replace the bespoke driver in that series with this now-mainlined module. But testing NodeMCU frequently runs up against memory limitations and not exacerbating those would be quite nice.

@nwf nwf requested a review from marcelstoer October 25, 2020 14:43
@nwf nwf requested a review from galjonsfigur October 25, 2020 14:51
@nwf
Copy link
Member Author

nwf commented Oct 25, 2020

Commentary from @plomi-net would also be welcome, even though I cannot name them as a reviewer.

Copy link
Member

@galjonsfigur galjonsfigur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from couple of small issues, I think that is an improvement to the module - separated functions are easier to locate and inlining register values should be fine as long as there will be comments what that value represents. The size difference is measurable and those changes didn't broke the mcp23017_example.lua test with LEDs and switches.
BTW: When I was testing this I made a simple C function to check sizes of the objects based on lua-getsize gist here and while far from perfect it could be useful. Also http://luac.nl proved to be useful when checking how Lua interprets things.

local MCP23017_DEFVALA = 0x06
local MCP23017_DEFVALB = 0x07
local MCP23017_GPIOA = 0x12
local MCP23017_GPIOB = 0x13
--[[
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if keeping the rest of register addresses in comment is worth it if the other ones are inlined. I think that keeping a reference to a document where that table can be found would be more useful (eg. MCP23017 datasheet, page 12, Table 3-1) or even a link to the said datasheet (but those can become 404 and finding datasheets for popular ICs isn't a problem)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@galjonsfigur If someone uses the MCP23017, they will have the datasheet too. Theoretically the registers can be removed. I wouldn't insert a link because the link is in short time invalid 🤔

local self = setmetatable({}, mcp23017)

self.address = address
self.i2cId = i2cId
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those lines can be moved below the checks for correct address and device - it doesn't really matter but I think checking-then-doing is a clearer pattern to follow.

local self = setmetatable({}, mcp23017)

self.address = address
self.i2cId = i2cId

-- check device address (0x20 to 0x27)
if (address < 32 or address > 39) then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that if the module fails to setup properly (fail in address range check or in checkDevice) it should return nil like in the previous version of the code - without that it's possible to setup module with address out of possible range and that definitely shouldn't return the same thing as correctly initialized module.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think I am misunderstanding your comment. The version of the code now in dev errors out in these cases; it did not, AFAICT, "return nil" despite appearances: setup never returned false. This patch preserves those behaviors (or, well, is at least meant to; did I break something?).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setup never returned false

I missed it completely! Now I wonder if that should be the case - if the module is not initialised correctly then it shouldn't return the same thing as correctly initialized one. It this a desired behaviour @plomi-net ? I think this can both use error and return nil but that would be up to discussion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a mistake. Of course, in case of error nil can be returned. I assumed that throwing an error would be enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am inclined to think that throwing an error is also enough, yes.

writeByte(address, i2cId, MCP23017_IODIRB, 0xFF)
function mcp23017:writeIODIR(bReg, newByte)
writeByte(self.address, self.i2cId,
bReg and 0x1 --[[const: MCP23017_IODIRB]] or 0x0 --[[const: MCP23017_IODIRA]], newByte)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: This way of doing inline comments is clever, but I think instead of const: MCP23017_IODIRB it could be for example IODIRB register - shorter and should convey the same information.

nwf added 2 commits October 26, 2020 18:18
Avoids closures for each module.
Saves nearly half a kilobyte of heap.
@plomi-net
Copy link
Contributor

@nwf @galjonsfigur I think the improvements are good. Especially the multi-instance operation should not need a lot of RAM, even if the use case in reality only needs one instance.
Thanks for the help. It will be fine.

@nwf
Copy link
Member Author

nwf commented Oct 28, 2020

People happy for this to land?

@nwf nwf merged commit f5665ac into nodemcu:dev Oct 29, 2020
@nwf nwf deleted the mcp23017-heap branch October 29, 2020 12:59
marcelstoer pushed a commit that referenced this pull request Nov 7, 2020
* mcp23017: functions to metatable

Avoids closures for each module.

* mcp23017: inline constants

Saves nearly half a kilobyte of heap.
vsky279 pushed a commit to vsky279/nodemcu-firmware that referenced this pull request Nov 21, 2020
* mcp23017: functions to metatable

Avoids closures for each module.

* mcp23017: inline constants

Saves nearly half a kilobyte of heap.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants