Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sntp: simplify and tidy a bit #2700

Merged
merged 1 commit into from
Jul 27, 2019
Merged

sntp: simplify and tidy a bit #2700

merged 1 commit into from
Jul 27, 2019

Conversation

nwf
Copy link
Member

@nwf nwf commented Mar 19, 2019

  • list_ref can become LUA_REFNIL, because that's what rawgeti returns
    for LUA_NOREF. Defensively guard for this, rather than falling into
    the sntp_dolookups loop with nil on the stack.

  • set_repeat_mode should not call itself, but should rather always do
    what it's going to do and then optionally do the rest if directed.

  • sntp_sync should not try to special case the single string argument:
    we should be queueing that name for DNS resolution, too. Towards that
    end, if we are given a single string, build a table and make that the
    list_ref and call off to sntp_dolookups, just like we otherwise do.

FIXES: #2699

  • This PR is for the dev branch rather than for master.
  • This PR is compliant with the other contributing guidelines as well (if not, please describe why).
  • I have thoroughly tested my contribution.
  • The code changes are reflected in the documentation at docs/en/*.

Copy link
Member

@pjsg pjsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks entirely reasonable. I haven't tested it....

Copy link
Member

@jmattsson jmattsson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me, but I haven't got the setup to test it at the moment.

if (enable) {
set_repeat_mode(L, FALSE);
repeat = (sntp_repeat_t *) c_malloc(sizeof(sntp_repeat_t));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably take the opportunity to change this to repeat = luaM_new(L, sntp_repeat_t); while we're at it. And of course, the c_free(repeat) above becomes a luaM_free(L, repeat);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can do, but... what's the advantage, sorry?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By using the Lua memory allocator we tap into its ability to automatically do a gc sweep if needed, reducing the likelihood of getting an out-of-memory in the first place, and if we do then the LVM will raise the error for us, so we don't need explicit null checking in our code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And of course, the c_free(repeat) above becomes a luaM_free(L, repeat);

luaM_freearray(L, repeat, sizeof(sntp_repeat_t)); to be precise 😉
Ref. #2377 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devsaurus Why? repeat has the correct type, so the resulting sizeof(*repeat) will yield the correct size in luaM_free(). It's only if you're casting things to a char * or are actually using arrays that you need the luaM_freearray() (or luaM_freemem()). Check lmem.h if you don't believe me! :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't been tracking this module or this dialog so I can't make any detailed review comments. I do that one general observation though: switching from the malloc allocator to the luaM allocator doesn't really impact leakage management either way. The latter helps the Lua allocator to track used storage, but not GC.

If you want the Lua GC to manage resource clean up then you need to use collectable objects such as Userdata (not Light userdata which is not collectable). Userdata can have a bound metatable with a __GC method defined so that you can carry out any cleanup needed if the GC was determined that the object is no longer referenced.

@nwf
Copy link
Member Author

nwf commented Mar 20, 2019

The more I look at sntp.c the less happy I get with it. Consider this an interim PR before a rewrite. :(

@jmattsson
Copy link
Member

I think it's grown somewhat organically >.>

@pjsg
Copy link
Member

pjsg commented Mar 20, 2019

I think that is an understatement!

@marcelstoer marcelstoer added this to the Next release milestone Mar 20, 2019
@nwf
Copy link
Member Author

nwf commented Mar 20, 2019

Oh, these things happen. I just mean that I'm not sure I see an easy way to reduce the opportunity for state machine confusion without a rather larger retooling. I'll try to get a PR up for review "soon".

@nwf
Copy link
Member Author

nwf commented Mar 26, 2019

I am slowly chiseling away at my to-be-proposed new sntp.c and I wonder: does it make sense to jettison the repeat logic? It'd simplify the module logic a little and there's very little expense in using cron or tmr to do the resync one's self.

@jmattsson
Copy link
Member

Other than for backwards compatibility I have no objections to that. I think when I first wrote the sntp module it was before the tmr module was fully "objectified" and we were already using all the timer slots at $work.

If we do remove the repeat functionality, we should at least print a warning message with a pointer to docs with example(s) on how to get the precise same functionality using Lua via tmr/cron.

@pjsg
Copy link
Member

pjsg commented Mar 27, 2019

I think that the repeat functionality should be part of the sntp module. To me, what it means is that the platform keeps good time -- even over deep sleep. I don't know whether the sntp (if repeat is specified) actually tries to establish the time when the platform is restarted after a deep sleep.

Also, if people use cron to trigger a resync, then there is the strong probability that people will try and sync on the hour. With the builtin repeat, this sort of synchronisation can be avoided.

@nwf
Copy link
Member Author

nwf commented Mar 27, 2019

@pjsg Fair points all; I'll keep it in.

Separately, perhaps it'd be nice if our cron module had a "randomly within the hour" kind of thing? (@hourly?)

@pjsg
Copy link
Member

pjsg commented Mar 27, 2019

We could adopt the jenkins crontab extension of using H in a field (which means pick s specific value based on the hash of the job's name). I think that we would use a random number. I.e.

H * * * * would be once per hour at a random minute past the hour
H H * * * would be once per day at a random time

The H would be evaluated once when the cron entry is created.

More interesting would be an extension to allow

H/5 * * * * every five minutes with a random offset.

@marcelstoer
Copy link
Member

Didn't make the cut obviously -> removing the milestone

@marcelstoer marcelstoer modified the milestone: Next release Apr 4, 2019
@TerryE
Copy link
Collaborator

TerryE commented Apr 5, 2019

I had a look at this code and we seem to be getting into all sorts of complexities because we are using C resource allocation strategies and getting into all sorts of issues to do with GC of resources. I do have to wonder why aren't just using the Lua VM to do all of this for us by using a proper Lua C approach to this and working with the Lua GC to let it do all of the heavy lifting and GC.

If anyone is interested in what I mean, I will be happy to explain, but in essence the SNTP object is static and so can be be held as a Lua table in the registry at a fixed handle. So long as all of the value entries in the table are Lua collectable or simple objects (strings, functions, numbers, userdata) (not lightuserdata) then the GC will correctly track their use and do the correct CG as needed. @nwf Nathaniel, I am happy to chat (voice or WhatSapp, ...) if you want me to explain further.

@jmattsson
Copy link
Member

I do have to wonder why aren't just using the Lua VM to do all of this for us

@TerryE Because way back when I wrote the first version of that module, I was still very much a noob in the ways of extending Lua :D

@TerryE
Copy link
Collaborator

TerryE commented Apr 6, 2019

@jmattsson, To be honest, it took years of crawling through the VM for the penny to drop for me.

This one is really quite complicated because of all the CBs and not layering it on net. Doing a nslookup on *.nodemcu.pool.ntp.org they are all IP lists but our DNS resolve only returns the first.

Quite honestly if I were doing this from scratch, I'd adopt a KISS approach: stick with just UDP; call net.udpsocket:listen() and :send() instead of reimplementing all of this housekeeping. Then loop around the list resolve,send,receive with timeout bumping to the next on the list, use a Lua array in the registry to store all of the state so that GC is all done for you.

But the problem with this approach is that it is major surgery / rework even though the final module would be half the size. An exercise for Nathaniel perhaps if he has the enthusiasm. I haven't.

Maybe sticking plaster is the easiest thing to do here as we have other low hanging fruit to pick. 😄

@nwf
Copy link
Member Author

nwf commented Apr 6, 2019

@TerryE: I was loathe to change too much about the core SNTP packet construction/send/recv/parse logic, but otherwise that's kind of the tack I'm taking. I've pushed a very much in progress version of the rewrite effort to https://github.com/nwf/nodemcu-firmware/blob/sntp/app/modules/sntp.c . The block comments at lines 37 and 151 give a high-level overview of the approach. Comments and/or offers to take this off my plate welcome! ;)

@marcelstoer
Copy link
Member

In light of #2819 do you want to keep this open?

@nwf
Copy link
Member Author

nwf commented Jul 26, 2019

I think it might be good if this shipped in the next master drop, because it does fix things in sntp. #2819 is going to take longer to get to achieving its goals.

@marcelstoer
Copy link
Member

I agree, will you fix the conflicts in sntp.c then?

* list_ref can become LUA_REFNIL, because that's what rawgeti returns
  for LUA_NOREF.  Defensively guard for this, rather than falling into
  the sntp_dolookups loop with nil on the stack.

* set_repeat_mode should not call itself, but should rather always do
  what it's going to do and then optionally do the rest if directed.

* sntp_sync should not try to special case the single string argument:
  we should be queueing that name for DNS resolution, too.  Towards that
  end, if we are given a single string, build a table and make that the
  list_ref and call off to sntp_dolookups, just like we otherwise do.

FIXES: nodemcu#2699
@nwf
Copy link
Member Author

nwf commented Jul 27, 2019

Done. :)

@marcelstoer marcelstoer merged commit 7d387dd into nodemcu:dev Jul 27, 2019
@nwf nwf deleted the sntp-tidy branch August 22, 2019 12:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants