use random:uniform instead of os:pid when constructing node name in nodetool #868

hmmr · 2021-05-11T17:20:39Z

Borrowing from https://github.com/basho/node_package/blob/4.0/priv/base/nodetool#L195, this is to help reduce the risk of hitting the atom table limit, as was reported by one of our customers who was calling riak-admin continuously and frequently enough to trigger the atom table overflow.

…in append_node_suffix Borrowing from https://github.com/basho/node_package/blob/4.0/priv/base/nodetool#L195, this will help reduce the risk of hitting the atom table limit, as was reported by one of our customers who was calling riak-admin continuously and frequently enough to trigger the atom table overflow.
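A minimal sketch of the idea, not the actual diff: the module name, the `?MAX_SUFFIX` bound of 1000 and the `"_maint_"` suffix used in the usage note are assumptions made for illustration. With `os:getpid()` nearly every call yields a name the target node has not seen before, whereas a number drawn from a small fixed range caps how many distinct names (and therefore atoms) the target can ever accumulate.

```erlang
-module(node_suffix_sketch).
-export([append_node_suffix/2]).

%% Hypothetical upper bound on the number of distinct suffixes;
%% the value used in the real nodetool may differ.
-define(MAX_SUFFIX, 1000).

%% Build a temporary node name such as 'myapp_maint_417@localhost'.
%% The numeric part is drawn from 1..?MAX_SUFFIX, so for a given
%% Name/Suffix pair at most ?MAX_SUFFIX distinct atoms can exist.
append_node_suffix(Name, Suffix) ->
    Id = rand:uniform(?MAX_SUFFIX),
    case string:tokens(Name, "@") of
        [Node, Host] ->
            list_to_atom(lists:concat([Node, Suffix, Id, "@", Host]));
        [Node] ->
            list_to_atom(lists:concat([Node, Suffix, Id]))
    end.
```

For example, `node_suffix_sketch:append_node_suffix("myapp@localhost", "_maint_")` returns an atom of the form `'myapp_maint_N@localhost'` with `N` between 1 and 1000. The trade-off is that two parallel calls can now pick the same name by chance, which a pid-based suffix would have avoided.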

tsloughter · 2021-05-11T17:56:29Z

This looks good, thanks. I'll probably merge soon.

But I wanted to note that in OTP 23+ nodetool is no longer used and this issue does not exist. Obviously still worth it to be fixed for those using pre-23, just wanted to mention it :)

hmmr · 2021-05-11T18:08:51Z

@tsloughter Indeed, I read that note and briefly wondered whether it was worth bothering. But, on reflection, it seems it still is :)

tsloughter · 2021-05-12T12:45:21Z

Hm, the shelltestrunner tests and the tests on windows fail.

Bob-The-Marauder

Should random be replaced by rand as random is deprecated?

ferd · 2021-05-12T14:26:55Z

Yes, definitely should replace with the newer stuff where available.
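One way "where available" could be read, as a rough sketch only: prefer `rand` and fall back to `random` on runtimes that lack it. The module and function names here are hypothetical and this is not the code that was merged. One detail worth keeping in mind is that `rand` seeds itself on first use, while the old `random` module starts from the same default seed in every fresh process, so an unseeded escript would produce the same "random" number on every run.

```erlang
-module(uniform_compat_sketch).
-export([uniform/1]).

%% Return a random integer in 1..N, using rand when the runtime has it.
uniform(N) when is_integer(N), N >= 1 ->
    case code:ensure_loaded(rand) of
        {module, rand} ->
            %% rand seeds itself automatically on first use.
            rand:uniform(N);
        {error, _} ->
            %% Old runtime without rand. random must be seeded explicitly,
            %% otherwise every fresh process yields the same sequence.
            %% apply/3 keeps the compiler from warning about the
            %% deprecated module on newer releases.
            _ = apply(random, seed, [os:timestamp()]),
            apply(random, uniform, [N])
    end.
```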

Instances of nodetool generate random node name suffixes so that multiple calls can run in parallel. However, each time nodetool connects to the target node, a new atom is created on the latter. If this happens frequently enough or for long enough, the node will eventually hit the atom table limit and crash. As a workaround, if the caller can guarantee that calls are serialized and isolated in time, defining an env variable $NODETOOL_NODE_PREFIX will create identical atoms for the node name prefix, thus avoiding generation of new atoms. The proposed change is complementary to erlware#868, aiming to address the issue, reported by one of our customers, in which a riak node hit the atom table limit (yes, all of 1M+ entries) and crashed. A postmortem showed the table filled with `[email protected]`, accumulated over time from calls to `riak admin status` every 5 min. Note that I did not attempt to make the equivalent changes in extended_bin_windows, as it's not clear to me what they would be (my knowledge of scripting in Windows is some 30 years old).
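To make the workaround concrete, here is a hedged sketch of how a static prefix could be honoured. Only the variable name `NODETOOL_NODE_PREFIX` comes from the description above; the module, the function `local_name/1`, the `"_maint_"` part and the bound of 1000 are assumptions, and the real change in #871 may build the name differently.

```erlang
-module(static_prefix_sketch).
-export([local_name/1]).

%% Hypothetical helper: return the local part (the bit before "@")
%% of the temporary node name used for a single nodetool call.
local_name(Base) ->
    case os:getenv("NODETOOL_NODE_PREFIX") of
        Prefix when is_list(Prefix), Prefix =/= "" ->
            %% Variable set: the same name on every call, so the target
            %% node only ever creates one atom for it. The caller must
            %% make sure calls never overlap.
            lists:concat([Prefix, "_maint_"]);
        _ ->
            %% Variable unset or empty: keep a random suffix so that
            %% parallel calls are unlikely to clash.
            lists:concat([Base, "_maint_", rand:uniform(1000)])
    end.
```

With `NODETOOL_NODE_PREFIX=fixed` exported, `static_prefix_sketch:local_name("riak")` returns `"fixed_maint_"` on every call; with the variable unset it returns `"riak_maint_N"` for some `N` between 1 and 1000.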

hmmr · 2021-07-15T14:09:59Z

@tsloughter After the approval, what is the current state of this PR? Is there anything I can do to help?

tsloughter · 2021-07-15T23:12:33Z

Hey, sorry about that. I don't know what the hell is going on with CI... there is at least 1 other PR that should be passing CI but isn't that I also want to merge and cut a release with.

tsloughter · 2021-07-15T23:33:09Z

Could you repush so it kicks off CI again? There isn't even a "rerun" option anywhere like there usually is...

hmmr · 2021-07-16T02:35:27Z

Once it's in, there's a more substantial #871.

Bob-The-Marauder mentioned this pull request May 12, 2021

Atom table limit hit if riak admin called regularly basho/riak#1066

Open

Bob-The-Marauder reviewed May 12, 2021

amend 6e4368a to use rand instead of deprecated random

d94d990

tsloughter mentioned this pull request May 14, 2021

remote_console cause atom leaks on target node #732

Closed

tsloughter approved these changes May 14, 2021

hmmr mentioned this pull request May 24, 2021

optionally allow static node name prefixes #871

Merged

a non-essential change to force github CI to rebuild

ea96033

tsloughter approved these changes Aug 25, 2021

tsloughter merged commit e6c3ae4 into erlware:master Aug 25, 2021

hmmr deleted the patch-1 branch August 27, 2021 13:14
