Fix use of rand + improve entropy-exhaustion #1051
Conversation
Force-pushed 366793c to 7a0c955
Well 💩
NAK at this point; this might be right, but to me it's not obvious, and AFAICS neither the PR nor the issue explains exactly how this helps.

Overall I'm unhappy with the GenerateNonCryptoID concept; it seems to basically willfully ignore the trade-offs, promising a "unique" ID on top of a non-cryptographic RNG with a 64-bit seed, which is just not a big enough identifier space.

(I'm also a bit unhappy that merely importing the package changes the seed of the global math/rand state. The code could just as well create its own random reader without polluting that. But, sure, it probably doesn't hurt anything observable.)

Given the trade-off as it exists, i.e. that we don't want to rely on /dev/urandom, I think any consumer of this code must check for duplicates and handle them by retrying — as some, but not all, do. Then we don't need to tinker with the exact likelihood of a collision: as long as it is significantly below 1, we are fine, and there will be no flakes at all, not just a decreased likelihood of flakes.
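A minimal sketch of that check-and-retry pattern, assuming a hypothetical `generateID` stand-in for GenerateNonCryptoID and a caller-maintained `seen` set (none of this is the actual PR code):

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// generateID is a hypothetical stand-in for GenerateNonCryptoID: a weak,
// non-cryptographic RNG is acceptable because the caller checks duplicates.
func generateID() string {
	buf := make([]byte, 16)
	rand.Read(buf) // global math/rand source
	return fmt.Sprintf("%x", buf)
}

// uniqueID retries until it finds an ID the caller has not seen before.
// The loop is bounded so a pathological RNG cannot hang the caller.
func uniqueID(seen map[string]bool) (string, error) {
	for i := 0; i < 100; i++ {
		id := generateID()
		if !seen[id] {
			seen[id] = true
			return id, nil
		}
	}
	return "", errors.New("could not generate a unique ID")
}

func main() {
	seen := map[string]bool{}
	id, err := uniqueID(seen)
	fmt.Println(id, err)
}
```

With that in place, the exact collision probability of the underlying RNG stops mattering for correctness.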
A duplicate check is a bit difficult with AF_UNIX sockets, because ordinarily they must be removed before Bind creates them; i.e. it isn't possible to "pre-allocate" a socket and pass it to a service process as a path to be created, without races. But we could easily enough create empty $UUID.lock files — whoever manages to create one first "owns" the $UUID and can use a guaranteed-unique $UUID.socket path.
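A rough sketch of that lock-file idea; the `claimSocketPath` helper, the candidate callback, and the 100-try bound are illustrative, not from the PR:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// claimSocketPath tries candidate IDs until it can create "$id.lock" with
// O_EXCL; that creation is atomic, so the winner owns "$id.socket".
func claimSocketPath(dir string, nextID func() string) (string, error) {
	for i := 0; i < 100; i++ {
		id := nextID()
		lock := filepath.Join(dir, id+".lock")
		f, err := os.OpenFile(lock, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o600)
		if os.IsExist(err) {
			continue // another process owns this ID; try the next one
		}
		if err != nil {
			return "", err
		}
		f.Close()
		return filepath.Join(dir, id+".socket"), nil
	}
	return "", fmt.Errorf("no free socket path under %s", dir)
}

func main() {
	n := 0
	path, err := claimSocketPath(os.TempDir(), func() string {
		n++
		return fmt.Sprintf("candidate-%d", n)
	})
	fmt.Println(path, err)
}
```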
Ultimately, in this specific code path (for the record, https://github.com/containers/podman/blob/e63e90999c88c7e8c9193c98df962e47001adb0f/test/e2e/common_test.go#L294-L302), I don't think we need to rely on randomness to avoid collisions even a little. AFAICS:
- Each test could, in something like SynchronizedBeforeSuite, create a directory using Mkdtemp (OK, that one uses randomness, but it is known to work — because it retries on collisions).
- Ginkgo will spawn a number of processes; each can be uniquely identified using config.GinkgoConfig.ParallelNode.
- Within that process, just have a static per-process counter variable (protected by a mutex, to be safe).
- Combining the three, $tempDir/$node-$perProcess would AFAICS generate a guaranteed-unique socket path (or I'm missing something and the process tree is actually more complex); see the sketch after this list.
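A sketch of that combination, under stated assumptions: config.GinkgoConfig.ParallelNode is the Ginkgo v1 field mentioned above, `suiteTempDir` is assumed to come from the Mkdtemp step in SynchronizedBeforeSuite, and an atomic counter stands in for the suggested mutex-protected variable:

```go
package e2e

import (
	"fmt"
	"path/filepath"
	"sync/atomic"

	"github.com/onsi/ginkgo/config" // Ginkgo v1 parallel-node information
)

// perProcessCounter is the per-process counter from the list above; an
// atomic increment is equivalent to a mutex-protected variable here.
var perProcessCounter uint64

// uniqueSocketPath derives a path from tempdir + parallel node + counter,
// so uniqueness is guaranteed by construction, with no randomness at all.
func uniqueSocketPath(suiteTempDir string) string {
	n := atomic.AddUint64(&perProcessCounter, 1)
	return filepath.Join(suiteTempDir,
		fmt.Sprintf("%d-%d.socket", config.GinkgoConfig.ParallelNode, n))
}
```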
So, given the two alternatives above, which respect the limitations of GenerateNonCryptoID and are not inherently flaky (at least with what I can now see), I don't think tweaking a random seed somewhere deeper in the call stack, with a (to me) very-hard-to-quantify decrease in the likelihood of a flake but without eliminating it, is useful.
"time" | ||
) | ||
|
||
const shortLen = 12 | ||
|
||
var ( | ||
randLock sync.Mutex // No math/rand method should ever be called concurrently |
This lock is unnecessary; the math/rand documentation says:

> The default Source is safe for concurrent use by multiple goroutines, but Sources created by NewSource are not.

where math/rand.Read and math/rand.Seed refer to the global source.
So I think you're saying... we should not be setting a seed at all, and should just rely on the default state?
In this review comment I'm saying that the randLock variable doesn't need to exist; it can just be dropped, and the pre-existing non-locking code can continue to be used without issues, making the rest of the PR smaller.
This is the thing that got me thinking we needed locks: https://pkg.go.dev/math/rand#Rand.Read

> Read should not be called concurrently with any other Rand method.
Yes, but that's rand.New().Read, not rand.Read. AFAICS we use the latter one: https://pkg.go.dev/math/rand#Read.
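A small self-contained illustration of that distinction (nothing here is from the PR; it just demonstrates the documented concurrency contract):

```go
package main

import (
	"math/rand"
	"sync"
)

func main() {
	var wg sync.WaitGroup

	// Safe: the top-level rand.Read uses the default source, which the
	// documentation says is safe for concurrent use.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			buf := make([]byte, 8)
			rand.Read(buf)
		}()
	}

	// Not safe without a lock: a private *rand.Rand from rand.New.
	r := rand.New(rand.NewSource(1))
	var mu sync.Mutex
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			buf := make([]byte, 8)
			mu.Lock() // Rand.Read must not run concurrently with other Rand methods
			r.Read(buf)
			mu.Unlock()
		}()
	}
	wg.Wait()
}
```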
pkg/stringid/stringid.go (outdated)
```go
	// This is slow and imperfect, but preferable to hanging (indefinitely)
	// for the entropy pool to fill.
	randLock.Lock()
	seed = time.Now().UnixNano() + int64(syscall.Gettid())
```
I don't really see how this helps; the UnixNano seed has a 1e9 range, and this just slightly perturbs the value within that range but doesn't change the range in any way. So, supposing the time window for two concurrent processes is, I dunno, 10 ms, and there is a 1/10e6 chance of a collision, AFAICS there should be almost exactly a 1/10e6 chance of a collision after we add the TID, which is likely to be the same value plus or minus at most a few hundred. So how does this make a difference?

For this approach to a fix I'd like to see a much more detailed explanation of how this fails (if the theory is that two processes independently run this code and arrive at the same seed, how is such a prima facie extremely unlikely thing happening? Do we know how the cryptoseed initialization above fails in the first place?), probably an explanation that shows how the nanosecond values are actually multiples of some large number, so that adding the smaller-granularity TID value actually adds a notable number of bits to the entropy of the seed, decreasing the likelihood of failures.
> how is such a prima facie extremely unlikely thing happening

And happening repeatedly, on multiple OSes and multiple versions.

> Do we know how the cryptoseed initialization above fails

We do not; we're assuming kernel entropy is exhausted, and my (bad) test-code analysis shows there is no other way for the same socket ID to clash across multiple tests.
Oh... I should add: the test VMs most certainly do not have access to the hardware RNG. We're relying on the rngd service in Fedora, and nothing but luck on Ubuntu 😭 So it's easily possible that just running the podman tests is soaking up all the entropy.
I'm out of touch, but I thought that /dev/urandom reads don't fail — they could block, but not fail. And the implementation linked at https://cs.opensource.google/go/go/+/refs/tags/go1.17.2:src/crypto/rand/rand_unix.go doesn't show any use of O_NONBLOCK anyway, I think.

So, some evidence that we are indeed taking this code path, and what the failure is, could be interesting.

Similarly, collecting a few hundred seed values (on the target platform) and seeing whether they have a suspiciously non-uniform value distribution might be informative.

(Yeah, I could run these experiments myself, and I didn't.)
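For what it's worth, a minimal version of that seed-sampling experiment could look like this; the sample count and the mod-1000 bucketing are arbitrary illustrative choices, not from the thread:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	const samples = 500
	buckets := make(map[int64]int)
	for i := 0; i < samples; i++ {
		seed := time.Now().UnixNano()
		// If the clock really ticks in coarse steps, the low-order digits
		// of the seed will pile up into only a few buckets.
		buckets[seed%1000]++
	}
	fmt.Printf("distinct low-order values: %d of %d samples\n",
		len(buckets), samples)
}
```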
Ultimately, understanding how exactly this happens would tickle my curiosity and interest, but I just don't think it's productive. If we add a duplicate check + retry to the caller, we can have pretty much any weak RNG here, even with a probability of collisions on the order of 1/1000, and it would work reliably. To me that's a much more useful change than understanding in detail whether the probability of collisions is 1/1e9, 1/1e6, 1/1e7, 2^-32, 2^-64, or something weird, and still having, in principle, the risk of occasional failures, however exactly we tune this.
> doesn't show any use of O_NONBLOCK anyway, I think.

It is also my understanding that /dev/urandom will block if there isn't enough entropy. I haven't looked at the code at all (it scares me); I'm assuming somebody did when they added the error check and the time.nano() fallback.

Since you know this code base: would it make sense to issue some kind of warning (to the logs) when the fallback option is chosen? That would at least increase observability of the actual underlying error.
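A hedged sketch of what such a warning could look like; the `initialSeed` helper and the logrus dependency are assumptions for illustration, not the actual stringid code:

```go
package stringid

import (
	"crypto/rand"
	"encoding/binary"
	"time"

	"github.com/sirupsen/logrus"
)

// initialSeed prefers a crypto/rand seed and logs loudly when it has to
// fall back, so entropy-exhaustion theories become observable in CI logs.
func initialSeed() int64 {
	var buf [8]byte
	if _, err := rand.Read(buf[:]); err != nil {
		logrus.Warnf("falling back to time-based RNG seed: %v", err)
		return time.Now().UnixNano()
	}
	return int64(binary.LittleEndian.Uint64(buf[:]))
}
```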
> To me that's a much more useful change than understanding in detail

Precisely; this PR isn't about making a better 20-sided die, it's about preventing process-to-process collisions. If that is needed, and can be done centrally, it will benefit all callers of this function.
That's a Linux-only system call, and the failure is in the macOS cross-compilation.
The seeding happens in an init() function.
Thanks, yeah, I really don't 100% know what I'm doing here. But I can swap tid->pid (assuming that's what you mean).
Force-pushed 7a0c955 to 8f515b4
If it helps (explain) at all: I think this code originally came from "another container-related project" which we try to resemble.
Yeah, and I suppose that's fair for things like container IDs (although there probably should be duplicate checks everywhere). There isn't that kind of tie for the remote socket paths — AFAICS we just need unique paths, and everything else (using random values, encoding as hexadecimal, using this particular implementation) is just a set of implementation choices we don't have to make.
I agree with you. I just don't know ginkgo (or golang) well enough to say how the podman remote tests are failing like this at all, nor what the best way to fix them is. I was hoping maybe this would be an easy/cheap fix, but as you rightly point out, I'm just throwing noodles at the wall to see which ones stick. I've got the issue on Brent's radar, and he's said he'll take a look as soon as he can, so I guess we just "hurry up and wait".
Possible fix for containers/podman#12155

The golang docs for `math/rand` specifically mention that `rand.Seed()` should never be used in conjunction with any other rand method. Fix this with a simple/local mutex to protect critical code sections. This could be made safer by exposing the mutex to downstream callers; that is left to a future commit as/if needed.

Also, in entropy-exhaustion situations it's possible for multiple concurrent *processes* to obtain the same fallback seed value, where the lock will not provide any protection. Clashes here are especially bad given the large number of downstream users of `GenerateNonCryptoID()`. Since the Linux kernel guarantees process-ID uniqueness within a reasonable timespan, include this value in the fallback seed (along with the time).

This is *not* a perfect solution; it is still possible for two processes to generate the same fallback seed value, given extremely unlucky timing. However, this is an improvement versus simplistic reliance on the clock.

Signed-off-by: Chris Evich <[email protected]>
Force-pushed 8f515b4 to 8d0afbb
I'm coming into this late, and the whole thread is overwhelming. AIUI the problem is that parallel test processes end up generating the same GenerateNonCryptoID() value.
That's the theory, surrounded by other theories, based on my (probably bad) code analysis, i.e. on seeing the same GenerateNonCryptoID() ID value generated across multiple tests (running in parallel). That's why we need to wait on this PR for a deeper analysis of containers/podman#12155. This could easily be an issue relating to some esoteric Ginkgo thing or the way our tests are written. Case in point: I haven't observed (for example) a socket ID clashing with a container ID; it's always socket ID to socket ID. (Though I appreciate your attention and suggestion.)
After writing so many words, let me write some code to avoid the collisions… if we still see collisions afterwards, we'll know that the ID generation is innocent and it is some reuse within the test infrastructure.
I do have a commit in a PR that seems (somehow) to result in collisions every few runs. We're also "in queue" for Brent to look at the ginkgo tests more, so that may also bear fruit.
(converted to draft to prevent accidental merge)
Compare #1065.
Closing this since it's not been touched in a long time. Re-open if it's still a possibility.