
Fix use of rand + improve entropy-exhaustion #1051

Closed
wants to merge 1 commit from the fix_concurrent_rand branch

Conversation

@cevich (Member) commented Nov 2, 2021

Possible fix for containers/podman#12155

The golang docs for math/rand specifically state that rand.Seed() should never be called concurrently with any other rand method. Fix this with a simple, package-local mutex protecting the critical sections. This could be made safer by exposing the mutex to downstream callers; that is left to a future commit as/if needed.

Also, in entropy-exhaustion situations it's possible for multiple
concurrent processes to obtain the same fallback seed value, a case the
lock cannot protect against. Clashes here are especially bad given the
large number of downstream users of GenerateNonCryptoID().

Since the Linux kernel guarantees process ID uniqueness within a
reasonable timespan, include that value in the fallback seed (along
with the time).

This is not a perfect solution: with extremely unlucky timing it is
still possible for two processes to generate the same fallback seed
value. However, it is an improvement over relying on the clock alone.
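For illustration, here is a minimal sketch of the fallback-seeding idea described above. This is not the PR's actual diff: the package placement, the crypto/rand-first structure, and the use of os.Getpid() are assumptions, and the mutex discussed above is omitted.

```go
package stringid // hypothetical placement, mirroring pkg/stringid

import (
	cryptorand "crypto/rand"
	"encoding/binary"
	"math/rand"
	"os"
	"time"
)

func init() {
	var buf [8]byte
	// Preferred path: seed math/rand from the kernel's CSPRNG.
	if _, err := cryptorand.Read(buf[:]); err == nil {
		rand.Seed(int64(binary.LittleEndian.Uint64(buf[:])))
		return
	}
	// Entropy-exhaustion fallback: perturb the clock with the process ID so
	// that concurrent processes are less likely to pick the same seed.
	rand.Seed(time.Now().UnixNano() + int64(os.Getpid()))
}
```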

@cevich cevich requested review from mtrmac and edsantiago November 2, 2021 16:59
@cevich cevich force-pushed the fix_concurrent_rand branch from 366793c to 7a0c955 Compare November 2, 2021 17:06
@cevich (Member Author) commented Nov 2, 2021

Well 💩 `[+0163s] pkg/stringid/stringid.go:97:40: undefined: syscall.Gettid`; the package source shows this function has been there since 2009. What the heck?

@mtrmac (Collaborator) left a comment

NAK at this point; this might be right, but to me it’s not obvious, and AFAICS neither the PR nor the issue explains exactly how this helps.


Overall I’m unhappy with the GenerateNonCryptoID concept; it seems to basically willfully ignore the trade-offs, promising a “unique” ID on top of a non-cryptographic RNG with a 64-bit seed, which is just not a big enough identifier space.

(I’m also a bit unhappy about merely an import of a package changing the seed of the global math/rand state. The code could just as well create its own random reader without polluting that. But, sure, it probably doesn’t hurt anything observable.)

Given the tradeoff as it exists, i.e. that we don’t want to rely on /dev/urandom, I think any consumer of this code must check for duplicates and handle them by retrying — as some, but not all, do. Then we don’t need to tinker with the exact likelihood of a collision: as long as it is significantly below 1 we are fine, and there will be no flakes, not just a decreased likelihood of flakes.

A duplicate check is a bit difficult with AF_UNIX sockets, because ordinarily they must be removed before Bind creates them, i.e. it isn’t possible to “pre-allocate” a socket and pass it to a service process as a path to be created without races. But we could easily enough create empty $UUID.lock files — whoever manages to create one first “owns” that $UUID and can use a guaranteed-unique $UUID.socket path.
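A minimal sketch of the lock-file reservation described above (editorial illustration, not code from the PR; reserveSocketPath is a hypothetical helper and the directory handling is assumed):

```go
package remotetest // hypothetical placement

import (
	"os"
	"path/filepath"

	"github.com/containers/storage/pkg/stringid"
)

// reserveSocketPath retries until it exclusively creates "$id.lock"; the
// winner then owns "$id.socket" as a guaranteed-unique socket path.
func reserveSocketPath(dir string) (string, error) {
	for {
		id := stringid.GenerateNonCryptoID()
		lock := filepath.Join(dir, id+".lock")
		f, err := os.OpenFile(lock, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o600)
		switch {
		case err == nil:
			f.Close()
			return filepath.Join(dir, id+".socket"), nil
		case os.IsExist(err):
			continue // another process owns this id; retry with a new one
		default:
			return "", err
		}
	}
}
```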


Ultimately in this specific code path (for the record, https://github.com/containers/podman/blob/e63e90999c88c7e8c9193c98df962e47001adb0f/test/e2e/common_test.go#L294-L302 ), I don’t think we need to rely on randomness to avoid collisions even a little. AFAICS:

  • Each test could, in something like SynchronizedBeforeSuite, create a directory using Mkdtemp (OK, that one uses randomness, but it is known to work — because it retries on collisions)
  • Ginkgo will spawn a number of processes; each can be uniquely identified using config.GinkgoConfig.ParallelNode
  • Within that process, just have a static per-process counter variable (protected by a mutex, to be safe)
  • Combining the three, $tempDir/$node-$perProcess would AFAICS generate a guaranteed-unique socket path, as sketched below (or I’m missing something and the process tree is actually more complex).
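A minimal sketch of that combination, assuming Ginkgo v1's config package (per the config.GinkgoConfig.ParallelNode reference above); the temp-dir variable and helper name are hypothetical:

```go
package e2e_test // hypothetical placement in the test suite

import (
	"fmt"
	"path/filepath"
	"sync/atomic"

	"github.com/onsi/ginkgo/config"
)

var (
	suiteTempDir  string // created once per suite, e.g. in SynchronizedBeforeSuite
	socketCounter uint64 // per-process counter (an atomic stands in for the mutex above)
)

// uniqueSocketPath needs no randomness: temp dir + parallel node + counter
// is unique by construction.
func uniqueSocketPath() string {
	n := atomic.AddUint64(&socketCounter, 1)
	return filepath.Join(suiteTempDir,
		fmt.Sprintf("node%d-%d.sock", config.GinkgoConfig.ParallelNode, n))
}
```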

So, given the two alternatives above, which respect the limitations of GenerateNonCryptoID and are not inherently flaky (at least as far as I can now see), I don’t think it is useful to tweak a random seed somewhere deeper in the call stack for a (to me) very-hard-to-quantify reduction in the likelihood of a flake, without eliminating it.

"time"
)

const shortLen = 12

var (
randLock sync.Mutex // No math/rand method should ever be called concurrently
mtrmac (Collaborator):

This lock is unnecessary; math/rand documentation says

The default Source is safe for concurrent use by multiple goroutines, but Sources created by NewSource are not.

where math/rand.Read and math/rand.Seed refer to the global source.

cevich (Member Author):

So I think you're saying...we should not be setting a seed at all, and just rely on the default state?

mtrmac (Collaborator):

In this review comment I’m saying that the randLock variable doesn’t need to exist; it can just be dropped, and the pre-existing unlocked usage can continue without issues, making the rest of the PR smaller.

cevich (Member Author):

This is the thing that got me thinking we needed locks:

https://pkg.go.dev/math/rand#Rand.Read

"Read should not be called concurrently with any other Rand method."

mtrmac (Collaborator):

Yes, but that’s rand.New().Read, not rand.Read. We AFAICS use the latter one, https://pkg.go.dev/math/rand#Read .
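A small standalone illustration of that distinction (editorial sketch, not code from the PR): the package-level functions use the internally locked global source, while a *rand.Rand created via rand.New(rand.NewSource(...)) needs external synchronization.

```go
package randdemo // standalone sketch

import (
	"math/rand"
	"sync"
)

// Safe: package-level rand.Read uses the global, internally locked source.
func readGlobal(buf []byte) { rand.Read(buf) }

// Not safe on its own: a source-specific *rand.Rand must not be used
// concurrently, so callers have to serialize access themselves.
var (
	local   = rand.New(rand.NewSource(42))
	localMu sync.Mutex
)

func readLocal(buf []byte) {
	localMu.Lock()
	defer localMu.Unlock()
	local.Read(buf)
}
```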

Quoted diff context from the PR (pkg/stringid/stringid.go):

	// This is slow and imperfect, but preferable to hanging (indefinitely)
	// waiting for the entropy pool to fill.
	randLock.Lock()
	seed = time.Now().UnixNano() + int64(syscall.Gettid())
mtrmac (Collaborator):

I don’t really see how this helps; the UnixNano seed has a 1e9 range, and this just slightly perturbs the value within that range but doesn’t change the range in any way. So, supposing the time window for two concurrent processes is, I dunno, 10 ms, and there is a 1/10e6 chance of a collision, AFAICS there should be almost exactly a 1/10e6 chance of collision after we add the TID, which is likely to be the same value plus or minus at most a few hundred. So how does this make a difference?

For this approach to a fix I’d like to see a much more detailed explanation of how this fails. If the theory is that two processes independently run this code and arrive at the same seed, how is such a prima facie extremely unlikely thing happening? Do we know how the cryptoseed initialization above fails in the first place? The explanation would probably have to show that the nanosecond values are actually multiples of some large number, so that adding the smaller-granularity TID value adds a notable number of bits of entropy to the seed, decreasing the likelihood of failures.

cevich (Member Author):

how is such a prima facie extremely unlikely thing happening

And happening repeatedly, on multiple OSes and multiple versions ☹️

Do we know how the cryptoseed initialization above fails

We do not; we're assuming kernel entropy is exhausted, and my (bad) test-code analysis shows there is no other way for the same socket ID to clash across multiple tests.

cevich (Member Author):

Oh... I should add: the test VMs most certainly do not have access to a hardware RNG. We're relying on the rngd service on Fedora, and on nothing but luck on Ubuntu 😭 So it's easily possible that just running the podman tests is soaking up all the entropy.

mtrmac (Collaborator):

I’m out of touch, but I thought that /dev/urandom reads don’t fail — they could block, but not fail. And the implementation linked at https://cs.opensource.google/go/go/+/refs/tags/go1.17.2:src/crypto/rand/rand_unix.go doesn’t show any use of O_NONBLOCK anyway, I think.

So, some evidence to see that we are indeed taking this code path, and what the failure is, could be interesting.

Similarly, collecting a few hundred seed values (on the target platform) and seeing whether they happen to have suspiciously non-uniform value distribution, might be informative.

(Yeah, I could run these experiments myself, and I didn’t.)


Ultimately, understanding how exactly this happens would tickle my curiosity and interest, but I just don’t think it’s productive. If we add a duplicate check+retry to the caller, we can have pretty much any weak RNG here, even with a probability of collisions on the order of 1/1000, and it would work reliably. To me that’s a much more useful change than understanding in detail whether the probability of collisions is 1/1e9, 1/1e6, 1/1e7, 2^-32, 2^-64 or something weird, while still having, in principle, the risk of occasional failures, however exactly we tune this.

cevich (Member Author):

doesn’t show any use of O_NONBLOCK anyway, I think.

It is my understanding as well that /dev/urandom will block if there isn't enough entropy. I haven't looked at the code at all (it scares me); I'm assuming somebody did when they added the error check and the time.Now().UnixNano() fallback.

Since you know this code-base, would it make sense at all to issue some kind of warning (to the logs) when the fallback option is chosen? That would at least increase observability of the actual underlying error.

To me that’s a much more useful change than understanding in detail

Precisely: this PR isn't about making a better 20-sided die; it's about preventing process-to-process collisions. If that is needed, and can be done centrally, it will benefit all callers of this function.

@mtrmac (Collaborator) commented Nov 2, 2021

Well 💩 `[+0163s] pkg/stringid/stringid.go:97:40: undefined: syscall.Gettid`; the package source shows this function has been there since 2009. What the heck?

That’s a Linux-only system call, and the failure is a macOS cross-compilation. Getpid might work just as well in this situation (but I don’t think this PR really works, so if it does, there may well be something I’m missing about the difference between the two).

@mtrmac (Collaborator) commented Nov 2, 2021

However if there are multiple threads, the thread ID will provide additional entropy over the process ID.

The seeding happens in an init function. That doesn’t guarantee that the process is single-threaded at this point, but it is close enough, at least in the sense that init() will only run exactly once per process (not per thread). (Also, Linux thread IDs and process IDs share the same namespace — a single-threaded process has TID == PID; so the entropy space is the same.)

@cevich (Member Author) commented Nov 2, 2021

Thanks, yeah I really don't 100% know what I'm doing here. But I can swap tid->pid (assuming that's what you mean).

@cevich cevich force-pushed the fix_concurrent_rand branch from 7a0c955 to 8f515b4 Compare November 2, 2021 18:11
@cevich (Member Author) commented Nov 2, 2021

Overall I’m unhappy with the GenerateNonCryptoID concept;

If it helps (explain) at all: I think this code originally came from "another container related project" which we try to resemble.

@mtrmac (Collaborator) commented Nov 2, 2021

If it helps (explain) at all: I think this code originally came from "another container related project" which we try to resemble.

Yeah, and I suppose that’s fair for things like container IDs (although there probably should be duplicate checks everywhere). There isn’t that kind of tie for the remote socket paths — AFAICS we just need unique paths, and the rest (using random values, encoding as hexadecimal, using this particular implementation) are just implementation choices we don’t have to make.

@cevich (Member Author) commented Nov 2, 2021

just implementation choices we don’t have to make.

I agree with you. I just don't know ginkgo (or golang) well enough to say how the podman remote tests are failing like this at all, nor what the best way to fix them is. I was hoping this would be an easy/cheap fix, but as you rightly point out, I'm just throwing noodles at the wall to see which ones stick. I've got the issue on Brent's radar, and he's said he'll take a look as soon as he can, so I guess we just "hurry up and wait".

@cevich cevich force-pushed the fix_concurrent_rand branch from 8f515b4 to 8d0afbb Compare November 2, 2021 18:38
@edsantiago (Member) commented:

I'm coming into this late, and the whole thread is overwhelming.

AIUI the problem is that common_test.go:PodmanTestCreateUtil() creates a socket file using a collisionful algorithm. Is that correct? If so: does Go provide a way to get the call stack, and if so, could p.RemoteSocket be modified to include the calling test function name or even just the program counter?
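For reference, Go does expose the call stack via the runtime package; a minimal sketch of what that could look like (callerTag is a hypothetical helper, not existing podman test code):

```go
package callerdemo // standalone sketch

import (
	"fmt"
	"runtime"
)

// callerTag returns "<function name>-<program counter>" for the caller
// `skip` frames up the stack. The name contains '/' and '.', so it would
// need sanitizing before being embedded in a socket path.
func callerTag(skip int) string {
	pc, _, _, ok := runtime.Caller(skip + 1)
	if !ok {
		return "unknown"
	}
	return fmt.Sprintf("%s-%x", runtime.FuncForPC(pc).Name(), pc)
}
```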

@cevich (Member Author) commented Nov 3, 2021

AIUI the problem is that common_test.go:PodmanTestCreateUtil() creates a socket file using a collisionful algorithm. Is that correct?

That's the theory (surrounded by other theories), based on my (probably bad) code analysis and on seeing the same GenerateNonCryptoID() value generated across multiple tests running in parallel.

That's why we need to wait on this PR for a deeper analysis of containers/podman#12155. This could easily be an issue relating to some esoteric Ginkgo thing or the way our tests are written. Case in point: I haven't observed (for example) a socket ID clashing with a container ID; it's always socket ID to socket ID.

(Though I appreciate your attention and suggestion)

@mtrmac (Collaborator) commented Nov 3, 2021

After writing so many words, let me write some code to avoid the collisions… if we still see collisions afterwards, we’ll know that the ID generation is innocent and it is some reuse within the test infrastructure.

@cevich (Member Author) commented Nov 3, 2021

if we still see collisions afterwards

I do have a commit in a PR that seems (somehow) to result in collisions every few runs.

We're also "in queue" for Brent to look at the ginkgo tests more, so that may also bear fruit.

@mtrmac (Collaborator) commented Nov 3, 2021

After writing so many words, let me write some code to avoid the collisions…

containers/podman#12168

@cevich cevich marked this pull request as draft November 3, 2021 15:58
@cevich (Member Author) commented Nov 3, 2021

(converted to draft to prevent accidental merge)

@mtrmac (Collaborator) commented Nov 19, 2021

Compare #1065.

@cevich (Member Author) commented Mar 29, 2022

Closing this since it's not been touched in a long time. Re-open if it's still a possibility.

@cevich cevich closed this Mar 29, 2022