Socket unexpectedly shared by multiple rootless tests #12155

Closed
cevich opened this issue Nov 1, 2021 · 40 comments · Fixed by #12168
Labels: kind/bug (Categorizes issue or PR as related to a bug.) · locked - please file new issue/PR (Assist humans wanting to comment on an old issue or PR with locked comments.)

Comments

cevich (Member) commented Nov 1, 2021

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

On a presumably unrelated PR, several 'remote' tests ran on an Ubuntu 21.04 VM and clashed over the same /var/run/podman/podman-<ID> socket, which should never, ever happen.

Steps to reproduce the issue:

  1. Execute CI testing

  2. If failure doesn't occur, go back to step 1

Describe the results you received:

During setup, the podman create pod with doubled name test tried to create a podman service on the same socket used by the previous test (podman create pod with name) [Annotated Log]

Error: unable to create socket: listen unix /run/podman/podman-67f86292cce183cb624e0205c8bb8361ec130c3aabb11be86a20987e15b234c8.sock: bind: address already in use

Searching through the log, there are multiple subsequent examples of this occurring as well.

Describe the results you expected:

Despite running in parallel, remote tests should never (NEVER) use the same service socket name (unless relevant to the specific test)

Additional information you deem important (e.g. issue happens only occasionally):

Tracing the code path up from uuid := stringid.GenerateNonCryptoID() into the containers/storage stringid package: if the system lacks entropy, the random seed is based on the current time. Therefore, if two tests happen to start at the exact same time, they could obtain clashing IDs. Note: this function is used liberally, so an ID clash needn't be limited to socket names.
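For illustration only, here is a minimal sketch of the seeding pattern described above (not the actual containers/storage code; seedMathRand is a made-up name): try the kernel's CSPRNG first, and fall back to the clock only if that fails.

package main

import (
    cryptorand "crypto/rand"
    "encoding/binary"
    "fmt"
    "math/rand"
    "time"
)

// seedMathRand mimics the described init() behavior: prefer a crypto-quality
// seed, fall back to the wall clock.
func seedMathRand() {
    var b [8]byte
    if _, err := cryptorand.Read(b[:]); err == nil {
        rand.Seed(int64(binary.LittleEndian.Uint64(b[:])))
        return
    }
    // Fallback: two processes reaching this line in the same nanosecond get
    // the same seed, and therefore the same ID sequence.
    rand.Seed(time.Now().UnixNano())
}

func main() {
    seedMathRand()
    fmt.Printf("example ID fragment: %016x\n", rand.Int63())
}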

Output of podman version:

4.0.0-dev built from source (553c0bd2a2a12739702cc408c3be6b93a91e7012) PR referenced above.

Output of podman info --debug:

See test log in description

Package info (e.g. output of rpm -q podman or apt list podman):

See test log in description

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

GCP VM ubuntu-c6431352024203264

openshift-ci bot added the kind/bug label Nov 1, 2021
cevich (Member, Author) commented Nov 1, 2021

cc @cdoern @baude @mheon @edsantiago

Luap99 (Member) commented Nov 1, 2021

Maybe just switch to GenerateRandomID()

cevich (Member, Author) commented Nov 1, 2021

So it's not a very "pretty" fix, but one simple thing that could be done is to check whether an existing socket file is present right before p.RemoteSocket = fmt.Sprintf("unix:/run/podman/podman-%s.sock", uuid). If there is one, get a new uuid or throw a test-stopping error.

We "should" be running the rngd service in our VMs to keep entropy filled, but this isn't perfect so it's possible we could still run out. In this case, another "cheap" solution could be adding the current process and thread IDs to the seed = time.Now().UnixNano() call (in containers/storage). Even when entropy fails, it's less-likely the same thread will ever call for random ID at the same time (to the nanosecond).
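A rough sketch of that first idea (hypothetical; freeSocketURI and maxTries are my names, not existing test helpers; stringid is the package the tests already use):

import (
    "errors"
    "fmt"
    "os"

    "github.com/containers/storage/pkg/stringid"
)

// freeSocketURI keeps generating IDs until the corresponding socket path does
// not already exist, then returns the URI to assign to p.RemoteSocket.
func freeSocketURI() (string, error) {
    const maxTries = 10
    for i := 0; i < maxTries; i++ {
        uuid := stringid.GenerateNonCryptoID()
        path := fmt.Sprintf("/run/podman/podman-%s.sock", uuid)
        if _, err := os.Stat(path); errors.Is(err, os.ErrNotExist) {
            return "unix:" + path, nil
        }
    }
    return "", fmt.Errorf("no unused remote socket path after %d attempts", maxTries)
}

Note this check is inherently racy: two processes can both pass os.Stat before either binds the socket, so a collision check closer to the actual listener (or a retry on bind failure) would be more robust.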

cevich (Member, Author) commented Nov 1, 2021

Maybe just switch to GenerateRandomID()

Obtaining "real" entropy from the system can be extremely slow if the pool is low/empty.

Luap99 (Member) commented Nov 1, 2021

According to the math/rand package doc, it is not safe for concurrent use.

Edit: Never mind, the default rand.Read function, which is used here, is safe for concurrent use.

cdoern (Contributor) commented Nov 1, 2021

@cevich this might be more of an issue with either my PR or the testing suite in general; it seems rebasing and pushing created more unrelated failures: #11958

cevich (Member, Author) commented Nov 1, 2021

@cdoern It could easily be both. I looked at the latest logs and it seems there are some actual flakes mixed in. In one case, it's a flake we were waiting to reproduce for another issue I opened...so...err...thanks 😁

@baude or @mheon I'm not 100% sure I understand how the remote testing works; would you mind sharing your wisdom/observations here?

cevich (Member, Author) commented Nov 1, 2021

This problem isn't just affecting Ubuntu, here's an example in Fedora-land:

Search the log for 021b9640208565a03009719b90fce72fe55cc4f4852d17ded76a27b37db86878 and you'll see there are several different remote tests all referencing this same socket. Some tests fail, others don't.

More examples in fedora 34 (log): Search for c4c89ef90b7a6106205e2a32dadc5860e5f2b59201638ee8d3a75f4c36142f97.

cevich (Member, Author) commented Nov 1, 2021

Update: I took a look at the VMs, and it seems I was mistaken. The Ubuntu images do not have rngd installed/enabled, but the Fedora VMs do. Even so, rngd isn't perfect; it's still possible the systems are running out of entropy and hitting this bug by starting tests at the exact same time (thus getting the same generated IDs).

cdoern (Contributor) commented Nov 2, 2021

@cevich I am not familiar with the relative time cost of GenerateNonCryptoID vs GenerateRandomID. Would switching to the truly random option dramatically change the outcome? It could be worth it if it solves more flakes than it causes, but again I do not know the time difference.

cevich (Member, Author) commented Nov 2, 2021

It could be worth it if it solves more flakes than it causes but again I do not know the time difference.

Looking at the code, it first tries to get a "true" random number (backed by the kernel's entropy pool); if that fails, it falls back to a number based on the current system time (in nanoseconds when supported). So I think what's happening in CI is that kernel entropy is exhausted, and two (or more) tests executing in parallel obtain the same random ID. It's a mystery to me how/why they manage to execute at the exact same time (down to the micro or nanosecond).

WRT kernel entropy, it's re-filled by hardware random-number generators and other "random" events on the system, like disk access, network latency, and similar. However, in VMs we don't have access to a hardware RNG, so we must rely on disk/network entropy sources, which are pretty slow. When the pool is exhausted and a process does a blocking read for more entropy, it will 'hang' until sufficient entropy is available. The problem is, it's highly likely some process/operation is concurrently responsible for exhausting the entropy in the first place, so there is likely some competition. Thus, things can hang for a VERY long time waiting for entropy, sometimes seconds, sometimes years!

So the workaround to seed from the time (when an entropy read would block) is a sensible compromise...until it isn't.
Since tests are running 3-wide in parallel, it's (obviously) possible for random IDs to be requested at the exact same time while entropy is exhausted, and therefore to receive the same ID. This is disastrous, not only for sockets but also for container IDs, image IDs, and a billion other operations which all assume their IDs will never ever clash.

There's very little I can do about increasing the kernel's entropy, since we're in a VM and making use of the paravirt-RNG (hardware) driver is a QEMU host-side option (which we have no control over). The best I can do is run the 'rngd' service (which we do on Fedora), but this isn't a perfect solution (and obviously MUCH slower than a hardware RNG).

But there is _some_ hope, because we know the kernel MUST use monotonically increasing IDs for both processes and threads. Since it's physically impossible for the same thread to request a random ID at the exact same time, adding the thread ID into the fallback time-based seed will solve this problem. It's not perfect; it's still possible for the combination of time + thread ID to clash. However, it is an improvement in an overall bad situation.
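A minimal, self-contained sketch of that mixing (illustrative only, not the actual containers/storage patch; fallbackSeed is a made-up name):

package main

import (
    "fmt"
    "math/rand"
    "os"
    "time"
)

// fallbackSeed folds the PID into the time-based seed, so two processes that
// read the clock in the same nanosecond still end up with different seeds.
func fallbackSeed() int64 {
    return time.Now().UnixNano() ^ (int64(os.Getpid()) << 32)
}

func main() {
    r := rand.New(rand.NewSource(fallbackSeed()))
    fmt.Printf("%016x\n", r.Int63())
}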

cevich (Member, Author) commented Nov 2, 2021

cevich added a commit to cevich/storage that referenced this issue Nov 2, 2021
Possible fix for containers/podman#12155

The golang docs for `math/rand` specifically mention that `rand.Seed()`
should never ever be used in conjunction with any other rand method.
Fix this with a simple/local mutex to protect critical code sections.
This could be made more safe by exposing the mutex to downstream callers.
This is left up to a future commit as/if needed.

Also, in entropy-exhaustion situations it's possible for multiple
concurrent *processes* to obtain the same fallback seed value, where the
lock will not provide any protection.  Clashes here are especially bad
given the large number of downstream users of `GenerateNonCryptoID()`.

Since the Linux kernel guarantees process ID uniqueness within a
reasonable timespan, include this value into the fallback seed (along
with the time).  Note, the thread ID == process ID when there is only
a single thread.  However if there are multiple threads, the thread
ID will provide additional entropy over the process ID.

This is *not* a perfect solution, it is still possible for two processes
to generate the same fallback seed value, given extremely unlucky
timing.  However, this is an improvement versus simplistic reliance
on the clock.

Signed-off-by: Chris Evich <[email protected]>
cevich added a commit to cevich/storage that referenced this issue Nov 2, 2021
Possible fix for containers/podman#12155
cevich (Member, Author) commented Nov 2, 2021

So @mtrmac doesn't like my fix-the-seed PR (justifiably and understandably). He also pointed out that there are inherent socket-creation races, and that the tests are missing proper assertions of ID collision-avoidance. At this point I think what's really missing is an understanding of how this issue is even possible (are the logs lying?)

cevich added a commit to cevich/storage that referenced this issue Nov 2, 2021
Possible fix for containers/podman#12155
mtrmac (Collaborator) commented Nov 3, 2021

#12168 should make sure we never accept randomly-generated ID collisions (so if we still see collisions afterwards, it’s something about how the RemoteSocket value is used/propagated later), and add a bit more logging if the collisions do come from the randomness mechanism.

mtrmac (Collaborator) commented Nov 3, 2021

I have another hypothesis: See all of this, and weep:

rand.Seed(time.Now().UnixNano())

rand.Seed(time.Now().UnixNano())

rand.Seed(GinkgoRandomSeed())


rand.Seed(time.Now().UnixNano())

Which one wins? No idea. (I just have to conclude that providing a process-wide singleton RNG with a publicly-accessible Seed() function is bad.)

Now, if the GinkgoRandomSeed() one wins (which can probably be determined by adding logging to all of these and running a test binary): tracing ginkgo/config.GinkgoConfig.RandomSeed, it is initialized from the time (i.e. random enough) at the top level, but every child process created by the Ginkgo machinery (in particular runAndStreamParallelGinkgoSuite) is passed that one RandomSeed value. I.e. all child processes would start with the same seed, and assuming their pattern of calls to the rand RNG is deterministic, would end up with the same RNG output.

⚠️ I didn’t test this.

Even without that, I think that every caller of rand.Seed should be modified to create its own rand.Reader with whatever properties it needs, and keep well away from assuming anything about the shared one. (That might be non-trivial, I can easily imagine that various callers of math/rand assume that the global RNG is non-deterministic just because some other package far away seeds it in a non-deterministic way, so removing all of those rand.Seed calls could break some completely unrelated code.)
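To make the "which one wins" question concrete, a tiny standalone demonstration (my example, not project code) that each rand.Seed call completely resets the shared global source, so whichever Seed runs last decides the stream every package sees:

package main

import (
    "fmt"
    "math/rand"
)

func main() {
    rand.Seed(1)
    a := rand.Int63()

    rand.Seed(2) // e.g. some other package re-seeding the global RNG in init()
    _ = rand.Int63()

    rand.Seed(1) // re-seeding with the original value replays the exact stream
    b := rand.Int63()

    fmt.Println(a == b) // true
}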

cevich (Member, Author) commented Nov 3, 2021

Oof! What a good find! And GinkgoRandomSeed() is particularly awful because IIRC there's even a CLI option to override it to be the same value every time 😭

so removing all of those rand.Seed calls could break some completely unrelated code.

Agreed, I was thinking this too. Would it be hard to wrap all the calls such that the underlying implementation only sets the seed exactly once and ignores all subsequent attempts?

Still, this is a great find and cleaning up all these blunders will undoubtedly improve our overall well-being 😁

cevich (Member, Author) commented Nov 3, 2021

Ref: ginkgo -seed int

The seed used to randomize the spec suite. (default 1635965228)

(I'm assuming this is GinkgoRandomSeed)

mtrmac (Collaborator) commented Nov 3, 2021

Run that again, and the default will change with UnixNano. But AFAICS all child processes of a parallel run share that value.


Would it be hard to wrap all the calls such that the underlying implementation only sets the seed exactly once and ignores all subsequent attempts?

We can’t reasonably patch math/rand in the standard library.

One obviously safe approach is to keep the current rand.Seed calls, but also create completely independent rand.Rand objects (via rand.New) and only use those everywhere.

And then, later, if we ever audit all of the rand callers (which we should do not just for sharing the state, but for handling collisions), we can remove those seed calls.


Actually there are only a very small number of direct math/rand callers in Podman (ignoring tests, only 1 in Podman proper, 1 in Buildah, 2 in c/storage), and many are in dependencies, somewhere deep enough that we probably won’t need to update them (but we should still read them). OTOH there are many more indirect callers, e.g. of utils like GenerateNonCryptoID.

rhatdan (Member) commented Nov 3, 2021

Should we remove the extra rand.Seed within kube.go?

mtrmac (Collaborator) commented Nov 3, 2021

Just removing it might break things.

rhatdan (Member) commented Nov 3, 2021

Well, we have CI tests for this, and as long as the Seed has been called in storage, we should be all set. Another option would be to create a storage function that seeds once, and replace this call with a call to the storage seed.

mtrmac (Collaborator) commented Nov 3, 2021

No, I’m saying that the kube.go code linked above explicitly says it needs a seed different from the deterministic one from the standard library. Given that, it makes no sense to just drop that initialization and have it silently rely on someone else initializing that same global RNG; instead, that code should pay its own way, and create its own RNG that is sufficiently initialized without affecting everyone else in the process. And so on for everyone.

cevich (Member, Author) commented Nov 4, 2021

Since we cannot control what vendor code does, it almost sounds as if we need to independently maintain some kind of local RNG state, recovering before and storing after every random-number request. Understandably that adds a LOT of complexity, but it seems like otherwise callers can never trust the state is non-deterministic 😞

cevich (Member, Author) commented Nov 4, 2021

On second thought...the golang community is pretty large...maybe somebody else has already solved this problem?

mtrmac (Collaborator) commented Nov 4, 2021

Since we cannot control what vendor code does, It almost sounds as if we need to independently maintain some kind of local RNG state

Yes

recovering before and storing after every random-number request.

that’s neither necessary nor quite possible: just keep a private rand.Rand object (as returned by rand.New) for each unique use case, like

var portRandomizer = rand.New(rand.NewSource(time.Now().UnixNano()))
(or, in the somewhat wasteful extreme, a single-use Reader like
var randBool = rand.New(rand.NewSource(time.Now().Unix())).Intn(2) == 0
)
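For the concurrent case, a sketch of that pattern with a lock (my example; unlike the package-level math/rand functions, a *rand.Rand returned by rand.New is not safe for concurrent use, so a use case shared across goroutines needs its own locking):

package main

import (
    "fmt"
    "math/rand"
    "sync"
    "time"
)

// lockedRand wraps a private *rand.Rand so one use case owns its own state
// without touching (or trusting) the global math/rand source.
type lockedRand struct {
    mu sync.Mutex
    r  *rand.Rand
}

func newLockedRand() *lockedRand {
    return &lockedRand{r: rand.New(rand.NewSource(time.Now().UnixNano()))}
}

func (l *lockedRand) Intn(n int) int {
    l.mu.Lock()
    defer l.mu.Unlock()
    return l.r.Intn(n)
}

// portRandomizer borrows the name from the example above, purely for flavor.
var portRandomizer = newLockedRand()

func main() {
    fmt.Println(5000 + portRandomizer.Intn(1000))
}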

cevich (Member, Author) commented Nov 4, 2021

Great. Though I still want us to take a step back and look at the tests more deeply to confirm we're simply not leaking data: your changes in #12168, when paired with the code that originally exposed the collisions (#11958) in my test PR, allow the remote tests to pass. However, searching the logs shows we still have collisions (based on the new log output). I searched for duplicate seeds and timestamps, but found none:

mtrmac (Collaborator) commented Nov 4, 2021

Ahh okay, so that explains why they're almost the same, but not quite. Hmmm. Well damn, I'm not sure where to go from here. Is it safe to argue this problem only (likely) affects our tests and not any of the other callers to GenerateNonCryptoID here (or in buildah or skopeo or ...)?

My fight with bin/ginkgo to produce a full unedited output of the log was unsuccessful (I’m sure there is a trick, but I don’t know what it is), but I did end up guessing this:

% go test -v -v ./test/e2e --ginkgo.seed=42 --ginkgo.parallel.node=1 --ginkgo.parallel.total=5
time="2021-11-04T21:24:04+01:00" level=warning msg="rand.Seed @ test/utils/utils.go:init 1636057444"
=== RUN   TestLibpod
Running Suite: Libpod Suite
===========================
Random Seed: 42
Parallel test node 1/5.

So that’s interesting:

  • We now have something indicative to the extent that test/utils/utils.go is probably the only RNG initialization that happens
  • … and that the code as written is intended to have the same value across all parallel nodes
  • but, that’s not what actually happens, because the seed call is in an init() section, apparently before the --ginkgo.seed flag was actually processed
  • What does happen, though, is that the RandomSeed value was initialized in one of the other places (I didn’t check which one); all the fallbacks are the same: time.Now().Unix(). That can be confirmed using date -d '@1636057444'.
  • So, now we know that the RNG for the test runners is initialized using a time source, with a one-second granularity. That is large enough that collisions are not guaranteed but highly likely.

I.e. I think the above is an explanation of how the tests fail; and #12168 is a good fix for that.


That is to say, the WORST possible thing is a pervasive problem and collisions could happen anywhere. We mostly just don't notice things like an image ID colliding with a container ID (nor do we care).

Another thing we now know, from the logs, is that the stringid’s initialization (using /dev/urandom?) is what matters for Podman runs (at least until that kube.go code is triggered). So this one-second granularity seed is only an issue for the tests, AFAICS, everyone else uses .UnixNano. I.e., the high-risk flakes are not a problem for Podman.

Still, I think it would be healthy for all Podman callers to GenerateNonCryptoID that don’t already do so, to check for conflicts and retry. (I can see 5 occurrences.)
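A sketch of that check-and-retry shape (hypothetical; generateUniqueID, the exists predicate, and maxTries are my names, not Podman API):

import (
    "fmt"

    "github.com/containers/storage/pkg/stringid"
)

// generateUniqueID retries until the caller-supplied predicate reports the ID
// as unused (e.g. not an existing container, image, or socket name).
func generateUniqueID(exists func(id string) bool) (string, error) {
    const maxTries = 16
    for i := 0; i < maxTries; i++ {
        if id := stringid.GenerateNonCryptoID(); !exists(id) {
            return id, nil
        }
    }
    return "", fmt.Errorf("could not generate a unique ID after %d attempts", maxTries)
}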

And, possibly, for all the shared users of math/rand that actually make specific requirements/assumptions, to maintain their own RNG state.

cevich (Member, Author) commented Nov 5, 2021

@mtrmac thanks so much for all the experimenting and deep analysis. So it sounds like we need two additional issues:

  1. Find/fix "all Podman callers to GenerateNonCryptoID...check for conflicts and retry."
  2. Implement local RNG state for "all the shared users of math/rand"

I'm happy to open those issues (assuming you agree it's a good idea). As for #12168, I'll make some comments in the diff so we can at least move that forward for now.

mtrmac (Collaborator) commented Nov 5, 2021

  • We now have something indicative to the extent that test/utils/utils.go is probably the only RNG initialization that happens

On second thought, that shouldn’t be possible; test/e2e depends on stringid, so we should have seen logs of the stringid initialization, in some order. So, until that is explained, the Unix() one-second granularity is a guess with no reliable evidence that it is the cause.

(I’m not sure I’ll be able to dig into that further this week.)


So it sounds like we need two additional issues:

  1. Find/fix "all Podman callers to GenerateNonCryptoID...check for conflicts and retry."
  2. Implement local RNG state for "all the shared users of math/rand"

I'm happy to open those issues (assuming you agree it's a good idea).

I think the above are a good idea regardless of what we find as the cause of the test socket collisions, but I wouldn’t argue they are the top priority.

cevich (Member, Author) commented Nov 8, 2021

we should have seen logs of the stringid initialization

Yeah, that's most definitely strange. Perhaps there's some clash with logrus initialization, and maybe a fmt.Printf() would work better?

cevich (Member, Author) commented Nov 8, 2021

As an experiment, I tried using fmt.Printf() in vendor/github.com/containers/storage/pkg/stringid/stringid.go init instead of logrus. When the podman command runs, the log line is printed immediately. However (despite the tests breaking horribly), I still don't see the expected output at the beginning when the ginkgo binary executes. So something deeper and more sinister is happening with Ginkgo.

In any case, once we remove all the extra debugging/logging, I think we should move forward with these changes, since they seem to address the primary concern (socket collisions).

cevich (Member, Author) commented Nov 17, 2021

@mtrmac any updates on this? What's needed to move forward?

mtrmac (Collaborator) commented Nov 17, 2021

  • We now have something indicative to the extent that test/utils/utils.go is probably the only RNG initialization that happens

On second thought, that shouldn’t be possible; test/e2e depends on stringid, so we should have seen logs of the stringid initialization, in some order.

Doh. go test -mod=vendor makes that show up:

% go test -mod=vendor -v -v ./test/e2e --ginkgo.seed=42 --ginkgo.parallel.node=1 --ginkgo.parallel.total=5
time="2021-11-17T22:25:41+01:00" level=warning msg="rand.Seed @ test/utils/utils.go:init 1637184341"
time="2021-11-17T22:25:41+01:00" level=warning msg="rand.Seed @ vendor/github.com/containers/storage/pkg/stringid/stringid.go:init 1747684635463093460"
=== RUN   TestLibpod

OTOH that disproves the test/utils/utils.go variant, and the one-second-granularity, as being the primary cause.

But then again, https://cs.opensource.google/go/go/+/refs/tags/go1.17.3:src/math/rand/rng.go;l=208-211;drc=refs%2Ftags%2Fgo1.17.3 ; and that’s actually documented in https://pkg.go.dev/math/rand#Seed :

Seed values that have the same remainder when divided by 2³¹-1 generate the same pseudo-random sequence.

despite the int64 type.
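That aliasing is easy to see with a tiny standalone check (my example; the seed value 12345 is arbitrary):

package main

import (
    "fmt"
    "math/rand"
)

func main() {
    const m = 1<<31 - 1 // the modulus math/rand applies to seeds
    a := rand.New(rand.NewSource(12345)).Int63()
    b := rand.New(rand.NewSource(12345 + m)).Int63()
    fmt.Println(a == b) // true: seeds congruent mod 2³¹-1 give the same stream
}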

So, *shrug*, I don’t know how the seeds end up being the same, but at a 2^-31 probability floor I’m willing to live with this just being bad luck or something.

mtrmac linked a pull request Nov 17, 2021 that will close this issue
cevich (Member, Author) commented Nov 18, 2021

rand.Seed @ test/utils/utils.go:init 1637184341"

IMHO, this is the really bad one. I think that ginkgo seed is the same value across all the threads. So that's just asking for a module-initialization-order change to give us a bad day. We should not use the ginkgo seed anywhere in our code.

mtrmac (Collaborator) commented Nov 19, 2021

I think that ginkgo seed is the same value across all the threads.

Per ginkgo -debug -dryRun -nodes=5, that’s indeed the case (or at least the 5 per-process logs all list the same seed).

So that's just asking for a module-initialization-order change to give us a bad day.

OTOH, I could argue that making collisions more likely would help make the code more robust against collisions, which we should do anyway. It’s almost tempting to add a universal BeforeEach(func() { rand.Seed(42) }). Almost — I’m not going to actually do that.

mtrmac (Collaborator) commented Nov 19, 2021

Where I think we’re at:

I’m personally a bit more worried about the unclear math/rand seeding/state, for which I think we now have fixes throughout. For collisions, the 256-bit ID space just might be good enough, now that the likelihood of seeding it from the crypto RNG is higher? (OTOH the size of the RNG seed is only 31 bits.)

cevich (Member, Author) commented Nov 19, 2021

Great work @mtrmac and thanks for the summary.

(All IMHO)

I think there are some very rare and special cases where deterministic RNGs are required or useful. One example is the randomized test order, which can be forced as needed by ginkgo ... -seed 42. Otherwise, I think having them in test or user code is a slippery slope. It's nearly always best that they're made non-deterministic, since that's the general developer expectation. If deterministic randoms are needed (esp. for user code), they also need very well commented and explicit usage so as not to "cloud the waters" of any non-deterministic usage nearby.

github-actions (bot) commented

A friendly reminder that this issue had no activity for 30 days.

rhatdan (Member) commented Dec 20, 2021

@mtrmac @cevich What is going on with this issue?

mtrmac (Collaborator) commented Dec 23, 2021

At the very least #12168 needs reviewing/merging, to fix the test flakes we encounter from time to time.

As for actually making the whole codebase robust against RNG collisions, #12155 (comment) overall seems pretty hard.

github-actions bot added the locked - please file new issue/PR label Sep 21, 2023
github-actions bot locked as resolved and limited conversation to collaborators Sep 21, 2023