The first selfplay worker uses the same seed for all parallel environments #27

rPortelas opened this issue May 25, 2022 · 2 comments

I might have found an unexpected behavior in how parallel training environments are being seeded.

I am referring to this line:

envs = [self.config.new_game(self.config.seed + self.rank * i) for i in range(env_nums)]

Because the rank of the first selfplay worker is 0, self.config.seed + self.rank * i reduces to self.config.seed for every i, so all of that worker's parallel environments are initialized with the same seed, which might reduce training data diversity.
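For illustration (with a hypothetical config.seed of 0 and env_nums of 4), the seeds produced for the rank-0 worker collapse to a single value:

rank, seed, env_nums = 0, 0, 4                  # hypothetical values, just for illustration
[seed + rank * i for i in range(env_nums)]
# -> [0, 0, 0, 0]  every environment receives the same seed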

We could go for a simple fix like replacing self.rank with (self.rank + 1); however, this is still problematic when running multiple workers, as their seed sequences will overlap anyway (see the example below).
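To make the overlap concrete (same hypothetical values as above, seed = 0 and env_nums = 4), two workers using (self.rank + 1) would produce:

seed, env_nums = 0, 4                             # hypothetical values, just for illustration
[seed + (0 + 1) * i for i in range(env_nums)]     # worker with rank 0 -> [0, 1, 2, 3]
[seed + (1 + 1) * i for i in range(env_nums)]     # worker with rank 1 -> [0, 2, 4, 6], overlaps at 0 and 2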

A good option might be to sample a seed for each parallel environment using numpy (which is seeded before launching data workers). For instance:

envs = [self.config.new_game(np.random.randint(10**9)) for i in range(env_nums)]

@jamesliu

Ditto, but using randint may make runs irreproducible.

@rPortelas (Author) commented May 25, 2022

Hmm, right. Thanks for the input.

Then we could use a dedicated random state created from the original seed:

rnd_state = np.random.RandomState(self.config.seed + self.rank)
envs = [self.config.new_game(rnd_state.randint(10**9)) for _ in range(env_nums)]
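For context, here is a minimal sketch of how that could sit inside a worker's setup. The class and its constructor arguments are hypothetical; only self.rank, self.config.seed and self.config.new_game mirror the snippets above:

import numpy as np

class SelfPlayWorker:                     # hypothetical wrapper, only to show the seeding pattern
    def __init__(self, rank, config, env_nums):
        self.rank = rank
        self.config = config
        # One RandomState per worker, derived from the global seed and the worker rank,
        # so each worker draws a distinct but reproducible sequence of environment seeds.
        rnd_state = np.random.RandomState(self.config.seed + self.rank)
        env_seeds = [rnd_state.randint(10**9) for _ in range(env_nums)]
        self.envs = [self.config.new_game(s) for s in env_seeds]

Re-running with the same config.seed reproduces exactly the same environment seeds, and workers with different ranks draw from different streams, so this addresses both the diversity and the reproducibility concerns (collisions across workers are still possible with randint, just unlikely).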
