fix the bug that _DataLoaderIterMultiProcess use time to generate the seed #43318
✅ This PR's description meets the template requirements!
Your PR was submitted successfully. Thank you for contributing to this open-source project!
LGTM
LGTM
LGTM
… seed (PaddlePaddle#43318)
* fix the bug that _DataLoaderIterMultiProcess use time to generate the seed
* use np.random.randint to generate a base seed
PR types
Bug fixes
PR changes
APIs
Describe
Fix the bug that _DataLoaderIterMultiProcess uses the system time to generate the seed.
Background
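Since #33310, the dataloader has used the system time to generate the random seed when reading data with multiple processes. That PR was meant to keep every process, and every epoch, from producing the same random numbers.
However, seeding from the system time and resetting numpy's random seed means that training results cannot be reproduced even after the numpy, random, and paddle seeds are fixed in the model. The system time differs on every training launch, and the original seed has already been reset by the dataloader.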
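Effect of this PR
In multi-process data reading, this PR generates base_seed with randint instead of deriving it from the system time. This still addresses the problem raised in #33310. In addition, once the seeds are fixed, repeated runs produce stable, reproducible output for the same data after it passes through the dataloader and preprocessing (random cropping, flipping, and other operations that involve randomness).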
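The difference between the two seeding strategies can be illustrated with a minimal sketch. This is not the code from the PR: the helper names (`base_seed_from_time`, `base_seed_from_rng`, `worker_seeds`, `simulate_run`) and the `base_seed + worker_id` derivation are hypothetical, chosen only to show why a base seed drawn from `np.random.randint` follows a user-fixed numpy seed while a time-based one does not.

```python
import time

import numpy as np


def base_seed_from_time():
    # Time-based seeding (the behaviour being replaced): the value depends on
    # the wall clock, so it changes on every launch regardless of np.random.seed.
    return int(time.time() * 1000) & 0xFFFFFFFF


def base_seed_from_rng():
    # RNG-based seeding: the value is drawn from numpy's global generator,
    # so it is fully determined by a prior np.random.seed(...) call.
    return int(np.random.randint(low=0, high=2**31 - 1))


def worker_seeds(base_seed, num_workers):
    # Hypothetical per-worker derivation: offset the base seed by the worker id
    # so that each worker process gets a different seed.
    return [base_seed + worker_id for worker_id in range(num_workers)]


def simulate_run(user_seed, num_workers=2, epochs=2):
    # Simulates one training launch: the user fixes the seed once, then a new
    # base seed is drawn at the start of every epoch.
    np.random.seed(user_seed)
    history = []
    for _ in range(epochs):
        history.append(worker_seeds(base_seed_from_rng(), num_workers))
    return history


if __name__ == "__main__":
    # Two launches with the same user seed yield identical worker seeds.
    print(simulate_run(42) == simulate_run(42))
```

Because each epoch draws a fresh value from the generator, workers and epochs still receive different seeds, while the whole sequence stays tied to the seed the user fixed.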
Output after the fix
Output before the fix