feature(xjx): new style dist version, add storage loader and model loader #425
Codecov Report
@@            Coverage Diff             @@
##             dev-dist     #425   +/-   ##
===========================================
  Coverage            ?   85.50%
===========================================
  Files               ?      542
  Lines               ?    43009
  Branches            ?        0
===========================================
  Hits                ?    36776
  Misses              ?     6233
  Partials            ?        0
Flags with carried forward coverage won't be shown.
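As a quick sanity check on the figures in the report, the short sketch below recomputes the coverage percentage from the hit and miss counts. It rests on two assumptions that the report itself does not state: that every tracked line is counted as exactly one of hit, missed, or partial, and that the percentage is hits divided by lines, truncated (not rounded) to two decimals. The variable names are illustrative only.

```python
# Figures copied from the Codecov report for #425.
hits, misses, partials = 36776, 6233, 0
lines = 43009

# Assumption: every tracked line is exactly one of hit, missed, or partial.
assert hits + misses + partials == lines

# Assumption: coverage = hits / lines, truncated to two decimal places.
ratio = hits / lines
coverage_pct = int(ratio * 10000) / 100

print(f"{coverage_pct:.2f}%")  # prints 85.50%
```

Under plain rounding the same ratio would display as 85.51%, so the reported 85.50% is consistent with truncation.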
* demo(nyz): add naive dp demo
* demo(nyz): add naive ddp demo
* feature(nyz): add naive tb_logger in new evaluator
* Add singleton log writer
* Use get_instance on writer
* feature(nyz): add general logger middleware
* feature(nyz): add soft update in DQN target network
* fix(nyz): fix termination env_step bug and eval task.finish broadcast bug
* Support distributed dqn
* Add more desc (ci skip)
* Support distributed dqn Add more desc (ci skip) Add timeout on model exchanger
* feature(nyz): add online logger freq
* fix(nyz): fix policy set device bug
* add offline rl logger
* change a bit
* add else in checking ctx type
* add test_logger.py
* add mock of offline_logger
* add mock of online writer
* reformat
* reformat
* feature(nyz): polish atari ddp demo and add dist demo
* fix(nyz): fix mq listen bug when stop
* demo(nyz): add atari ppo(sm+ddp) demo
* demo(nyz): add ppo ddp avgsplit demo
* demo(nyz): add ditask + pytorch ddp demo
* fix(nyz): fix dict-type obs bugs
* fix(nyz): fix get_shape0 bug when nested structure
* Route finish event to all processes in the cluster
* demo(nyz): add naive dp demo
* demo(nyz): add naive ddp demo
* feature(nyz): add naive tb_logger in new evaluator
* feature(nyz): add soft update in DQN target network
* fix(nyz): fix termination env_step bug and eval task.finish broadcast bug
* Add singleton log writer
* Use get_instance on writer
* feature(nyz): add general logger middleware
* Support distributed dqn
* Add more desc (ci skip)
* Support distributed dqn Add more desc (ci skip) Add timeout on model exchanger
* feature(nyz): add online logger freq
* fix(nyz): fix policy set device bug
* add offline rl logger
* change a bit
* add else in checking ctx type
* add test_logger.py
* add mock of offline_logger
* add mock of online writer
* reformat
* reformat
* feature(nyz): polish atari ddp demo and add dist demo
* fix(nyz): fix mq listen bug when stop
* demo(nyz): add atari ppo(sm+ddp) demo
* demo(nyz): add ppo ddp avgsplit demo
* demo(nyz): add ditask + pytorch ddp demo
* fix(nyz): fix dict-type obs bugs
* fix(nyz): fix get_shape0 bug when nested structure
* Route finish event to all processes in the cluster
* refactor(nyz): split dist ddp demo implementation
* feature(nyz): add rdma test demo(ci skip)
* feature(xjx): new style dist version, add storage loader and model loader (#425)
* Add singleton log writer
* Use get_instance on writer
* feature(nyz): polish atari ddp demo and add dist demo
* Refactor dist version
* Wrap class based middleware
* Change if condition in wrapper
* Only run enhancer on learner
* Support new parallel mode on slurm cluster
* Temp data loader
* Stash commit
* Init data serializer
* Update dump part of code
* Test StorageLoader
* Turn data serializer into storage loader, add storage loader in context exchanger
* Add local id and startup interval
* Fix storage loader
* Support treetensor
* Add role on event name in context exchanger, use share_memory function on tensor
* Double size buffer
* Copy tensor to cpu, skip wait for context on collector and evaluator
* Remove data loader middleware
* Upgrade k8s parser
* Add epoch timer
* Dont use lb
* Change tensor to numpy
* Remove files when stop storage loader
* Discard shared object
* Ensure correct load shm memory
* Add model loader
* Rename model_exchanger to ModelExchanger
* Add model loader benchmark
* Shutdown loaders when task finish
* Upgrade supervisor
* Dont cleanup files when shutting down
* Fix async cleanup in model loader
* Check model loader on dqn
* Dont use loader in dqn example
* Fix style check
* Fix dp
* Fix github tests
* Skip github ci
* Fix bug in event loop
* Fix enhancer tests, move router from start to __init__
* Change default ttl
* Add comments
  Co-authored-by: niuyazhe <[email protected]>
* style(nyz): correct yapf style
* fix(nyz): fix ctx and logger compatibility bugs
* polish(nyz): update demo from cartpole v0 to v1
* fix(nyz): fix evaluator condition bug
* style(nyz): correct flake8 style
* demo(nyz): move back to CartPole-v0
* fix(nyz): fix context manager env step merge bug(ci skip)
* fix(nyz): fix context manager env step merge bug(ci skip)
* fix(nyz): fix flake8 style

Co-authored-by: Xu Jingxin <[email protected]>
Co-authored-by: zhumengshen <[email protected]>
Description
Related Issue
TODO
Check List