fixes import from rllib #7
base: master
Conversation
Hey, when I try running this PR to reproduce the results of the paper, I still get related issues, error log below. Given that
|
Ahh sorry, you also need to change:
I've updated the PR but can't test this right now unfortunately, so let me know if there are still problems. I was trying to help out with this issue: ray-project/ray#6476 |
I got to this PR from having the same error as in that bug, actually. Thanks for the help with this too. There are still errors with that fix:
|
tfp is TensorFlow Probability, which I think you just need to pip install: https://www.tensorflow.org/probability Hopefully that works. Also, if you are able to benchmark this code, it would be great. This project is unrelated to me, but in my brief tests I experienced some training issues when using TensorFlow 2.0. |
Thank you. I'm using TensorFlow v1.14 at the moment, because Stable Baselines is not compatible with TF 2. I get the exact same error after installing TensorFlow Probability. Here are all my packages for reference:
|
@justinkterry have you tried installing tensorflow-probability==0.7.0, i.e. tensorflow-probability version 0.7.0? |
That fixed it. That needs to be included in the documentation somewhere.
|
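For reference, here is a minimal sanity check (not from the thread itself) that the pinned install actually provides the API the failing code needs; it assumes the combination reported above, TensorFlow 1.14 with tensorflow-probability==0.7.0:

```python
# Hedged sanity check: verify that tensorflow_probability imports and exposes the
# distribution that RLlib's contrib MADDPG policy tries to build (see the traceback
# later in this thread). Assumes TF 1.14 and tensorflow-probability 0.7.0, as reported.
import tensorflow as tf
import tensorflow_probability as tfp

print("tensorflow:", tf.__version__)               # expected: 1.14.x
print("tensorflow-probability:", tfp.__version__)  # expected: 0.7.0

# The error below fails on tfp.distributions, so this call should now succeed:
dist = tfp.distributions.RelaxedOneHotCategorical(temperature=0.5, logits=[0.0, 0.0])
print(type(dist).__name__)  # RelaxedOneHotCategorical
```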
@dragon28 the real world com scenario isn't one of the currently supported MPE games (see ~line 34 of run_maddpg.py, where it parses the --scenario arg). As for the memory error, it's an issue with Ray configuration rather than with this code: you have to edit how much RAM workers etc. are allowed to use in Ray (google it). Ray infers defaults from the system, so my 64GB RAM workstation can run it fine without changes, but my 16GB RAM laptop can't. |
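The following is an illustrative sketch only, not the repo's actual code: it shows the kind of --scenario argument parsing the comment above refers to, where only recognized MPE scenario names are accepted. The scenario list and default here are hypothetical:

```python
# Hypothetical sketch of --scenario parsing (names are illustrative, not the actual list
# in run_maddpg.py): restricting choices makes an unsupported scenario fail at argument
# parsing instead of later during training.
import argparse

SUPPORTED_MPE_SCENARIOS = ["simple", "simple_spread", "simple_adversary"]  # illustrative subset

parser = argparse.ArgumentParser()
parser.add_argument(
    "--scenario",
    type=str,
    default="simple",
    choices=SUPPORTED_MPE_SCENARIOS,
    help="Which MPE scenario to train on.",
)

args = parser.parse_args(["--scenario", "simple_spread"])
print(args.scenario)
```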
This default config on my workstation works, for reference:
|
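The config itself isn't shown above; as a rough, hypothetical example of the kind of Ray memory limits being described (kwarg names from the Ray 0.8-era ray.init; check the docs for your version):

```python
# Hypothetical example only -- not the poster's actual config, which is not shown above.
# On a smaller machine (e.g. 16 GB RAM), cap Ray's memory usage explicitly rather than
# letting it infer limits from total system memory.
import ray

ray.init(
    memory=4 * 1024**3,               # bytes available to Ray worker heaps (Ray 0.8-era kwarg)
    object_store_memory=2 * 1024**3,  # bytes reserved for the shared object store
)
```

Reducing the number of rollout workers in the trainer config is another way to shrink the footprint.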
Hey, I'm having the exact same memory issue as you, very reproducibly. There appears to be a leak related to the number of workers? I'll update this when I figure out what's going on more exactly. |
Hey all, this repo isn't being maintained, so I forked it, made these fixes, and the maintainers of RLlib/Ray made my fork the new official example repo for the MADDPG example code. See ray-project/ray#6831 and https://github.com/justinkterry/maddpg-rllib. |
hey, @justinkterry I got the same error as in that bug, actually. Thanks for the help with this too. There are still errors with that fix: 2020-06-04 15:16:12,567 ERROR trial_runner.py:519 -- Trial MADDPG_mpe_420fc_00000: Error processing event. |
You have to use my fork
…On Thu, Jun 4, 2020 at 3:17 AM Guoth wrote:
```
2020-06-04 15:16:12,567 ERROR trial_runner.py:519 -- Trial MADDPG_mpe_420fc_00000: Error processing event.
Traceback (most recent call last):
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 467, in _process_trial
    result = self.trial_executor.fetch_result(trial)
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 430, in fetch_result
    result = ray.get(trial_future[0], DEFAULT_GET_TIMEOUT)
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/worker.py", line 1474, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::MADDPG.train() (pid=148180, ip=192.168.1.106)
  File "python/ray/_raylet.pyx", line 407, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 442, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 445, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 446, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 400, in ray._raylet.execute_task.function_executor
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/rllib/agents/trainer_template.py", line 90, in __init__
    Trainer.__init__(self, config, env, logger_creator)
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 452, in __init__
    super().__init__(config, logger_creator)
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/tune/trainable.py", line 174, in __init__
    self._setup(copy.deepcopy(self.config))
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 627, in _setup
    self._init(self.config, self.env_creator)
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/rllib/agents/trainer_template.py", line 115, in _init
    self.config["num_workers"])
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 700, in _make_workers
    logdir=self.logdir)
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/rllib/evaluation/worker_set.py", line 59, in __init__
    RolloutWorker, env_creator, policy, 0, self._local_config)
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/rllib/evaluation/worker_set.py", line 282, in _make_worker
    extra_python_environs=extra_python_environs)
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/rllib/evaluation/rollout_worker.py", line 378, in __init__
    self._build_policy_map(policy_dict, policy_config)
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/rllib/evaluation/rollout_worker.py", line 930, in _build_policy_map
    policy_map[name] = cls(obs_space, act_space, merged_conf)
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/rllib/contrib/maddpg/maddpg_policy.py", line 155, in __init__
    scope="actor"))
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/rllib/contrib/maddpg/maddpg_policy.py", line 371, in _build_actor_network
    sampler = tfp.distributions.RelaxedOneHotCategorical(
AttributeError: 'NoneType' object has no attribute 'distributions'

Traceback (most recent call last):
  File "run_maddpg.py", line 183, in <module>
    main(args)
  File "run_maddpg.py", line 178, in main
    }, verbose=0)
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/tune/tune.py", line 411, in run_experiments
    return_trials=True)
  File "/home/aics/anaconda3/envs/tf/lib/python3.6/site-packages/ray/tune/tune.py", line 347, in run
    raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [MADDPG_mpe_420fc_00000])
```
|
maddpg is currently in rllib.contrib rather than rllib.agents
see ray-project/ray#6476
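A minimal sketch of the import change this PR is about, assuming the trainer class is exported as MADDPGTrainer from the contrib package (consistent with the ray/rllib/contrib/maddpg/ paths in the traceback above); verify the exact name against your Ray version:

```python
# The MADDPG implementation lives under ray.rllib.contrib, not ray.rllib.agents,
# in the Ray versions discussed in this thread.
# Assumption: the trainer class is exported as MADDPGTrainer; check your Ray release.
from ray.rllib.contrib.maddpg import MADDPGTrainer

# instead of something like:
# from ray.rllib.agents.maddpg import MADDPGTrainer  # no such module in these versions
```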