[rllib] mountaincarcontinous-ddpg regression #5604

kifarid · 2019-08-31T11:04:54Z

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
Ray version: 0.7.3
Python version: 3.7.4
Exact command to reproduce: rllib train -f mountaincarcontinuous-ddpg.yaml
This is the tuned example found here:

project/ray/blob/747daff2cb73deae7b8a6755e70e550476c09d71/rllib/tuned_examples/mountaincarcontinuous-ddpg.yaml#L1

Describe the problem

Running the tuned example causes an error. I think the cause is in the target update frequency or in the handling of multi-step returns.

Error traceback

Traceback (most recent call last):
  File "/home/karimy/tensorflow/venv/lib/python3.5/site-packages/ray/tune/trial_runner.py", line 498, in _process_trial
    result = self.trial_executor.fetch_result(trial)
  File "/home/karimy/tensorflow/venv/lib/python3.5/site-packages/ray/tune/ray_trial_executor.py", line 347, in fetch_result
    result = ray.get(trial_future[0])
  File "/home/karimy/tensorflow/venv/lib/python3.5/site-packages/ray/worker.py", line 2332, in get
    raise value
ray.exceptions.RayTaskError: ray_DDPG:train() (pid=5200, host=karimy)
  File "/home/karimy/tensorflow/venv/lib/python3.5/site-packages/ray/rllib/agents/trainer.py", line 402, in train
    raise e
  File "/home/karimy/tensorflow/venv/lib/python3.5/site-packages/ray/rllib/agents/trainer.py", line 388, in train
    result = Trainable.train(self)
  File "/home/karimy/tensorflow/venv/lib/python3.5/site-packages/ray/tune/trainable.py", line 171, in train
    result = self._train()
  File "/home/karimy/tensorflow/venv/lib/python3.5/site-packages/ray/rllib/agents/trainer_template.py", line 126, in _train
    fetches = self.optimizer.step()
  File "/home/karimy/tensorflow/venv/lib/python3.5/site-packages/ray/rllib/optimizers/sync_replay_optimizer.py", line 123, in step
    batch = self.workers.local_worker().sample()
  File "/home/karimy/tensorflow/venv/lib/python3.5/site-packages/ray/rllib/evaluation/rollout_worker.py", line 467, in sample
    batches = [self.input_reader.next()]
  File "/home/karimy/tensorflow/venv/lib/python3.5/site-packages/ray/rllib/evaluation/sampler.py", line 56, in next
    batches = [self.get_data()]
  File "/home/karimy/tensorflow/venv/lib/python3.5/site-packages/ray/rllib/evaluation/sampler.py", line 99, in get_data
    item = next(self.rollout_provider)
  File "/home/karimy/tensorflow/venv/lib/python3.5/site-packages/ray/rllib/evaluation/sampler.py", line 340, in _env_runner
    base_env.send_actions(actions_to_send)
  File "/home/karimy/tensorflow/venv/lib/python3.5/site-packages/ray/rllib/env/base_env.py", line 332, in send_actions
    self.vector_env.vector_step(action_vector)
  File "/home/karimy/tensorflow/venv/lib/python3.5/site-packages/ray/rllib/env/vector_env.py", line 114, in vector_step
    r, type(r)))
ValueError: Reward should be finite scalar, got nan (<class 'float'>)
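
For readers unfamiliar with the final frame: the exception comes from a sanity check on the reward returned by the environment. The snippet below is only a rough sketch of that kind of check, not the actual code in vector_env.py. The helper name check_reward is hypothetical, and the real implementation may test the value differently.

```python
import math
import numbers


def check_reward(r):
    """Return r unchanged if it is a finite real scalar, else raise.

    Hypothetical helper for illustration; not part of RLlib.
    """
    # Reject non-numeric values (lists, None, strings) as well as NaN and +/-inf.
    if not isinstance(r, numbers.Real) or not math.isfinite(r):
        raise ValueError(
            "Reward should be finite scalar, got {} ({})".format(r, type(r)))
    return r
```

A NaN reward fails the math.isfinite test, which is why the message reports "got nan" with the type float.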

ericl · 2019-09-01T05:33:01Z

What TensorFlow version is this? It works on 1.14

kifarid · 2019-09-02T07:13:09Z

It's on 1.14, actually, but this error doesn't happen consistently.

ericl · 2019-09-02T08:17:01Z

Ok, this seems to be an issue on master as well.

ericl · 2019-09-03T03:34:35Z

This seems to be caused by the recent DDPG refactoring: #5242

ninafiona · 2020-01-27T22:42:12Z

I am getting the same error with the pendulum-v0 example. Does this bug still persist?

devinbarry · 2020-03-08T19:10:27Z

Getting this same issue with IMPALA on MountainCarContinuous-v0

ericl · 2020-03-08T19:12:28Z

Please file a new bug with a reproduction script instead of commenting on old issues. There are many root causes that can lead to the same error message.
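
When putting together such a reproduction script, it can help to narrow down whether the NaN first appears in the action produced by the policy or in the reward computed by the environment. Below is a minimal sketch of a wrapper that checks both. FiniteCheckWrapper and ToyEnv are hypothetical names used only for illustration, and the wrapper assumes the old-style Gym interface in which step(action) returns (obs, reward, done, info). ToyEnv is a stand-in whose reward depends on the squared action, so a NaN action would turn into a NaN reward if it were not caught first.

```python
import math


class FiniteCheckWrapper:
    """Wraps any object exposing reset() and step(action).

    Hypothetical debugging helper; not part of RLlib or Gym.
    """

    def __init__(self, env):
        self.env = env
        self.steps = 0

    def reset(self):
        self.steps = 0
        return self.env.reset()

    def step(self, action):
        self.steps += 1
        # Accept both scalar actions and sequences of action components.
        try:
            values = list(action)
        except TypeError:
            values = [action]
        # Check the action first, so a NaN coming from the policy is
        # reported as such instead of surfacing later as a NaN reward.
        if not all(math.isfinite(float(a)) for a in values):
            raise ValueError(
                "Non-finite action {!r} at step {}".format(action, self.steps))
        obs, reward, done, info = self.env.step(action)
        if not math.isfinite(float(reward)):
            raise ValueError(
                "Non-finite reward {!r} at step {}".format(reward, self.steps))
        return obs, reward, done, info


class ToyEnv:
    """Stand-in environment whose reward penalizes the squared action."""

    def reset(self):
        return [0.0]

    def step(self, action):
        return [0.0], -0.1 * action[0] ** 2, False, {}


env = FiniteCheckWrapper(ToyEnv())
env.reset()
obs, reward, done, info = env.step([0.5])  # finite action, passes both checks
```

If the wrapper reports a non-finite action, the problem is more likely on the policy or training side than in the environment itself.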

ericl changed the title ~~running tuned examples~~ [rllib] mountaincarcontinous-ddpg regression Sep 2, 2019

ericl added the regression label Sep 2, 2019

ericl self-assigned this Sep 2, 2019

ericl mentioned this issue Sep 3, 2019

[rllib] Revert [rllib] Port DDPG to the build_tf_policy pattern #5626

Merged

pcmoritz closed this as completed in #5626 Sep 5, 2019

ericl reopened this Sep 5, 2019

ericl closed this as completed Sep 5, 2019

richardliaw added the rllib label Mar 25, 2020

devinbarry mentioned this issue Apr 1, 2020

[rllib] ValueError: Reward should be finite scalar, got nan (<class 'float'>) #7853

Closed

2 tasks
