Search before asking
I searched the issues and found no similar issues.
Ray Component
Ray Tune
What happened + What you expected to happen
Ray Tune crashes when running tune.run with pytorch_lightning. Apparently, the most recent version of pytorch_lightning.Trainer has a property called sanity_checking, but tune.integration.pytorch_lightning (line 177) is trying to access running_sanity_check.
Log:
(ImplicitFunc pid=1758) /databricks/python/lib/python3.8/site-packages/pytorch_lightning/trainer/data_loading.py:116: UserWarning: The dataloader, val_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 16 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
(ImplicitFunc pid=1758) rank_zero_warn(
(ImplicitFunc pid=1752) /databricks/python/lib/python3.8/site-packages/torch/nn/modules/conv.py:294: UserWarning: Using padding='same' with even kernel lengths and odd dilation may require a zero-padded copy of the input be created (Triggered internally at /pytorch/aten/src/ATen/native/Convolution.cpp:660.)
(ImplicitFunc pid=1752) return F.conv1d(input, weight, bias, self.stride,
(ImplicitFunc pid=1758) /databricks/python/lib/python3.8/site-packages/torch/nn/modules/conv.py:294: UserWarning: Using padding='same' with even kernel lengths and odd dilation may require a zero-padded copy of the input be created (Triggered internally at /pytorch/aten/src/ATen/native/Convolution.cpp:660.)
(ImplicitFunc pid=1758) return F.conv1d(input, weight, bias, self.stride,
<IPython.core.display.HTML object>
(ImplicitFunc pid=1752) 2021-12-10 00:24:40,007 ERROR function_runner.py:268 -- Runner Thread raised error.
(ImplicitFunc pid=1752) Traceback (most recent call last):
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/ray/tune/function_runner.py", line 262, in run
(ImplicitFunc pid=1752) self._entrypoint()
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/ray/tune/function_runner.py", line 330, in entrypoint
(ImplicitFunc pid=1752) return self._trainable_func(self.config, self._status_reporter,
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/ray/util/tracing/tracing_helper.py", line 451, in _resume_span
(ImplicitFunc pid=1752) return method(self, *_args, **_kwargs)
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/ray/tune/function_runner.py", line 597, in _trainable_func
(ImplicitFunc pid=1752) output = fn()
(ImplicitFunc pid=1752) File "<command-2948585277627227>", line 84, in train_cnn
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 737, in fit
(ImplicitFunc pid=1752) self._call_and_handle_interrupt(
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 682, in _call_and_handle_interrupt
(ImplicitFunc pid=1752) return trainer_fn(*args, **kwargs)
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 772, in _fit_impl
(ImplicitFunc pid=1752) self._run(model, ckpt_path=ckpt_path)
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1195, in _run
(ImplicitFunc pid=1752) self._dispatch()
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1275, in _dispatch
(ImplicitFunc pid=1752) self.training_type_plugin.start_training(self)
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 202, in start_training
(ImplicitFunc pid=1752) self._results = trainer.run_stage()
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1285, in run_stage
(ImplicitFunc pid=1752) return self._run_train()
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1307, in _run_train
(ImplicitFunc pid=1752) self._run_sanity_check(self.lightning_module)
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1371, in _run_sanity_check
(ImplicitFunc pid=1752) self._evaluation_loop.run()
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 151, in run
(ImplicitFunc pid=1752) output = self.on_run_end()
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 140, in on_run_end
(ImplicitFunc pid=1752) self._on_evaluation_end()
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 202, in _on_evaluation_end
(ImplicitFunc pid=1752) self.trainer.call_hook("on_validation_end", *args, **kwargs)
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1491, in call_hook
(ImplicitFunc pid=1752) callback_fx(*args, **kwargs)
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/pytorch_lightning/trainer/callback_hook.py", line 221, in on_validation_end
(ImplicitFunc pid=1752) callback.on_validation_end(self, self.lightning_module)
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/ray/tune/integration/pytorch_lightning.py", line 118, in on_validation_end
(ImplicitFunc pid=1752) self._handle(trainer, pl_module)
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/ray/tune/integration/pytorch_lightning.py", line 200, in _handle
(ImplicitFunc pid=1752) report_dict = self._get_report_dict(trainer, pl_module)
(ImplicitFunc pid=1752) File "/databricks/python/lib/python3.8/site-packages/ray/tune/integration/pytorch_lightning.py", line 177, in _get_report_dict
(ImplicitFunc pid=1752) if trainer.running_sanity_check:
(ImplicitFunc pid=1752) AttributeError: 'Trainer' object has no attribute 'running_sanity_check'
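The AttributeError at the end of the traceback comes from reading one fixed attribute name on the Trainer. A minimal sketch of a version-tolerant check follows, assuming the two names mentioned in this report (sanity_checking and running_sanity_check) are the only ones in play. The helper name is_sanity_checking is hypothetical and is not part of Ray or PyTorch Lightning. The stand-in objects exist only so the sketch runs without either library installed.

```python
from types import SimpleNamespace


def is_sanity_checking(trainer) -> bool:
    """Return True if the trainer reports that it is in its sanity check.

    Hypothetical helper: tries the newer attribute name first, then the
    older one, and falls back to False if neither attribute exists.
    """
    for name in ("sanity_checking", "running_sanity_check"):
        if hasattr(trainer, name):
            return bool(getattr(trainer, name))
    return False


# Stand-ins for Trainer objects (assumption: only the attribute name differs).
newer_trainer = SimpleNamespace(sanity_checking=True)
older_trainer = SimpleNamespace(running_sanity_check=True)
training_trainer = SimpleNamespace(sanity_checking=False)
unknown_trainer = SimpleNamespace()

for t in (newer_trainer, older_trainer, training_trainer, unknown_trainer):
    print(is_sanity_checking(t))
```

A check like this could stand in for the direct trainer.running_sanity_check access shown at line 177 of the traceback. Whether that is the right fix for the integration is for the maintainers to decide.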
gg-aking added the bug and triage labels on Dec 10, 2021.
Versions / Dependencies
pytorch-lightning==1.5.5
ray==1.9.0
torch==1.9.0+cpu
torchmetrics==0.6.1
torchvision==0.10.0
Reproduction script
Anything else
Occurs 9/10 times.
Are you willing to submit a PR?