update pytorch DLC version to 1.11
The notebook fails with the current PyTorch 1.8 container. I think it's a problem with the torchvision version installed in the container.

```
AlgorithmError: ExecuteUserScriptError:
Command "/opt/conda/bin/python3.6 mnist.py --backend gloo --epochs 1"
INFO:__main__:Initialized the distributed environment: 'gloo' backend on 2 nodes. Current host rank is 0. Number of gpus: 0
INFO:__main__:Get train data loader
Traceback (most recent call last):
  File "mnist.py", line 257, in <module>
    train(parser.parse_args())
  File "mnist.py", line 114, in train
    train_loader = _get_train_data_loader(args.batch_size, args.data_dir, is_distributed, **kwargs)
  File "mnist.py", line 48, in _get_train_data_loader
    [transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))]
  File "/opt/conda/lib/python3.6/site-packages/torchvision/datasets/mnist.py", line 83, in __init__
    ' You can use download=True to download it')
RuntimeError: Dataset not found. You can use download=True to download it, exit code: 1
```
surajkota authored Aug 19, 2022
1 parent 00f007c commit d3abc6a
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions sagemaker-python-sdk/pytorch_mnist/pytorch_mnist.ipynb
@@ -204,8 +204,8 @@
 "\n",
 "estimator = PyTorch(entry_point='mnist.py',\n",
 " role=role,\n",
-" py_version='py3',\n",
-" framework_version='1.8.0',\n",
+" py_version='py38',\n",
+" framework_version='1.11.0',\n",
 " instance_count=2,\n",
 " instance_type='ml.c5.2xlarge',\n",
 " hyperparameters={\n",
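
For reading the change outside the notebook JSON, here is a minimal sketch of the estimator configuration as it stands after this commit. The entry point, `py_version`, `framework_version`, and instance settings are taken from the diff. The `hyperparameters` values are an assumption inferred from the `--backend gloo --epochs 1` command in the error log, since the diff cuts off before that dict is shown. The helper names `estimator_kwargs` and `make_estimator` are hypothetical and do not appear in the notebook.

```python
def estimator_kwargs(role):
    """Keyword arguments for the PyTorch estimator after the version bump."""
    return {
        "entry_point": "mnist.py",
        "role": role,
        # Changed in this commit: 'py3' -> 'py38' and '1.8.0' -> '1.11.0'.
        "py_version": "py38",
        "framework_version": "1.11.0",
        "instance_count": 2,
        "instance_type": "ml.c5.2xlarge",
        # Assumption: values inferred from the command line in the error log;
        # the diff truncates the actual hyperparameters dict.
        "hyperparameters": {"epochs": 1, "backend": "gloo"},
    }


def make_estimator(role):
    """Build the estimator; requires the SageMaker Python SDK."""
    # Imported lazily so the kwargs can be inspected without the SDK installed.
    from sagemaker.pytorch import PyTorch

    return PyTorch(**estimator_kwargs(role))
```

Keeping the arguments in a plain dict makes the two changed values easy to check before any training job is launched. The data channel passed to `fit` is not part of this diff and is left out here.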
