Thanks for your effort.
When I try to run on an AMD Radeon GPU, I get the following error:
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
It seems that it expects an NVIDIA GPU only; if I remove the --gpu 0 option, I can run it on the CPU.
I ran the following command on Ubuntu 20:
$ python3 main.py --task grasping_classification_cnum10_dist0_skew0_seed0 --optimizer Adam --num_epochs 10 --algorithm centralize --model cnn --pre squeezenet1_0 --fields "cr_" --learning_rate 0.001 --batch_size 16 --gpu 0 --lr_scheduler 0 --logger simple_logger
The full output is below:
2023-03-31 15:33:36,200 fflow.py initialize [line:94] INFO Using Logger in algorithm.centralize
2023-03-31 15:33:36,200 fflow.py initialize [line:95] INFO Initializing fedtask: grasping_classification_cnum10_dist0_skew0_seed0
fields: ['tactileColorR']
fields: ['tactileColorR']
origin_class <class 'benchmark.grasping_classification.core.Grasping'>
origin_train_data <benchmark.grasping_classification.core.Grasping object at 0x7f78d1a11690>
origin_test_data <benchmark.grasping_classification.core.Grasping object at 0x7f78bfe026b0>
2023-03-31 15:33:39,320 fflow.py initialize [line:109] INFO Using model cnn in benchmark.grasping_classification.model.cnn as the globally shared model.
2023-03-31 15:33:39,321 fflow.py initialize [line:123] INFO No server-specific model is used.
2023-03-31 15:33:39,321 fflow.py initialize [line:135] INFO No client-specific model is used.
2023-03-31 15:33:39,321 fflow.py initialize [line:141] INFO Initializing devices: cuda:2 will be used for this running.
/home/omar/.local/lib/python3.10/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
/home/omar/.local/lib/python3.10/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=SqueezeNet1_0_Weights.IMAGENET1K_V1. You can also use weights=SqueezeNet1_0_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
Traceback (most recent call last):
File "/home/omar/fl/easyFL/main.py", line 22, in <module>
main()
File "/home/omar/fl/easyFL/main.py", line 10, in main
server = flw.initialize(option)
File "/home/omar/fl/easyFL/utils/fflow.py", line 146, in initialize
model = utils.fmodule.Model(fields=option['fields'], model_name=option['pre']).to(utils.fmodule.dev_list[0])
File "/home/omar/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1145, in to
return self._apply(convert)
File "/home/omar/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/omar/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/omar/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/omar/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 820, in _apply
param_applied = fn(param)
File "/home/omar/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1143, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
File "/home/omar/.local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 247, in _lazy_init
torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
This error seems to come from model.to(torch.device(...)), where model is an instance of torch.nn.Module. I think it may be related to the versions of your GPU drivers and of torch. I cannot reproduce the error because I don't have an AMD GPU. In the same environment, can a model created in a separate file, independent of this project, be moved to the GPU without error?
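A standalone check along those lines could look like the sketch below. It assumes PyTorch may or may not be installed and uses a small torch.nn.Linear layer as a stand-in for the real model. The helper name resolve_device is hypothetical and not part of easyFL; it falls back to the CPU when no usable GPU is reported, instead of raising the error above.

```python
def resolve_device(gpu_index, cuda_available, device_count):
    """Return a device string, falling back to CPU when the GPU is unusable.

    Hypothetical helper for illustration; not part of easyFL.
    """
    if gpu_index is None or not cuda_available:
        return "cpu"
    if gpu_index < 0 or gpu_index >= device_count:
        return "cpu"
    return f"cuda:{gpu_index}"


def main():
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
        return
    # is_available() is False when this torch build cannot reach a GPU.
    available = torch.cuda.is_available()
    count = torch.cuda.device_count() if available else 0
    device = torch.device(resolve_device(0, available, count))
    # A tiny stand-in model, moved to the chosen device.
    model = torch.nn.Linear(4, 2).to(device)
    print("model parameters live on:", next(model.parameters()).device)


if __name__ == "__main__":
    main()
```

If this prints cpu on a machine with an AMD GPU, the installed torch build is not seeing the card, which would point at the environment rather than at this project.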
Thanks for your reply.
Yes. I created an independent file and ran it successfully on Google Colab. Now I need to run it on my physical Linux (Ubuntu) machine, which has multiple GPUs, to speed up training.
I also ran the test command from the easyFL README, and it executed successfully on my physical machine.
Both torch and the GPU drivers are up to date, so I am still trying to figure out the cause.
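One thing that may be worth checking here is which kind of PyTorch build is installed. A build compiled for CUDA cannot drive an AMD card; AMD GPUs need a ROCm build, which still exposes the devices under the cuda device name. The sketch below is only a diagnostic under that assumption: describe_build is a hypothetical helper, and the script reads torch.version.cuda and torch.version.hip if PyTorch is present.

```python
def describe_build(cuda_version, hip_version):
    """Classify a PyTorch build from its reported CUDA/HIP versions.

    Hypothetical helper for illustration; not part of easyFL or PyTorch.
    """
    if hip_version:
        return f"ROCm build (HIP {hip_version})"
    if cuda_version:
        return f"CUDA build (CUDA {cuda_version})"
    return "CPU-only build"


def main():
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
        return
    # torch.version.hip is None on non-ROCm builds; getattr guards older releases.
    hip = getattr(torch.version, "hip", None)
    print("torch", torch.__version__, "->", describe_build(torch.version.cuda, hip))
    print("GPU visible to torch:", torch.cuda.is_available())


if __name__ == "__main__":
    main()
```

A result of "CUDA build" on a machine that has only AMD GPUs would be consistent with the "Found no NVIDIA driver" error in the traceback.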