I'm trying to run Tutorial 7 on my computer. However, the program hangs before entering the run_encrypted_training function and eventually fails with a Gloo timeout after 1800000 ms (30 minutes).
Can anyone help me solve this issue?
Thanks in advance!
The error message:
% CUDA_VISIBLE_DEVICES= python3 Tutorial_7_Training_an_Encrypted_Neural_Network.py
2024-10-23 19:59:00.609660: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-10-23 19:59:00.620031: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-10-23 19:59:00.630573: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-10-23 19:59:00.633892: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-10-23 19:59:00.642926: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-10-23 19:59:01.092705: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/home/whcjimmy/miniconda3/lib/python3.12/site-packages/crypten/__init__.py:334: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
result = load_closure(f, **kwargs)
/home/whcjimmy/miniconda3/lib/python3.12/site-packages/crypten/nn/onnx_converter.py:176: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:206.)
param = torch.from_numpy(numpy_helper.to_array(node))
Epoch: 0 Loss: 0.5381
/home/whcjimmy/workspace/MPCLIBS/CrypTen/tutorials/Tutorial_7_Training_an_Encrypted_Neural_Network.py:159: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
labels = torch.load('/tmp/train_labels.pth')
Process Process-2:
Traceback (most recent call last):
  File "/home/whcjimmy/miniconda3/lib/python3.12/site-packages/torch/distributed/c10d_logger.py", line 83, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/whcjimmy/miniconda3/lib/python3.12/site-packages/torch/distributed/distributed_c10d.py", line 2425, in broadcast
    work.wait()
RuntimeError: [../third_party/gloo/gloo/transport/tcp/unbound_buffer.cc:81] Timed out waiting 1800000ms for recv operation to complete

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/whcjimmy/miniconda3/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/whcjimmy/miniconda3/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/whcjimmy/miniconda3/lib/python3.12/site-packages/crypten/mpc/context.py", line 30, in _launch
    return_value = func(*func_args, **func_kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whcjimmy/workspace/MPCLIBS/CrypTen/tutorials/Tutorial_7_Training_an_Encrypted_Neural_Network.py", line 168, in run_encrypted_training
  File "/home/whcjimmy/miniconda3/lib/python3.12/site-packages/crypten/__init__.py", line 353, in load_from_party
    result = comm.get().broadcast_obj(None, src)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whcjimmy/miniconda3/lib/python3.12/site-packages/crypten/communicator/communicator.py", line 234, in logging_wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whcjimmy/miniconda3/lib/python3.12/site-packages/crypten/communicator/distributed_communicator.py", line 318, in broadcast_obj
    dist.broadcast(size, src, group=group)
  File "/home/whcjimmy/miniconda3/lib/python3.12/site-packages/torch/distributed/c10d_logger.py", line 85, in wrapper
    msg_dict = _get_msg_dict(func.__name__, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whcjimmy/miniconda3/lib/python3.12/site-packages/torch/distributed/c10d_logger.py", line 51, in _get_msg_dict
    def _get_msg_dict(func_name, *args, **kwargs) -> Dict[str, Any]: