tensorflow-setup.md

Below are instructions for setting up completion3d training in TensorFlow. New networks can be added by following the template in tensorflow/models/TopNet.py and adding the corresponding import statements in tensorflow/main.py and tensorflow/utils/train_utils.py. Feel free to submit new model additions to the benchmark as a pull request.

Clone Repository

git clone git@github.com:lynetcha/completion3d.git

Install CUDA and CUDNN

Instructions below assume CUDA 9.0 is installed in /usr/local/cuda

export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

How do you install CUDA 9.0 on Ubuntu 18.04, which it does not officially support?

Follow the instructions on this page:
https://github.com/akirademoss/cuda-9.0-installation-on-ubuntu-18.04

If the build reports a missing library even though nvcc --version shows 9.0 and CUDA is installed correctly:

Check the makefile and change the library dependency path; the hard-coded absolute path is wrong. See charlesq34/pointnet2#36 (comment).
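
To find out which library the loader cannot resolve, the output of ldd can be filtered for "not found" entries. The following is a minimal sketch: missing_libs is a hypothetical helper name, and the .so path in the usage comment is a placeholder for whatever file make actually produced.

```shell
# Print the shared libraries that `ldd` reports as "not found".
# Reads ldd output on stdin, so it can be tried without a built library.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Hypothetical usage after building (replace the path with a real .so file):
# ldd utils/pc_distance/your_op.so | missing_libs
```

Any name printed here is a library whose directory is likely missing from the makefile's library path or from LD_LIBRARY_PATH.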

gcc version incompatible:

#error -- unsupported GNU version! gcc versions later than 6 are not supported!

See ethereum-mining/ethminer#731 (comment).

Remember that this TensorFlow version needs CUDA 9.0 and gcc 6 to compile the CUDA version of ChamferDistance, so follow all of the instructions at https://github.com/akirademoss/cuda-9.0-installation-on-ubuntu-18.04, which are summarized in the steps below.

1. install gcc 6 so that the CUDA code can be compiled:

1.1 remove the default gcc 7 if the symlink /usr/bin/gcc points to /usr/bin/gcc-7*:

$ sudo apt remove gcc

OR

$ sudo apt autoremove gcc

1.2 install gcc 6 and link it as the default compiler:

$ sudo apt-get install gcc-6 g++-6 -y
$ sudo ln -s /usr/bin/gcc-6 /usr/bin/gcc
$ sudo ln -s /usr/bin/g++-6 /usr/bin/g++
$ sudo ln -s /usr/bin/gcc-6 /usr/bin/cc
$ sudo ln -s /usr/bin/g++-6 /usr/bin/c++

1.3 check gcc and g++ version:

$ g++ -v
$ gcc -v
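
The version check in step 1.3 can also be scripted. This is a minimal sketch that assumes gcc -dumpversion prints either a bare major version ("6") or a dotted one ("6.5.0"); gcc_ok_for_cuda9 is a hypothetical helper name.

```shell
# Succeeds (exit status 0) when the given gcc version string has a major
# version of 6 or lower, which is what the CUDA 9.0 error message requires.
gcc_ok_for_cuda9() {
    local major="${1%%.*}"   # "6.5.0" -> "6", "6" -> "6"
    [ "$major" -le 6 ] 2>/dev/null
}

# Only runs the check when gcc is actually on PATH.
if command -v gcc >/dev/null 2>&1; then
    if gcc_ok_for_cuda9 "$(gcc -dumpversion)"; then
        echo "gcc $(gcc -dumpversion) should work with CUDA 9.0"
    else
        echo "gcc $(gcc -dumpversion) is too new for CUDA 9.0" >&2
    fi
fi
```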

2. install cuda 9.0

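2.1 download the CUDA runfile installer (it can also be downloaded from the official website as the local runfile). The URL ends in linux-run, so save the file under the .run name used in the next step:

wget -O cuda_9.0.176_384.81_linux.run https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda_9.0.176_384.81_linux-run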

2.2 make the installer executable and run it:

chmod +x cuda_9.0.176_384.81_linux.run
sudo ./cuda_9.0.176_384.81_linux.run --override

2.3 answer the installer prompts:

You are attempting to install on an unsupported configuration. Do you wish to continue? y
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81? n
Install the CUDA 9.0 Toolkit? y
Do you want to create a soft link? y
Do you want to install the CUDA 9.0 examples? n

3. install cuDNN 7 (this assumes the cuDNN archive for CUDA 9.0 has already been downloaded to ~/Downloads):

cd ~/Downloads
tar -xzvf cudnn-9.0-linux-x64-v7.3.0.29.tgz

sudo cp -P cuda/include/cudnn.h /usr/local/cuda-9.0/include
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda-9.0/lib64/
sudo chmod a+r /usr/local/cuda-9.0/lib64/libcudnn*
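
To confirm that the header was copied, the version defines in cudnn.h can be read back. A minimal sketch, assuming the header contains the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL defines as cuDNN 7 does; cudnn_version is a hypothetical helper name.

```shell
# Print "MAJOR.MINOR.PATCH" from the CUDNN_* defines in a cudnn.h file.
# Prints nothing if CUDNN_MAJOR is not defined in the file.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major != "") print major "." minor "." patch }
    ' "$1"
}

# Only reports a version when the header from the copy step is readable.
header=/usr/local/cuda-9.0/include/cudnn.h
if [ -r "$header" ]; then
    echo "cuDNN $(cudnn_version "$header")"
fi
```

For the archive used above, the reported version should be a 7.x release.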

4. set up the environment paths:

echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
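
Running the echo commands above more than once adds duplicate lines to ~/.bashrc. A minimal sketch of an idempotent alternative follows; append_once is a hypothetical helper name, and the usage lines are commented out so that trying the sketch does not modify any file.

```shell
# Append LINE to FILE only if an identical line is not already present.
append_once() {
    local line="$1" file="$2"
    touch "$file"
    # -x: match the whole line, -F: treat the pattern as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage:
# append_once 'export PATH=/usr/local/cuda/bin:$PATH' ~/.bashrc
# append_once 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' ~/.bashrc
```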

5. verify the installation.

5.1 reboot the computer

sudo systemctl reboot

OR

sudo systemctl reboot -i

5.2 check that the CUDA toolkit version is 9.0 (nvcc -V reports the toolkit release; nvidia-smi reports the driver):

nvidia-smi
nvcc -V
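
The toolkit check can be automated by extracting the release number from the nvcc -V output. A minimal sketch, assuming the usual "Cuda compilation tools, release 9.0, V9.0.176" line; nvcc_release is a hypothetical helper name.

```shell
# Extract the toolkit release (e.g. "9.0") from `nvcc -V` output on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Only runs the check when nvcc is actually on PATH.
if command -v nvcc >/dev/null 2>&1; then
    release="$(nvcc -V | nvcc_release)"
    if [ "$release" = "9.0" ]; then
        echo "CUDA toolkit 9.0 found"
    else
        echo "expected CUDA 9.0, found: ${release:-unknown}" >&2
    fi
fi
```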

6. link the previously installed gcc 6 into the CUDA bin directory:

sudo apt install gcc-6 g++-6
sudo ln -s /usr/bin/gcc-6 /usr/local/cuda/bin/gcc
sudo ln -s /usr/bin/g++-6 /usr/local/cuda/bin/g++

Tensorflow Python Environment

cd completion3d/tensorflow
PYTHON_BIN=/path/to/python3.6
virtualenv -p $PYTHON_BIN comp3d_tf_venv
source comp3d_tf_venv/bin/activate
pip install -r ../requirements/tensorflow-requirements.txt

Build Chamfer and EMD functions

cd utils/pc_distance
make
cd ../../../

Run Tensorflow Training/Testing

cd tensorflow

Link data (see data-setup.md)

ln -s /path/to/data data

Modify parameters in run.sh

chmod +x run.sh
./run.sh

Benchmark submission instructions

To submit to the completion3d benchmark, set TRAIN=0 and BENCHMARK=1 in run.sh and run the script with parameters to evaluate. A submission.zip file will be generated by the script in the experiment output folder.
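
The two switches can also be set from the command line. This is a sketch only: it assumes run.sh defines them as plain assignments at the start of a line (for example TRAIN=1), which should be checked against the actual script, and set_var is a hypothetical helper name. It relies on GNU sed's -i option.

```shell
# Set NAME=VALUE in FILE, assuming FILE has a line that starts with "NAME=".
# Returns 1 without editing the file if no such line exists.
set_var() {
    local file="$1" name="$2" value="$3"
    grep -q "^${name}=" "$file" || { echo "no ${name}= line in $file" >&2; return 1; }
    sed -i "s/^${name}=.*/${name}=${value}/" "$file"
}

# Usage (commented out; run from the tensorflow directory):
# set_var run.sh TRAIN 0
# set_var run.sh BENCHMARK 1
```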

For debugging, main.py can also be launched directly under debugpy, which listens on port 5678 for a debugger to attach, for example:

python -m debugpy --listen 5678 ./main.py --epochs 300 --lr 0.5e-2 --batch_size 32 --nworkers 4 --NET TopNet --dataset shapenet --pc_augm_scale 0 --pc_augm_rot 1 --pc_augm_mirror_prob 0.5 --eval 0 --optim adagrad --code_nfts 1024 --resume 0 --npts 2048 --ENCODER_ID 1 --dist_fun chamfer --save_nth_epoch 5 --test_nth_epoch 5 --benchmark 0 --NLEVELS 6 --NFEAT 8