This is a quick quide (checklist) of things you need to do to get a Ubuntu GPU box
up and going.
- Make a USB bootable drive
- Main drive must be
dev/sda
, otherwise it fails. Newer NVMe SSDs that communicate over PCIe, show up asdev/nvme0
. If selecting to install on such an SSD, disconnect all other SSDs when installing. - NO quickBoot / FastBoot option in the BIOS, otherwise you cannot reboot
- Update Ubuntu's packages, otherwise you use old stuff
- Install
git
,vim
,tmux
,htop
andtree
, otherwise nothing works - Add exFAT file system read/write support
- Update terminal's colours, configurations and substitute
<Caps lock>
with additional<Ctrl>
, otherwise it looks and feels like s**t - Assign static IP
- Install a
ssh
server - Install CUDA Toolkit
- Update graphics driver
- Install NCCL
- Edit Terminal's settings
- Install Torch7
- Change ownership of
usr/local
- Mount on boot
This instructions apply to the OSX platform.
Open Disk Utility
and choose 1 partition
to format as Free Space
.
Convert the iso
into a dmg
with
hdiutil convert -format UDRW -o ubuntu-image ubuntu*desktop-amd64.iso
find out where the USB drive is mounted
diskutil list
(for example /dev/diskNB
), and write the image on it
dd if=./ubuntu-image.dmg of=/dev/diskNB bs=1m
sudo apt-get update
sudo apt-get dist-upgrade -y
sudo apt-get autoremove -y
sudo apt-get install -y git vim tmux htop tree
sudo apt-get install -y exfat-utils exfat-fuse
Go here, and go through it.
Network Connection
-> Edit...
-> IPv4 Settings
-> Method:
-> Manual
-> Add
.
To get the IP address of your name you can ping
it in the terminal (ping <myStaticName.ecn.purdue.edu>
).
Configurations for DNS servers, gateway and sub-net mask can be found on the corresponding ECN webpage.
sudo apt-get install -y openssh-server
Let's download the latest CUDA Toolkit available from nVIDIA website for your Ubuntu version. ssh
remotely from another machine (or use a virtual console with <Ctrl>
-<Alt>
-F1
to F6
).
sudo service lightdm stop
cd Downloads
chmod +x cuda*
sudo ./cuda*
DO NOT install the OpenGL libraries if you are planning to use your integrated graphic card for displaying purpose.
Accept everything else and press enter for default locations.
Add the following lines to your ~/.bashrc
.
# CUDA
echo PATH=$PATH':/usr/local/cuda/bin' >> ~/.bashrc
echo LD_LIBRARY_PATH=$LD_LIBRARY_PATH':/usr/local/cuda/lib64' >> ~/.bashrc
You can try to run one of the Samples to test if everything went well. Before that, get the metapackage build-essential
which will install gcc
compiler and other related packages (sudo apt-get install -y build-essential
). cd into ~/NVIDIA_CUDA*Samples/1_Utilities/deviceQuery
, make
and then ./deviceQuery
.
sudo reboot
your system.
If you made any mistake and you want to uninstall CUDA Toolkit, then run the uninstall script in /usr/local/cuda-7.5/bin
. To uninstall NVIDIA driver, run NVIDIA-Linux-x86-310.19.run --uninstall
.
Check if your drivers are up to date (compare what you get with nvidia-smi
with what you can find on nVIDIA webpage).
If they are not, update your system like you just did for installing the CUDA Toolkit but with the driver run
package instead.
As mentioned before, if you are planning to use your integrated graphic card for displaying purpose, DO NOT install the OpenGL libraries nor update your /etc/X11/xorg.conf
! In order to doing so you can digit (with the correct version numbering):
sudo service lightdm stop
cd ~/Downloads
chmod +x NVIDIA*
sudo ./NVIDIA-Linux-x86_64-352.41.run --no-opengl-files
sudo reboot
git clone https://github.com/NVIDIA/nccl
cd nccl
make install
update library cache
sudo ldconfig
install lua nccl binding
luarocks install nccl
Edit
-> Profile Preferences
-> Scrolling
-> check Unlimited
.
Torch7 installation instructions have been taken from 'here'.
Below instructions will clone the Torch repo and install all the basic packages such as nn
, cutorch
, cunn
, cudnn
(if you have CUDA).
# in a terminal, run the commands
git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch; bash install-deps;
./install.sh
Since /usr/local
is belonging to the (administrator) user, we enforce this with
sudo chown -R me:me /usr/local
sudo rm -rf ~/.cache/luarocks
Go to Disks
utility, select target disk --> More actions --> Edit Mount Options.
- Automatic Mount Options set OFF
- Change
nosuid,nodev,nofail,x-gvfs-show
tonosuid,nodev,nofail,comment=x-gvfs-show
- Mount point:
/media/HDDx
, withx
= 1, 2, ...