Getting started with Ubuntu 16.04 LTS for Deep Learning and AI

Patrick · May 12, 2017

I am just going to jot down a few notes on what I do to set up new systems for machine learning.

Get basic tools installed for Ubuntu including Docker:

Code:

sudo apt-get update && sudo apt-get upgrade -y
sudo apt-get install -y build-essential git
sudo apt-get autoremove -y
wget -qO- https://get.docker.com/ | sh
sudo usermod -aG docker $USER
sudo reboot
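
The reboot is there because the docker group membership added by usermod only applies to new login sessions. After logging back in, a quick check along these lines can confirm the setup took. This is only a sketch: the in_group helper is a made-up name for illustration, not part of Docker.

```shell
#!/bin/sh
# Post-reboot sanity check (illustrative sketch, not an official Docker tool).

# in_group USER GROUP: succeed if USER is a member of GROUP.
# (Hypothetical helper name, defined here only for this example.)
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF "$2"
}

if in_group "$(id -un)" docker; then
    echo "OK: $(id -un) is in the docker group"
else
    echo "WARN: not in the docker group yet; log out and back in, or reboot"
fi

if command -v docker >/dev/null 2>&1; then
    docker --version
else
    echo "WARN: docker not found on PATH"
fi
```

If both checks pass, docker run --rm hello-world should work without sudo.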

Download the drivers for NVIDIA GPUs from the "Drivers | GeForce" page (you want Linux 64-bit). Note that you need the direct download link if you want to fetch the file from the machine itself. Once the installer is on the machine:

Code:

chmod +x NVIDIA-Linux-x86_64-381.22.run
sudo ./NVIDIA-Linux-x86_64-381.22.run

Run through the prompts. If you installed an X server and you are getting an error, you can add --no-x-check to the installer command.
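
Once the installer finishes, one way to confirm the driver actually loaded is to compare the version in the installer file name with what nvidia-smi (which ships with the driver) reports. A minimal sketch, assuming the file name used above; installer_version is a hypothetical helper written for this example.

```shell
#!/bin/sh
# Compare the installer's version with the loaded driver (illustrative sketch).

# installer_version FILE: print the version embedded in an installer name,
# e.g. NVIDIA-Linux-x86_64-381.22.run -> 381.22
# (Hypothetical helper name, defined here only for this example.)
installer_version() {
    name=${1##*/}          # drop any leading directory
    name=${name%.run}      # drop the .run suffix
    echo "${name##*-}"     # keep what follows the last dash
}

expected=$(installer_version NVIDIA-Linux-x86_64-381.22.run)

if command -v nvidia-smi >/dev/null 2>&1; then
    # One line per GPU; the first one is enough for a version check.
    loaded=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if [ "$loaded" = "$expected" ]; then
        echo "OK: driver $loaded is loaded"
    else
        echo "MISMATCH: expected $expected, found '$loaded'"
    fi
else
    echo "nvidia-smi not found; the driver install may not have completed"
fi
```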

Install nvidia-docker:

Code:

# Install nvidia-docker and nvidia-docker-plugin
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker_1.0.1-1_amd64.deb
sudo dpkg -i /tmp/nvidia-docker*.deb && rm /tmp/nvidia-docker*.deb

# Test nvidia-smi
nvidia-docker run --rm nvidia/cuda nvidia-smi

_alex · May 17, 2017

recently started doing some reading about TensorFlow et al.
would you recommend starting from scratch, or using pre-built, trained models like Inception and fine-tuning / retraining them?

i'm really close to getting one or two smaller GPUs and jumping into this a bit, also because i'll deploy a 2U DR box with an S2600 and dual 2670s that has some PCIe slots free and an unmetered power plan at the colo.

would be too bad to let it idle 99% of its time ...

Patrick · May 17, 2017

@_alex it depends on what you want to do. You can also take pre-trained models and then train them further for a new, specific purpose, which saves a lot of time while still involving some training.

My personal advice is that if you can get a GTX 1070 instead of a lower-end GPU, it is worth it. Here is our recent guide: NVIDIA Deep Learning / AI GPU Value Comparison Q2 2017

In the first few weeks of learning you are likely to have very small models that train in seconds, or potentially 5-60 minutes. As you start working with more data, you are going to hit larger models, especially with adversarial networks, for example. The model we used in that piece took 3 hours less on a 1080 v. a 1060 (~7 v. ~10 hours). Since it was memory capacity bound, the 1070 performed well too. The point is that a $100 upgrade that cut training time by about 30% was a 3-hour impact, and that is still a small training effort.
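
To make the arithmetic concrete, here is a small sketch using the approximate figures above (~10 hours on the 1060, ~7 hours on the 1080). The function names are invented for this example, and the inputs are rounded estimates rather than measured benchmarks.

```shell
#!/bin/sh
# Worked arithmetic for the ~10 h v. ~7 h comparison (illustrative sketch).

# pct_less_time SLOW_HOURS FAST_HOURS: percent reduction in wall-clock time.
pct_less_time() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.0f\n", (slow - fast) / slow * 100 }'
}

# speedup SLOW_HOURS FAST_HOURS: throughput ratio (slow / fast).
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", slow / fast }'
}

# Approximate figures from the post: ~10 hours on a 1060, ~7 hours on a 1080.
echo "hours saved:   $(( 10 - 7 ))"
echo "less time (%): $(pct_less_time 10 7)"
echo "speedup (x):   $(speedup 10 7)"
```

With those inputs this works out to 3 hours saved, about 30% less wall-clock time, or roughly a 1.43x speedup per run. The saving repeats on every training run, so it adds up as models grow.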

_alex · May 17, 2017

yes, read the main site with great interest. i have a dataset of around 4k+ classified images i could split for training, plus about the same amount or even more of community-submitted images that can be easily classified (by the community, at least partially).

guess letting two models work against each other on these could maybe lead to what i'd like the ai to be able to do.

maybe i'll just start with a smaller sample and CPU to get a feeling for what could be done with it and how much resources/GPU are really needed. time is really not critical, i would just let it run on that DR/backup machine for days or weeks and improve over time.
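
For splitting a set of classified images like this, something along these lines could serve as a starting point. It is a rough sketch under stated assumptions: images are stored as SRC/<class>/<file>, GNU find and shuf are available (as on Ubuntu), and file names contain no newlines. The split_dataset name and the directory layout are hypothetical, not something from the thread.

```shell
#!/bin/sh
# Random train/validation split of a class-per-directory image set (sketch).

# split_dataset SRC DEST TRAIN_PERCENT
# Copies SRC/<class>/<file> into DEST/train/<class>/ or DEST/val/<class>/.
# (Hypothetical helper; layout and names are assumptions for illustration.)
split_dataset() {
    src=$1; dest=$2; pct=$3
    for class_dir in "$src"/*/; do
        [ -d "$class_dir" ] || continue
        class=$(basename "$class_dir")
        mkdir -p "$dest/train/$class" "$dest/val/$class"
        total=$(find "$class_dir" -maxdepth 1 -type f | wc -l)
        n_train=$(( total * pct / 100 ))
        i=0
        # Shuffle the file list, then send the first n_train files to train.
        find "$class_dir" -maxdepth 1 -type f | shuf | while IFS= read -r f; do
            if [ "$i" -lt "$n_train" ]; then
                cp "$f" "$dest/train/$class/"
            else
                cp "$f" "$dest/val/$class/"
            fi
            i=$(( i + 1 ))
        done
    done
}
```

For example, split_dataset ./images ./dataset 80 would put roughly 80% of each class into ./dataset/train and the rest into ./dataset/val. Files are copied rather than moved, so the originals stay untouched.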
