I want to run the notebook from Guillaume Lambard on our server. The operating system is Fedora 29 because it is the most recent Fedora release supported by the NVIDIA CUDA toolkit. The GPU is a Tesla K20.
Install the CUDA toolkit¶
sudo dnf config-manager --add-repo http://developer.download.nvidia.com/compute/cuda/repos/fedora$(rpm -E %fedora)/x86_64/cuda-fedora$(rpm -E %fedora).repo
sudo dnf clean all
sudo dnf install gcc-c++ mesa-libGLU-devel libX11-devel libXi-devel libXmu-devel
sudo dnf install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
sudo dnf clean expire-cache
sudo dnf install cuda cuda-drivers cuda-toolkit-10.2
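The repository URL in the first command relies on rpm -E %fedora, which expands to the release number of the running system (29 here). As a minimal sketch, the same URL can be built by a small helper to see what dnf actually receives. The function name cuda_repo_url is hypothetical and only used for illustration:

```shell
# Hypothetical helper: build the NVIDIA repo URL for a given Fedora release.
cuda_repo_url() {
    rel="$1"
    echo "http://developer.download.nvidia.com/compute/cuda/repos/fedora${rel}/x86_64/cuda-fedora${rel}.repo"
}

# Use the release of the running system when rpm is available,
# otherwise fall back to 29 (the release used in this post).
if command -v rpm >/dev/null 2>&1; then
    current="$(rpm -E %fedora)"
else
    current=29
fi

cuda_repo_url "$current"
```

On a system that is not Fedora, rpm leaves the %fedora macro unexpanded, so the printed URL is only meaningful on Fedora itself.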
You need to reboot the server to load the driver.
Check your GPU with:
nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K20m          Off  | 00000000:04:00.0 Off |                    0 |
| N/A   31C    P0    54W / 225W |      0MiB /  4743MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
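For scripted checks, nvidia-smi can also print selected fields as CSV instead of the table above. The sketch below, which is only one possible way to do it, queries the index, name and total memory of each GPU and formats each line. The helper name gpu_summary is hypothetical, and the block simply reports a message when nvidia-smi is not on the PATH:

```shell
# Hypothetical helper: format one CSV line as printed by
#   nvidia-smi --query-gpu=index,name,memory.total --format=csv,noheader
# Fields are separated by a comma followed by a space.
gpu_summary() {
    echo "$1" | awk -F', ' '{ printf "GPU %s: %s (%s)\n", $1, $2, $3 }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,name,memory.total --format=csv,noheader |
    while IFS= read -r line; do
        gpu_summary "$line"
    done
else
    echo "nvidia-smi not found: the driver is not installed or not loaded" >&2
fi
```

The index printed in the first field is the value to use later for CUDA_VISIBLE_DEVICES.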
Install cuDNN¶
cuDNN is available with a free NVIDIA developer account. I use the CentOS/Red Hat packages for CUDA 10.1:
libcudnn7-7.6.5.33-1.cuda10.1.x86_64.rpm
libcudnn7-devel-7.6.5.33-1.cuda10.1.x86_64.rpm
libcudnn7-doc-7.6.5.33-1.cuda10.1.x86_64.rpm
sudo rpm -Uvh libcudnn*.rpm
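Each cuDNN package name encodes the CUDA version it was built for, which is worth comparing with the CUDA version reported by nvidia-smi. A minimal sketch, assuming the file naming shown above; the helper name cudnn_cuda_version is hypothetical:

```shell
# Hypothetical helper: print the CUDA version encoded in a cuDNN rpm file name,
# e.g. "10.1" for libcudnn7-7.6.5.33-1.cuda10.1.x86_64.rpm.
cudnn_cuda_version() {
    echo "$1" | sed -n 's/.*\.cuda\([0-9][0-9]*\.[0-9][0-9]*\)\..*/\1/p'
}

for pkg in libcudnn7-7.6.5.33-1.cuda10.1.x86_64.rpm \
           libcudnn7-devel-7.6.5.33-1.cuda10.1.x86_64.rpm; do
    echo "$pkg targets CUDA $(cudnn_cuda_version "$pkg")"
done

# After installation, list the cuDNN packages known to rpm
# (prints nothing when none is installed).
if command -v rpm >/dev/null 2>&1; then
    rpm -qa 'libcudnn*' || true
fi
```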
Install tensorflow-gpu¶
The tensorflow-gpu pip package doesn’t work because it is built with CUDA 10.0, and CUDA 10.0 is not available for Fedora 29. So you need to build tensorflow from source.
To build tensorflow you need some dependencies available only on rpmfusion:
sudo dnf install https://download1.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm
sudo dnf install libnvinfer-devel libnvinfer-plugin6
Download tensorflow¶
I want a 1.x version of tensorflow, here 1.15.2:
wget https://github.com/tensorflow/tensorflow/archive/v1.15.2.tar.gz
tar zxf v1.15.2.tar.gz
cd tensorflow-1.15.2/
Create a conda environment where I will install the software¶
conda create -n tensorflow-gpu-1.15.2 python=3.6
conda activate tensorflow-gpu-1.15.2
Install bazel¶
To build tensorflow, you need bazel, which is a build tool. The Fedora version is too old and the conda version is too recent. I tried to install bazel 0.24 or 0.26 with conda but it did not work: conda crashed, so I downloaded the Linux binary instead:
wget https://releases.bazel.build/0.26.1/release/bazel-0.26.1-linux-x86_64
chmod +x ./bazel-0.26.1-linux-x86_64
mv bazel-0.26.1-linux-x86_64 $CONDA_PREFIX/bin/bazel
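Before starting the build, it can help to confirm that the bazel found on the PATH is a version the tensorflow sources accept. The sketch below assumes that tensorflow 1.15 accepts bazel 0.24.1 to 0.26.1; these bounds are my assumption and should be checked against configure.py in the downloaded source tree. The helper names are hypothetical, and the comparison relies on the -V option of GNU sort:

```shell
# Assumed bounds for tensorflow 1.15; verify them in configure.py.
MIN_BAZEL=0.24.1
MAX_BAZEL=0.26.1

# version_le A B: succeeds when dotted version A is lower than or equal to B.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Print "ok" when the version is inside the assumed range, else "unsupported".
bazel_version_status() {
    if version_le "$MIN_BAZEL" "$1" && version_le "$1" "$MAX_BAZEL"; then
        echo "ok"
    else
        echo "unsupported"
    fi
}

if command -v bazel >/dev/null 2>&1; then
    # "bazel version" prints a line such as "Build label: 0.26.1".
    found="$(bazel version 2>/dev/null | awk '/^Build label:/ { print $3 }')"
    echo "bazel ${found:-unknown}: $(bazel_version_status "$found")"
else
    echo "bazel not found on PATH"
fi
```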
Install dependencies¶
I follow the instructions on the tensorflow website and use pip:
python -m pip install pip six numpy wheel setuptools mock 'future>=0.17.1'
python -m pip install keras_applications --no-deps
python -m pip install keras_preprocessing --no-deps
Build tensorflow from sources and install¶
Just answer the questions:
./configure
Then launch the long build (it may take hours):
bazel build --config=v1 //tensorflow/tools/pip_package:build_pip_package
./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
python -m pip install /tmp/tensorflow_pkg/tensorflow-1.15.2-cp36-cp36m-linux_x86_64.whl
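Once the wheel is installed, a quick check from the shell shows whether tensorflow imports and sees the GPU. This is a minimal sketch: the function name check_tf_gpu is hypothetical, and it uses tf.test.is_gpu_available(), which exists in tensorflow 1.15:

```shell
# Hypothetical helper: print the tensorflow version and whether a GPU is usable.
# $1: Python interpreter to use (defaults to "python").
check_tf_gpu() {
    py="${1:-python}"
    if ! "$py" -c 'import tensorflow' >/dev/null 2>&1; then
        echo "tensorflow is not importable with $py"
        return 1
    fi
    "$py" -c 'import tensorflow as tf; print(tf.__version__, tf.test.is_gpu_available())'
}

# Run it inside the activated conda environment.
check_tf_gpu python || true
```

If the second value printed is False, the log messages tensorflow prints during the check usually name the CUDA or cuDNN library it failed to load.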
Run the notebook¶
To run the notebook you need to install some python packages and Keras.
Don’t forget to set the variables below so that Keras uses the tensorflow backend and the GPU. The value of CUDA_VISIBLE_DEVICES depends on what you get with nvidia-smi.
I also create a new jupyter kernel named tf-gpu. Don’t forget to switch to it the first time you execute the notebook.
conda install -y pandas scikit-learn matplotlib
conda install -y gpy gpyopt rdkit
conda install -y keras ipykernel
export CUDA_VISIBLE_DEVICES=0
export KERAS_BACKEND=tensorflow
python -m ipykernel install --user --name tf-gpu
jupyter notebook
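The two variables only take effect if they are exported in the same shell that launches jupyter. As a small sketch, a helper can print them just before the launch; the name check_notebook_env is hypothetical:

```shell
export CUDA_VISIBLE_DEVICES=0
export KERAS_BACKEND=tensorflow

# Hypothetical helper: report whether the expected variables are set.
check_notebook_env() {
    status=0
    for var in CUDA_VISIBLE_DEVICES KERAS_BACKEND; do
        # Indirect lookup of the variable named in $var.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "$var is not set"
            status=1
        else
            echo "$var=$value"
        fi
    done
    return "$status"
}

check_notebook_env
```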
Enjoy :-)