A stable and consistent development environment is essential for machine learning and deep learning research. A JupyterLab environment that integrates major deep learning frameworks such as TensorFlow and PyTorch can significantly boost productivity. This blog provides a step-by-step guide to running JupyterLab with TensorFlow and PyTorch pre-installed using Docker Compose. With this setup, you can start machine learning experiments effortlessly, without the hassle of complex configuration.
Prerequisites:
- Docker and Docker Compose installed
Contents:
- Installing NVIDIA Drivers: install the appropriate driver for your GPU using apt.
- Setting Up with Docker Compose: run JupyterLab with TensorFlow and PyTorch pre-installed on your local machine using Docker.
Installing NVIDIA Drivers
* Identify the required driver version for your GPU and install it. If you don’t have a GPU, you can skip this step. Without a GPU, TensorFlow and PyTorch in JupyterLab will run on the CPU.
# check driver list
user:~$ ubuntu-drivers devices
# install driver
user:~$ sudo apt install -y nvidia-driver-550-server
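After installation (and a reboot), you can confirm that the driver is active. A minimal sketch of such a check, written so it is safe to run even on a machine without a GPU:

```shell
# Print the GPU name and driver version if the NVIDIA driver is active;
# otherwise report that no driver was detected (no error on CPU-only machines).
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
else
  echo "no NVIDIA driver detected"
fi
```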
* Additionally, there’s a popular “Graphics Drivers” team PPA that maintains newer NVIDIA drivers compatible with the latest versions of Ubuntu. This PPA is unofficial (not maintained by the Ubuntu team or NVIDIA) but is well-known. By adding this PPA, you can install more recent drivers. However, it’s not officially supported by Ubuntu, so it’s advisable not to use it on production machines.
# Add ppa
user:~$ sudo add-apt-repository ppa:graphics-drivers/ppa
# Update list
user:~$ sudo apt update
* Resolving the TensorFlow “NUMA node” warning
* Install the additional utilities below if they are not already present.
# Check if lspci is installed.
user:~$ which lspci
# Install pciutils (optional)
user:~$ sudo apt install pciutils
* If the error “NUMA node read from SysFS had negative value (-1)” occurs, check the PCI device number of the NVIDIA GPU as shown below, so that you can set the NUMA node file path for the GPU.
# Check nvidia device
user:~$ lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation Device 2684 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 22ba (rev a1)
* In the example above, the NVIDIA GPU sits at PCI address “01:00.0”, so its NUMA node file is /sys/bus/pci/devices/0000:01:00.0/numa_node. The error occurs because this file contains “-1”, meaning the GPU is not assigned to any NUMA node. To resolve this, set the value to 0. Note that the setting is lost on reboot and must be reapplied.
# Set NUMA (/sys/bus/pci/devices/0000:{PCI Device}/numa_node)
user:~$ echo 0 | sudo tee -a /sys/bus/pci/devices/0000\:01\:00.0/numa_node
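The sysfs path can also be derived from the lspci output instead of being typed by hand. A small sketch using the sample line from the output above (in practice you would feed it `lspci | grep -i nvidia | head -n1`):

```shell
# Extract the PCI address of the NVIDIA VGA device from an lspci-style line
# and build the corresponding sysfs numa_node path.
line="01:00.0 VGA compatible controller: NVIDIA Corporation Device 2684 (rev a1)"
slot=$(echo "$line" | awk '{print $1}')
path="/sys/bus/pci/devices/0000:${slot}/numa_node"
echo "$path"
# To verify the fix took effect, "cat $path" should print 0 (it prints -1 before).
```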
* To ensure the setting persists after a reboot, register it in the crontab to run once at boot time.
# Add this line at the end of the root crontab (it runs as root, so sudo is not needed inside)
user:~$ sudo crontab -e
@reboot (echo 0 | tee -a /sys/bus/pci/devices/0000\:01\:00.0/numa_node)
* Setting Up Locally with Docker Compose
* In the example, the local path for storing Jupyter notebook files is set to /data/notebook, and the path for storing large datasets for training is set to /data/notebook_data.
user:~$ sudo mkdir -p /data/notebook
user:~$ sudo mkdir -p /data/notebook_data
docker-compose.yml
services:
# define jupyterlab for machine learning
gupyterlab-tf-torch:
image: nockchun/gupyterlab-tf-torch:2.17-2.5
restart: unless-stopped
container_name: gupyterlab
volumes:
- /data/notebook:/notebook
- /data/notebook_data:/data
ports:
- "8888:8888" # jupyterlab port
- "6006:6006" # tensorboard port
command:
jupyter lab --notebook-dir=/notebook --no-browser --ip=0.0.0.0 --allow-root --NotebookApp.token="rsnet"
* Commands to Start and Stop the Container:
# start container
user:~$ docker compose up -d gupyterlab-tf-torch
# stop container
user:~$ docker container stop gupyterlab
# remove container
user:~$ docker container rm gupyterlab
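Once the container is up, you can confirm from the host that both frameworks see the GPU. A sketch assuming the container name gupyterlab from the compose file above; it falls back to a notice when Docker or the container is unavailable, and on a CPU-only machine the checks simply report no GPU:

```shell
# Check GPU visibility inside the running container (name from the compose file).
if command -v docker >/dev/null 2>&1; then
  docker exec gupyterlab python -c "import torch; print(torch.cuda.is_available())" \
    && docker exec gupyterlab python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))" \
    || echo "container 'gupyterlab' is not reachable"
else
  echo "docker is not installed"
fi
```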