Using GPU from a docker container?

I’m searching for a way to use the GPU from inside a Docker container.

The container will execute arbitrary code, so I don’t want to use privileged mode.

Any tips?

From previous research I understood that docker run -v and/or an LXC cgroup was the way to go, but I’m not sure how to pull that off exactly.

5 Answers

    Regan’s answer is great, but it’s a bit out of date: the correct way to do this is to avoid the LXC execution context, since Docker dropped LXC as its default execution driver as of Docker 0.9.

    Instead, it’s better to tell Docker about the NVIDIA devices via the --device flag, and to use the native execution context rather than LXC.
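    In its general form the command is just a sketch like the one below; the full, tested command appears later in this answer, and the device paths should be adjusted to whatever your host actually exposes:

    $ sudo docker run -ti \
        --device /dev/nvidia0:/dev/nvidia0 \
        --device /dev/nvidiactl:/dev/nvidiactl \
        --device /dev/nvidia-uvm:/dev/nvidia-uvm \
        <image> <command>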

    Environment

    These instructions were tested on the following environment:

    • Ubuntu 14.04
    • CUDA 6.5
    • AWS GPU instance.

    Install the NVIDIA driver and CUDA on your host

    See CUDA 6.5 on AWS GPU Instance Running Ubuntu 14.04 to get your host machine set up.

    Install Docker

    $ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 36A1D7869245C8950F966E92D8576A8BA88D21E9
    $ sudo sh -c "echo deb https://get.docker.com/ubuntu docker main > /etc/apt/sources.list.d/docker.list"
    $ sudo apt-get update && sudo apt-get install lxc-docker
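
    To sanity-check the installation before moving on, you can run Docker’s standard test image:

    $ sudo docker run hello-world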
    

    Find your NVIDIA devices

    ls -la /dev | grep nvidia
    
    crw-rw-rw-  1 root root    195,   0 Oct 25 19:37 nvidia0 
    crw-rw-rw-  1 root root    195, 255 Oct 25 19:37 nvidiactl
    crw-rw-rw-  1 root root    251,   0 Oct 25 19:37 nvidia-uvm
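
    If you’d rather not type one --device flag per node, a small shell sketch like this builds them from whatever NVIDIA device nodes are present, assuming the standard /dev/nvidia* naming:

    # Build a --device flag for every NVIDIA device node on the host
    DEVICE_FLAGS=""
    for dev in /dev/nvidia*; do
        DEVICE_FLAGS="$DEVICE_FLAGS --device $dev:$dev"
    done
    echo $DEVICE_FLAGS

    The resulting string can be spliced into the docker run command in the next step.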
    

    Run a Docker container with the NVIDIA driver pre-installed

    I’ve created a Docker image that has the CUDA drivers pre-installed. The Dockerfile is available on Docker Hub if you want to know how this image was built.

    You’ll want to customize this command to match your NVIDIA devices. Here’s what worked for me:

     $ sudo docker run -ti --device /dev/nvidia0:/dev/nvidia0 --device /dev/nvidiactl:/dev/nvidiactl --device /dev/nvidia-uvm:/dev/nvidia-uvm tleyden5iwx/ubuntu-cuda /bin/bash
    

    Verify CUDA is correctly installed

    This should be run from inside the docker container you just launched.

    Install CUDA samples:

    $ cd /opt/nvidia_installers
    $ ./cuda-samples-linux-6.5.14-18745345.run -noprompt -cudaprefix=/usr/local/cuda-6.5/
    

    Build deviceQuery sample:

    $ cd /usr/local/cuda/samples/1_Utilities/deviceQuery
    $ make
    $ ./deviceQuery   
    

    If everything worked, you should see the following output:

    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs =    1, Device0 = GRID K520
    Result = PASS
    

    OK, I finally managed to do it without using the --privileged mode.

    I’m running on Ubuntu Server 14.04 and I’m using the latest CUDA (6.0.37 for Linux 13.04, 64-bit).


    Preparation

    Install the NVIDIA driver and CUDA on your host. It can be a little tricky, so I suggest you follow this guide: https://askubuntu.com/questions/451672/installing-and-testing-cuda-in-ubuntu-14-04

    ATTENTION: It’s really important that you keep the files you used for the host CUDA installation.


    Get the Docker daemon to run using LXC

    We need to run the Docker daemon with the LXC driver so that we can modify the configuration and give the container access to the devices.

    One-time use:

    sudo service docker stop
    sudo docker -d -e lxc
    

    Permanent configuration
    Modify your Docker configuration file located in /etc/default/docker
    Change the DOCKER_OPTS line by adding ‘-e lxc’
    Here is my line after modification:

    DOCKER_OPTS="--dns 8.8.8.8 --dns 8.8.4.4 -e lxc"
    

    Then restart the daemon using

    sudo service docker restart
    

    How do you check that the daemon is actually using the LXC driver?

    docker info
    

    The Execution Driver line should look like this:

    Execution Driver: lxc-1.0.5
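
    Or, to check just that one line:

    docker info | grep 'Execution Driver'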
    

    Build your image with the NVIDIA and CUDA drivers

    Here is a basic Dockerfile to build a CUDA-compatible image.

    FROM ubuntu:14.04
    MAINTAINER Regan <http://stackoverflow.com/questions/25185405/using-gpu-from-a-docker-container>
    
    RUN apt-get update && apt-get install -y build-essential
    RUN apt-get --purge remove -y nvidia*
    
    # Get the install files you used to install CUDA and the NVIDIA drivers on your host
    ADD ./Downloads/nvidia_installers /tmp/nvidia
    # Install the driver
    RUN /tmp/nvidia/NVIDIA-Linux-x86_64-331.62.run -s -N --no-kernel-module
    # The driver installer leaves temp files behind during a docker build (I don't
    # have any explanation why) and the CUDA installer fails if they are still there
    RUN rm -rf /tmp/selfgz7
    # CUDA installer
    RUN /tmp/nvidia/cuda-linux64-rel-6.0.37-18176142.run -noprompt
    # CUDA samples; comment this out if you don't want them
    RUN /tmp/nvidia/cuda-samples-linux-6.0.37-18176142.run -noprompt -cudaprefix=/usr/local/cuda-6.0
    # Add the CUDA libraries to the library path (ENV persists across layers, unlike RUN export)
    ENV LD_LIBRARY_PATH $LD_LIBRARY_PATH:/usr/local/cuda/lib64
    # Update the ld.so.conf.d directory
    RUN touch /etc/ld.so.conf.d/cuda.conf
    # Delete installer files
    RUN rm -rf /tmp/*
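
    You can then build the image; the tag cuda here matches the run example further down:

    sudo docker build -t cuda .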
    

    Run your image.

    First you need to identify the major number associated with your device.
    The easiest way is to run the following command:

    ls -la /dev | grep nvidia
    

    If the result is blank, launching one of the CUDA samples on the host should do the trick.
    The result should look like this:

    (screenshot of the ls -la /dev | grep nvidia output)

    As you can see, there is a set of two numbers between the group and the date.
    These two numbers are called the major and minor numbers (written in that order) and designate a device.
    We will just use the major numbers for convenience.
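
    If you prefer to extract the major numbers programmatically, here is a quick sketch; the awk field position assumes the standard ls -la layout shown above:

    # Print the unique major numbers of the NVIDIA device nodes
    ls -la /dev | grep nvidia | awk '{print $5}' | tr -d ',' | sort -u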

    Why did we activate the LXC driver?
    So that we can use the lxc-conf option, which allows our container to access those devices.
    The option is (I recommend using * for the minor number because it reduces the length of the run command):

    --lxc-conf='lxc.cgroup.devices.allow = c [major number]:[minor number or *] rwm'

    So if I want to launch a container (supposing your image name is cuda):

    docker run -ti --lxc-conf='lxc.cgroup.devices.allow = c 195:* rwm' --lxc-conf='lxc.cgroup.devices.allow = c 243:* rwm' cuda
    

    We just released an experimental GitHub repository which should ease the process of using NVIDIA GPUs inside Docker containers.

    Updated for CUDA 8.0 on Ubuntu 16.04

    Dockerfile

    FROM ubuntu:16.04
    MAINTAINER Jonathan Kosgei <jonathan@saharacluster.com>
    
    # A docker container with the Nvidia kernel module and CUDA drivers installed
    
    ENV CUDA_RUN https://developer.nvidia.com/compute/cuda/8.0/prod/local_installers/cuda_8.0.44_linux-run
    
    RUN apt-get update && apt-get install -q -y \
      wget \
      module-init-tools \
      build-essential 
    
    RUN cd /opt && \
      wget $CUDA_RUN && \
      chmod +x cuda_8.0.44_linux-run && \
      mkdir nvidia_installers && \
      ./cuda_8.0.44_linux-run -extract=`pwd`/nvidia_installers && \
      cd nvidia_installers && \
      ./NVIDIA-Linux-x86_64-367.48.run -s -N --no-kernel-module
    
    RUN cd /opt/nvidia_installers && \
      ./cuda-linux64-rel-8.0.44-21122537.run -noprompt
    
    # Ensure the CUDA libs and binaries are in the correct environment variables
    ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64
    ENV PATH=$PATH:/usr/local/cuda-8.0/bin
    
    RUN cd /opt/nvidia_installers &&\
        ./cuda-samples-linux-8.0.44-21122537.run -noprompt -cudaprefix=/usr/local/cuda-8.0 &&\
        cd /usr/local/cuda/samples/1_Utilities/deviceQuery &&\
        make
    
    WORKDIR /usr/local/cuda/samples/1_Utilities/deviceQuery
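
    Build the image first; the tag cuda-8.0 here is just an example, and stands in for <built-image> below:

    sudo docker build -t cuda-8.0 .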
    
    Run your container

    sudo docker run -ti --device /dev/nvidia0:/dev/nvidia0 --device /dev/nvidiactl:/dev/nvidiactl --device /dev/nvidia-uvm:/dev/nvidia-uvm <built-image> ./deviceQuery

    You should see output similar to:

    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GRID K520
    Result = PASS

    Recent enhancements by NVIDIA have produced a much more robust way to do this.

    Essentially, they have found a way to avoid the need to install the CUDA/GPU driver inside the container and to have it match the host kernel module.

    Instead, drivers are on the host and the containers don’t need them.
    It requires a modified docker-cli right now.

    This is great, because now containers are much more portable.


    A quick test on Ubuntu:

    # Install nvidia-docker and nvidia-docker-plugin
    wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker_1.0.1-1_amd64.deb
    sudo dpkg -i /tmp/nvidia-docker*.deb && rm /tmp/nvidia-docker*.deb
    
    # Test nvidia-smi
    nvidia-docker run --rm nvidia/cuda nvidia-smi
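
    Once the plugin is installed, any CUDA image can be run the same way; for example, an interactive shell (the image tag here is an assumption, pick whichever CUDA version you need):

    nvidia-docker run --rm -ti nvidia/cuda:8.0-devel /bin/bash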
    

    For more details see GPU-Enabled Docker Container and https://github.com/NVIDIA/nvidia-docker
