How to install a Python package with all the dependencies into a Docker image?

I’m working in Ubuntu 15.10 with the Docker container for Pyspark, jupyter/pyspark-notebook. I need to install folium with all its dependencies and run a Pyspark script inside the container. I successfully installed Docker, pulled the image and ran it with the command

docker run -d -p 8888:8888 -p 4040:4040 -v /home/$MYUSER/$MYPROJECT:/home/jovyan/work jupyter/pyspark-notebook

Then I ran the code example without any issues:

    import pyspark
    sc = pyspark.SparkContext('local[*]')
    
    # do something to prove it works
    rdd = sc.parallelize(range(1000))
    rdd.takeSample(False, 5)
    

    I looked for the conda environment in /opt/conda (as the documentation says), but there is no conda in my /opt folder. So I installed miniconda3 and folium with all its dependencies as a normal Python package (no Docker involved).

    That doesn’t work. When I run the image and try to import the package with import folium, Python can’t find it:

    ImportError                               Traceback (most recent call last)
    <ipython-input-1-af6e4f19ef00> in <module>()
    ----> 1 import folium
    
    ImportError: No module named 'folium'
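
A quick way to see why the import fails is to ask the notebook kernel which interpreter it is running. The sketch below is only an illustration (the helper name module_available is made up here); it uses nothing beyond the standard library. Run in a notebook cell, it should show a path inside the container, which is why a folium installed on the host is not visible.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the running interpreter can find the top-level module `name`."""
    return importlib.util.find_spec(name) is not None


# The interpreter executing this cell. In the container's notebook this path
# belongs to the container's filesystem, not to any Python on the host.
print("interpreter:", sys.executable)
print("prefix:", sys.prefix)
print("folium available:", module_available("folium"))
```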
    

    So the problem can be reduced to two questions:

    1. Where is the container’s conda?
    2. How can I install the Python package I need into the container?

    Thanks in advance for your help!

One solution collected from the web for “How to install a python package with all the dependencies into a Docker image?”

    To answer the first question (where is the conda environment?), we just need to run in a console $ docker exec my_containers_name ls /opt/conda. The conda installation lives in the container’s filesystem, not in the host’s /opt folder.

    The second question has two options:

    • We can open the container’s console by executing the command
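
The same check can also be made from a notebook cell instead of the host console. This is a minimal sketch, assuming the kernel runs from a conda environment; conda_root_from_prefix is a hypothetical helper that relies on the usual conda layout, where named environments live under <root>/envs/<name> and every environment contains a conda-meta directory.

```python
import os
import shutil
import sys


def conda_root_from_prefix(prefix):
    """Guess the conda root directory from an interpreter prefix.

    A named environment sits at <root>/envs/<name>; for the base
    environment the prefix is already the root.
    """
    prefix = prefix.rstrip(os.sep)
    parent = os.path.dirname(prefix)
    if os.path.basename(parent) == "envs":
        return os.path.dirname(parent)
    return prefix


# Where the conda executable is, if it is on PATH (None otherwise).
print("conda on PATH:", shutil.which("conda"))
# Conda environments contain a conda-meta directory.
print("kernel runs from a conda env:",
      os.path.isdir(os.path.join(sys.prefix, "conda-meta")))
print("guessed conda root:", conda_root_from_prefix(sys.prefix))
```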

      $ docker exec -it my_containers_name /bin/bash

      and install the package like a normal conda package

      conda install --channel https://conda.anaconda.org/conda-forge folium

    • We can modify the Dockerfile of the Docker image or create a new one that extends it. To extend it, create a new Dockerfile with the lines

      FROM jupyter/minimal-notebook
      USER jovyan
      RUN conda install --quiet --yes --channel https://conda.anaconda.org/conda-forge folium && conda clean -tipsy
      

      Then build the new image. If we modify the original Dockerfile instead, we must skip the first line (FROM).

    I created my own Dockerfile by forking the original project.
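
Building the image and checking that the package is importable can be scripted. The sketch below wraps the docker build and docker run commands with subprocess; the tag my-pyspark-folium and the helper names are placeholders invented for illustration, and it assumes the Dockerfile is in the current directory. Note that the Dockerfile above starts from jupyter/minimal-notebook; to keep Spark available, the FROM line would presumably name jupyter/pyspark-notebook instead.

```python
import subprocess


def build_command(tag, context_dir="."):
    """Argument list for `docker build -t <tag> <context_dir>`."""
    return ["docker", "build", "-t", tag, context_dir]


def import_check_command(image, module):
    """Argument list that starts a throwaway container and imports `module`."""
    return ["docker", "run", "--rm", image, "python", "-c", f"import {module}"]


def build_and_check(tag, module, context_dir="."):
    """Build the image, then fail loudly if `module` cannot be imported in it."""
    subprocess.run(build_command(tag, context_dir), check=True)
    subprocess.run(import_check_command(tag, module), check=True)


# Example call (needs Docker on the host; the tag is a placeholder):
# build_and_check("my-pyspark-folium", "folium")
```

With check=True, a failing build or a failing import raises subprocess.CalledProcessError, so a missing package is noticed before the notebook is started.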

    Thanks to warmoverflow and ShanShan for your comments.
