Controlling where Docker starts incremental builds (use case: git clone inside Dockerfile)

From what I understand, docker build builds images incrementally, i.e. it rebuilds only those layers affected by a change. For instance, if the source file of a COPY statement in the Dockerfile changed and everything else stayed the same, Docker re-executes the statements starting from that COPY and reuses the previously built layers before it.

I have a scenario where I RUN git clone inside the Docker image at build time and would like docker build to start its incremental build from that statement whenever a source file in the repository has changed.

I guess I could enforce this by placing a COPY dummy / just before that statement and telling Docker about changes to the source files with touch dummy. Is there a better way to do this?
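
One caveat with the dummy-file idea, as far as I know: Docker's cache check for COPY hashes file contents and does not consider modification times, so touch alone would likely not trigger a rebuild. The content of the file has to change. A minimal sketch of that variant, where update_marker and the marker file name are hypothetical names chosen for illustration:

```shell
# Hypothetical helper (not part of Docker or git): store a revision string
# in a marker file that a "COPY <marker> /" line in the Dockerfile picks up.
update_marker() {
  marker_file=$1
  revision=$2
  # Rewrite the file only when the stored revision differs, so the file
  # content changes exactly when the revision does.
  if [ ! -f "$marker_file" ] || [ "$(cat "$marker_file")" != "$revision" ]; then
    printf '%s\n' "$revision" > "$marker_file"
    echo "changed"
  else
    echo "unchanged"
  fi
}
```

It could be called before docker build with the current remote revision as the second argument, for example the first field of the git ls-remote output for the branch in question.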

2 Solutions collected from the web for “Controlling where Docker starts incremental builds (use case: git clone inside Dockerfile)”

    Take a look at the ARG instruction in Dockerfiles, specifically the section on its impact on build caching.

    I have been able to solve this by following @JHarris’ lead. My Dockerfile now looks like this:

    FROM ...
    ARG ...
    ENV ...
    
    # run lengthy installs
    RUN apt-get update
    RUN apt-get install -y ...
    
    # ...
    
    # HEAD changes whenever the remote branch head changes (passed via --build-arg)
    ARG HEAD
    
    RUN TMP_DIR=$(mktemp -d) && \
      cd "$TMP_DIR" && \
      git clone "$GIT_REPOSITORY" && \
    # compile source code
    # install from compile
      cd "$TMP_DIR" && \
      rm -fr "$TMP_DIR"
    
    # ...
    

    And I start the build process with:

    docker build --build-arg HEAD="$(git ls-remote "$GIT_REPOSITORY" refs/heads/master | \
                                    cut -f1)" .
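
A slightly more defensive way to run the same build, as a sketch only: it refuses to build when no hash could be resolved, since an empty HEAD would not reflect the state of the repository. The function names first_hash and build_with_head are made up for this example, and GIT_REPOSITORY is assumed to be set in the calling shell.

```shell
# Print the commit hash from `git ls-remote` output read on stdin.
# Each output line has the form "<hash><TAB><ref>".
first_hash() {
  cut -f1 | head -n 1
}

# Resolve the head of master and pass it to docker build as HEAD.
# Assumes GIT_REPOSITORY is set and a Dockerfile exists in the current directory.
build_with_head() {
  head_hash=$(git ls-remote "$GIT_REPOSITORY" refs/heads/master | first_hash)
  if [ -z "$head_hash" ]; then
    echo "could not resolve refs/heads/master on $GIT_REPOSITORY" >&2
    return 1
  fi
  docker build --build-arg HEAD="$head_hash" .
}
```

Calling build_with_head from the directory containing the Dockerfile then replaces the one-line command.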
    

    In effect, HEAD receives a new (hash) value whenever a new push to $GIT_REPOSITORY has occurred. When that happens, Docker starts an “incremental” build from the line after ARG HEAD. The key fact was this sentence from the Dockerfile reference (section ARG, subsection Impact on build caching):

    If a Dockerfile defines an ARG variable whose value is different from
    a previous build, then a “cache miss” occurs upon its first usage, not
    its definition. In particular, all RUN instructions following an ARG
    instruction use the ARG variable implicitly (as an environment
    variable), thus can cause a cache miss.

    This indicates that ARG HEAD must be placed as far down in the Dockerfile as possible. Even though it is only a definition and could by itself be placed further up, all RUN statements following it already count as uses of HEAD. So in my example it is important to place it after the RUN apt-get lines for the lengthy installs in particular.
