Docker and sensitive information used at run-time

We are dockerizing an application (written in Node.js) that will need to access some sensitive data at run-time (API tokens for different services) and I can’t find any recommended approach to deal with that.

Some information:

    • The sensitive information is not in our codebase; it’s kept in another repository in encrypted form.
    • In our current deployment, without Docker, we update the codebase with git and then manually copy the sensitive information over via SSH.
    • The Docker images will be stored in a private, self-hosted registry.

    I can think of some different approaches, but all of them have some drawbacks:

    1. Include the sensitive information in the Docker images at build time. This is certainly the easiest one; however, it makes them available to anyone with access to the image (I don’t know if we should trust the registry that much).
    2. Like 1, but having the credentials in a data-only image.
    3. Create a volume in the image that links to a directory in the host system, and manually copy the credentials over SSH like we’re doing right now. This is very convenient too, but then we can’t spin up new servers easily (maybe we could use something like etcd to synchronize them?)
    4. Pass the information as environment variables. However, we have 5 different pairs of API credentials right now, which makes this a bit inconvenient. Most importantly, however, we would need to keep another copy of the sensitive information in the configuration scripts (the commands that will be executed to run Docker images), and this can easily create problems (e.g. credentials accidentally included in git, etc).

    PS: I’ve done some research but couldn’t find anything similar to my problem. Other questions I found were about sensitive information needed at build time; in our case, we need the information at run time.

One solution for “Docker and sensitive information used at run-time”:

    I’ve used your options 3 and 4 to solve this in the past. To rephrase/elaborate:

    Create a volume in the image that links to a directory in the host system, and manually copy the credentials over SSH like we’re doing right now.

    I use config management (Chef or Ansible) to set up the credentials on the host. If the app takes a config file needing API tokens or database credentials, I use config management to create that file from a template. Chef can read the credentials from an encrypted data bag or attributes, set up the files on the host, and then start the container with a volume just like you describe.
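
    The template-rendering step can be sketched without Chef at all; here `sed` stands in for the config-management templating, and the token value, paths, and placeholder syntax are all made up:

```shell
# Stand-in for the config-management "render config from template" step.
API_TOKEN="s3cr3t"                       # in practice, read from encrypted storage
mkdir -p /tmp/myapp-conf
printf 'api_token: @TOKEN@\n' > /tmp/myapp-conf/config.tpl
sed "s/@TOKEN@/$API_TOKEN/" /tmp/myapp-conf/config.tpl > /tmp/myapp-conf/config.yml
cat /tmp/myapp-conf/config.yml           # api_token: s3cr3t
```

    The rendered file lives only on the host; the container gets it through the volume mount, so nothing sensitive is baked into the image.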

    Note that in the container you may need a wrapper to run the app. The wrapper copies the config file from wherever the volume is mounted to wherever the application expects it, then starts the app.
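
    A minimal sketch of that wrapper, written as a function so it can be exercised with throwaway paths; in a real image it would be the `ENTRYPOINT` script, and the paths and app command are assumptions:

```shell
# Hypothetical entrypoint wrapper: copy the config from the volume mount
# point to where the app expects it, then hand off to the real process.
copy_config_and_exec() {
  src="$1"; dst="$2"; shift 2
  mkdir -p "$(dirname "$dst")"
  cp "$src" "$dst"
  exec "$@"                    # replace the shell with the app (keeps PID 1 clean)
}

# Throwaway demo; a real image would end with e.g. `node server.js`.
echo 'token: demo' > /tmp/mounted.yml
( copy_config_and_exec /tmp/mounted.yml /tmp/app/config.yml true )
cat /tmp/app/config.yml        # token: demo
```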

    Pass the information as environment variables. However, we have 5 different pairs of API credentials right now, which makes this a bit inconvenient. Most importantly, however, we would need to keep another copy of the sensitive information in the configuration scripts (the commands that will be executed to run Docker images), and this can easily create problems (e.g. credentials accidentally included in git, etc).

    Yes, it’s cumbersome to pass a bunch of env variables using -e key=value syntax, but this is how I prefer to do it. Remember that the variables are still exposed to anyone with access to the Docker daemon (e.g. via docker inspect). If your docker run command is composed programmatically, it’s easier.
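
    Composing the command programmatically could look like this; the credentials file, key names, and image name are all made up for the sketch:

```shell
# Build the -e flags from a key=value file kept outside git, rather than
# hard-coding credentials in a deploy script.
cat > /tmp/creds.env << 'END'
SERVICE_A_KEY=aaa
SERVICE_B_KEY=bbb
END

ENV_FLAGS=""
while IFS= read -r kv; do
  ENV_FLAGS="$ENV_FLAGS -e $kv"
done < /tmp/creds.env

echo "docker run$ENV_FLAGS myimage"
# → docker run -e SERVICE_A_KEY=aaa -e SERVICE_B_KEY=bbb myimage
```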

    If not, use the --env-file flag, as discussed in the Docker docs. You create a file with key=value pairs, then run a container using that file.

    $ cat > myenv << 'END'
    FOO=BAR
    BAR=BAZ
    END
    $ docker run --env-file myenv <image>
    

    That myenv file can be created using Chef/config management as described above.

    If you’re hosting on AWS you can leverage KMS here. Keep either the env file or the config file (the one passed to the container in a volume) encrypted with KMS. In the container, use a wrapper script to call out to KMS, decrypt the file, move it into place, and start the app. This way the config data is never stored unencrypted on the host.
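
    That KMS wrapper might look like the following. The aws CLI flags are real, but the file paths and app command are assumptions; the script is only written to a file here rather than executed, since decrypting requires AWS credentials at run-time:

```shell
# Generate a hypothetical KMS-decrypting entrypoint script.
cat > /tmp/kms-entrypoint.sh << 'END'
#!/bin/sh
set -e
# Decrypt the KMS-encrypted config shipped in the volume, put it where
# the app expects it, then hand off to the app.
aws kms decrypt \
  --ciphertext-blob fileb:///run/secrets/config.enc \
  --output text --query Plaintext | base64 -d > /app/config.yml
exec node /app/server.js
END
chmod +x /tmp/kms-entrypoint.sh
```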
