Is it a good idea to run Cassandra inside an LXC or Docker, in production?

I know it runs fine, which makes it great for development, but won’t disk and/or network I/O performance be considerably worse because of AUFS?

3 solutions collected from the web for “Is it a good idea to run Cassandra inside an LXC or Docker, in production?”

    If you put Cassandra data on a volume, disk I/O performance will be exactly the same as outside of containers, since AUFS will be bypassed entirely.

    And even if you don’t use a volume, performance will be fine as long as you don’t commit Cassandra data into a new image to run that image later. And even if you do that, performance will be affected only during the first writes on each file; after that, it will be native.

    You will not see any difference in network I/O performance unless your containers are dealing with hundreds of Mb/s of network traffic and/or thousands of connections per second. In that case, you can use tools like Pipework to assign MAC VLAN interfaces or even native physical interfaces to your containers.

    We are actually running Cassandra in Docker in production and have had to work through a lot of performance issues.

    Networking: you should run the container with --net=host to use the host networking. Otherwise you will take a substantial hit to your network speeds. See this article for more information on recommended best practices.

    Data volume: you should keep your data on a volume mounted from the physical host. If you’re operating in the cloud, note that where you place your data volume may limit your IOPS.
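
    To make the networking and data-volume points concrete, here is a minimal sketch that assembles a `docker run` command with host networking and a host-mounted data directory. The function name and the host path `/mnt/cassandra-data` are made up for illustration; the image name `cassandra` and the container path `/var/lib/cassandra` are assumptions based on the stock image and may differ in your setup.

```python
import shlex


def cassandra_run_command(host_data_dir,
                          image="cassandra",
                          container_data_dir="/var/lib/cassandra"):
    """Build the argv for starting Cassandra with host networking and a host volume.

    host_data_dir is a hypothetical directory on the physical host;
    image and container_data_dir are assumed defaults, not verified for your image.
    """
    return [
        "docker", "run", "-d",
        "--net=host",                                   # share the host network stack
        "-v", f"{host_data_dir}:{container_data_dir}",  # keep data off the image layers
        image,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(cassandra_run_command("/mnt/cassandra-data")))
```

    Printing the command rather than executing it keeps the sketch safe to run anywhere; you could pass the same list to `subprocess.run` once you are happy with it.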

    JVM: just because you run Cassandra in a container doesn’t mean you can get away without tuning your JVM. You still need to adjust its settings to account for the system resources on the host machine.
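
    As a rough illustration of sizing the JVM to the machine, the sketch below reproduces the default heap heuristic that, as far as I know, Cassandra’s `cassandra-env.sh` applies when no heap size is set explicitly: max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB)). The function name is made up for this example, and you should check the script shipped with your Cassandra version before relying on these numbers.

```python
def default_max_heap_mb(system_memory_mb):
    """Return a max heap size in MB using the heuristic described above.

    Assumption: this mirrors the default calculation in cassandra-env.sh;
    verify against the script in your own Cassandra version.
    """
    half_capped = min(system_memory_mb // 2, 1024)      # half of RAM, capped at 1 GB
    quarter_capped = min(system_memory_mb // 4, 8192)   # quarter of RAM, capped at 8 GB
    return max(half_capped, quarter_capped)


if __name__ == "__main__":
    # Example memory sizes in MB; use the memory actually available to the container.
    for memory_mb in (2048, 16384, 65536):
        print(memory_mb, "MB RAM ->", default_max_heap_mb(memory_mb), "MB heap")
```

    For example, a machine with 16 GB of RAM comes out at a 4096 MB heap. If the container has a memory limit lower than the host’s RAM, feed in that limit instead.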

    Cluster Name/Seeds: these need to be configurable. Rather than leaving hard-coded values in the config file, use sed to find and replace them with values taken from environment variables.
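
    The same find-and-replace idea can be sketched in Python instead of sed. This assumes the config contains a top-level `cluster_name:` line and a `- seeds:` line, as in a stock `cassandra.yaml`; the environment variable names `CASSANDRA_CLUSTER_NAME` and `CASSANDRA_SEEDS` and the function name are chosen for illustration only.

```python
import re


def apply_env_overrides(yaml_text, env):
    """Replace cluster_name and seeds in cassandra.yaml text with values from env.

    env is any mapping (for example os.environ). Variable names are hypothetical.
    Lines are left untouched when the corresponding variable is not set.
    """
    cluster = env.get("CASSANDRA_CLUSTER_NAME")
    seeds = env.get("CASSANDRA_SEEDS")
    if cluster is not None:
        # Rewrite the whole top-level cluster_name line.
        yaml_text = re.sub(r"(?m)^cluster_name:.*$",
                           lambda m: f"cluster_name: '{cluster}'",
                           yaml_text)
    if seeds is not None:
        # Keep the original indentation and "- seeds:" prefix, swap only the value.
        yaml_text = re.sub(r"(?m)^([ \t]*- seeds:).*$",
                           lambda m: f'{m.group(1)} "{seeds}"',
                           yaml_text)
    return yaml_text


if __name__ == "__main__":
    sample = (
        "cluster_name: 'Test Cluster'\n"
        "seed_provider:\n"
        "    - class_name: org.apache.cassandra.locator.SimpleSeedProvider\n"
        "      parameters:\n"
        '          - seeds: "127.0.0.1"\n'
    )
    print(apply_env_overrides(sample, {"CASSANDRA_CLUSTER_NAME": "prod",
                                       "CASSANDRA_SEEDS": "10.0.0.1,10.0.0.2"}))
```

    In a container entrypoint you would read the real file, pass `os.environ` as `env`, and write the result back before starting Cassandra.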

    The big takeaway is that, as with any software, you need to do some configuration. It’s not 100% plug and play.

    I was looking into the same thing and just found this on SlideShare:

    “Docker uses Linux Ethernet Bridges for basic software routing. This will hose your network throughput. (50% hit)
    Use the host network stack instead (10% hit)”
