Troubleshooting API timeout from Django+Celery in Docker Container

I have a micro-services architecture of, let's say, 9 services, each running in its own container.

The services use a mix of technologies, but mainly Django, Celery (with a Redis Queue), a shared PostgreSQL database (in its own container), and some more specific services/libraries.

The micro-services talk to each other through REST APIs.

    The problem is that, at random times, the API of some containers stops responding and hangs.
    When I issue a curl request against their interface, I get a timeout.

    At that moment, all the other containers respond normally.

    Two containers get stuck this way.
    What I noticed is that both of them use:

    • Django
    • django-rest-framework
    • Celery
    • django-celery
    • An embedded Redis as a Celery broker
    • Access to a PostgreSQL DB that runs in another container

    I can’t figure out how to troubleshoot the problem, since nothing relevant shows up in the service logs or the Docker logs.

    These APIs get stuck only at random moments. To make them work again, I need to stop the blocked container and start it again.

    I was wondering if it could be a Python GIL problem, but I don’t know how to check this hypothesis…

    Any idea how to troubleshoot this?

One solution collected from the web for “Troubleshooting API timeout from Django+Celery in Docker Container”

    You can open a shell in the running container and check things out: is the Celery process still running, and so on.

    docker exec -ti my-container-name /bin/bash
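Once inside the container, one of the first things worth ruling out is whether the stuck service can still reach its broker and database at all. The following is only a minimal sketch using the standard library; the function name `can_connect` is made up for illustration, and the host names and ports in the usage note are assumptions (the usual Redis and PostgreSQL defaults), not values taken from the setup described above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    This only proves that something is listening; it does not prove that
    Redis or PostgreSQL is healthy behind the port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

From a Python prompt in the container you could then call, for example, `can_connect("localhost", 6379)` for the embedded Redis and `can_connect("postgres", 5432)` for the database container, where `postgres` stands in for whatever host name your services actually use.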

    If you are using Django, for example, you could go to your Django project directory, start the Django shell (python manage.py shell) and poke around from there.
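A fresh Django shell is a new process, so it cannot show where the already stuck process is hanging. One possible approach, assuming you can add a few lines to the service's startup code and that it runs on a Unix-like system, is Python's standard `faulthandler` module, which can write the traceback of every thread when the process receives a signal. The helper name `enable_stack_dumps` and the choice of SIGUSR1 are illustrative assumptions; some servers give certain signals their own meaning, so pick one that is free in your setup.

```python
import faulthandler
import signal
import sys


def enable_stack_dumps(signum: int = signal.SIGUSR1, stream=sys.stderr) -> None:
    """On `signum`, write the traceback of every thread to `stream`.

    `stream` must be backed by a real file descriptor and must stay open
    for as long as the handler is registered.
    """
    faulthandler.register(signum, file=stream, all_threads=True)
```

With this called once at startup, sending the chosen signal to the hung process (for example with `kill -USR1 <pid>` from the shell opened above) should write the tracebacks to stderr. If every thread turns out to be waiting on a socket or a lock, that points at a blocked dependency rather than at the GIL.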

    I have a similar setup where I run multiple web services using django/celery/celerybeat/nginx/…

    However, as a rule I run one process per container (the one exception, more or less, is that Django and gunicorn run in the same container). I then share things between containers by using --volumes-from.

    For example, the gunicorn app writes to a .sock file, and the container has its own nginx config; the nginx container does a --volumes-from the django container to get this info. That way, I can use a stock nginx container for all of my web services.

    Another handy thing for debugging is to log to stdout. In production you can then use one of Docker’s log drivers (splunk, logstash, etc.), and when debugging you keep the logs in the container. That way you can get a lot of information from ‘docker logs’ while the service is under test. One of the great things about Docker is that you can take the exact code that is failing in production and put it under the microscope to debug it.
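As a rough illustration of logging to stdout, here is a minimal sketch using the standard `logging.config.dictConfig` schema, which is the same schema Django reads from its `LOGGING` setting. The format string, the INFO level and the logger name `myservice` are assumptions chosen for the example, not part of the setup described above.

```python
import logging
import logging.config

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "plain": {"format": "%(asctime)s %(levelname)s %(name)s %(message)s"},
    },
    "handlers": {
        # Send every record to the process's stdout, which Docker captures.
        "stdout": {
            "class": "logging.StreamHandler",
            "stream": "ext://sys.stdout",
            "formatter": "plain",
        },
    },
    "root": {"handlers": ["stdout"], "level": "INFO"},
}

# In a Django project, defining LOGGING in settings.py is enough;
# the explicit call is only needed in a standalone script.
logging.config.dictConfig(LOGGING)
logging.getLogger("myservice").info("service started")  # hypothetical logger name
```

Anything logged this way ends up in the container's stdout, so `docker logs -f my-container-name` shows it while you reproduce the hang.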
