Troubleshooting API timeouts from Django+Celery in a Docker container

I have a microservices architecture of, let's say, nine services, each one running in its own container.

The services use a mix of technologies, but mainly Django, Celery (with a Redis queue), a shared PostgreSQL database (in its own container), and some more specific services/libraries.

The microservices talk to each other through REST APIs.

The problem is that, at seemingly random moments, some containers' APIs stop responding and get stuck; when I issue a curl request against their interface, I get a timeout.

At that moment, all the other containers still answer normally.
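
For reference, the kind of probe I issue looks like this (service name, port, and path are just examples):

    # -m caps the whole request at 5 seconds; -v shows whether the TCP
    # connection is even accepted before the request hangs:
    curl -v -m 5 http://my-service:8000/api/health/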

Two containers get stuck this way, and I noticed that both of them use:

• Django
• django-rest-framework
• Celery
• django-celery
• An embedded Redis instance as the Celery broker
• Access to a PostgreSQL database that lives in another container

I can't figure out how to troubleshoot the problem, since no relevant information shows up in the services' logs or the Docker logs.

These APIs get stuck only at random moments. To make a service work again, I have to stop the blocking container and start it again.

I was wondering if it could be a Python GIL problem, but I don't know how to check this hypothesis…
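
Would dumping the thread stacks of the stuck process tell me anything? For example, with the third-party py-spy tool (the PID below is an example):

    # Find the PID of the stuck web/worker process, then dump every
    # thread's Python stack (py-spy is a third-party tool: pip install py-spy):
    ps aux | grep -i '[g]unicorn\|[c]elery'
    py-spy dump --pid 1234

If every thread turned out to be parked on the same lock or socket, that would at least narrow it down.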

Any ideas on how to troubleshoot this?

One Solution

You can shell into the running container and check things out: is the Celery process still running, etc.?

    docker exec -ti my-container-name /bin/bash
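
For instance, a quick first pass inside the container could look like this (the Celery app name myproject is a placeholder; adjust it to your project):

    # Are the worker processes alive at the OS level?
    ps aux | grep -i '[c]elery'

    # Does any worker still answer over the broker?
    celery -A myproject inspect ping
    celery -A myproject inspect active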

Since you are using Django, for example, you could go to your Django directory, run manage.py shell, and start poking around there.
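
From there, a minimal probe is to exercise the pieces the stuck services depend on, for instance the PostgreSQL connection (this assumes a Django version whose shell accepts -c; otherwise paste the code in interactively):

    # Run from the Django project directory inside the container:
    python manage.py shell -c "
    from django.db import connection
    with connection.cursor() as cur:
        cur.execute('SELECT 1')
        print('PostgreSQL reachable:', cur.fetchone())
    "

If that query hangs too, the problem is more likely the database connection than the GIL.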

I have a similar setup where I run multiple web services using django/celery/celerybeat/nginx/…

However, as a rule I run one process per container (the one exception being that Django and gunicorn run in the same container). I then share things between containers using --volumes-from.

For example, the gunicorn app writes to a .sock file, and the container carries its own nginx config; the nginx container runs with --volumes-from the Django container to pick this up. That way, I can use a stock nginx container for all of my web services.
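
A sketch of that wiring, with image names and paths as illustrative placeholders rather than my exact setup:

    # The Django/gunicorn container declares volumes holding the gunicorn
    # socket and an nginx vhost config for this service:
    docker run -d --name web-django \
        -v /run/gunicorn \
        -v /etc/nginx/conf.d \
        my-django-image \
        gunicorn myproject.wsgi --bind unix:/run/gunicorn/app.sock

    # A stock nginx container picks both up via --volumes-from, so the
    # same nginx image can front every web service:
    docker run -d --name web-nginx --volumes-from web-django -p 80:80 nginx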

    Another handy thing for debugging is to log to stdout and use docker’s log driver (splunk, logstash, etc.) for production, but have it log to the container when debugging. That way you can get a lot of information from ‘docker logs’ when you’ve got it under test. One of the great things about docker is you can take the exact code that is failing in production and run it under the microscope to debug it.
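
Concretely, the two modes look something like this (driver name and address are examples; splunk, gelf, etc. work the same way):

    # Under test: keep the default json-file driver so docker logs works:
    docker logs -f --tail 100 my-container-name

    # In production: ship stdout to a central collector instead:
    docker run -d \
        --log-driver=syslog \
        --log-opt syslog-address=udp://logs.example.com:514 \
        my-django-image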
