docker swarm 1.2.0 reschedule with port mapping

I’m testing the brand new version of Docker Swarm, 1.2.0, and especially the rescheduling functionality.

So, I have one EC2 VM with the swarm manager installed and 2 swarm agents (on 2 other EC2 VMs). I have an HTTP REST service that I deploy through swarm like this:

    docker -H :4000 run -d -p :81 -e reschedule:on-node-failure myTestService
    

    This command line works fine and deploys my test service on one node (node-1).
    If I run docker ps, I see my container deployed on node-1:

    CONTAINER ID        IMAGE                           COMMAND                  CREATED             STATUS              PORTS                   NAMES
    23ce231b5737        myTestService               "/nodejs/bin/npm star"   3 minutes ago       Up 3 minutes        0.0.0.0:32768->81/tcp   distracted_sinoussi
    

    Look at the port mapping, 0.0.0.0:32768->81/tcp: I let the docker engine choose an available port on the host (32768).

    Now, if I shut down node-1, swarm should reschedule my container. If I look in the swarm log, I have this:

    time="2016-04-19T13:56:31Z" level=info msg="Initializing discovery without TLS"
    time="2016-04-19T13:56:31Z" level=info msg="Listening for HTTP" addr=":4000" proto=tcp
    time="2016-04-19T13:56:38Z" level=info msg="Registered Engine ip-node-1 at ip.node.1:2375"
    time="2016-04-19T13:56:45Z" level=info msg="Registered Engine ip-node -2 at ip.node.2:2375"
    time="2016-04-19T13:58:24Z" level=error msg="Flagging engine as unhealthy. Connect failed 3 times" id="ZSWT:XLYS:D2HA:K5J3:O32D:AFVT:HUNR:ENKI:MBTC:2PVA:JIC2:X74L" name= ip-node-1
    time="2016-04-19T13:58:24Z" level=error msg="Error monitoring events: unexpected EOF." id="ZSWT:XLYS:D2HA:K5J3:O32D:AFVT:HUNR:ENKI:MBTC:2PVA:JIC2:X74L" name= ip-node-1
    time="2016-04-19T13:58:24Z" level=error msg="Restart event monitoring." id="ZSWT:XLYS:D2HA:K5J3:O32D:AFVT:HUNR:ENKI:MBTC:2PVA:JIC2:X74L" name= ip-node-1
    time="2016-04-19T13:58:24Z" level=error msg="Error monitoring events: Get http://ip.node.1:2375/v1.15/events: dial tcp ip.node.1:2375: getsockopt: connection refused." id="ZSWT:XLYS:D2HA:K5J3:O32D:AFVT:HUNR:ENKI:MBTC:2PVA:JIC2:X74L" name=ip-node-1
    time="2016-04-19T13:58:24Z" level=info msg="Rescheduled container 23ce231b57375a386909175f3dcd730720429eb4ed41d4366d5add17a30d210e from  ip-node-1 to  ip-node-2 as c7fe68332bc61f0f4c498848e59d3e34b58821468ce65bd4ebc92055156d5b8c"
    

    On the last line, we can see that the container has been rescheduled on node-2. Fine, let’s run docker ps on node-2:

    CONTAINER ID        IMAGE                           COMMAND                  CREATED             STATUS              PORTS               NAMES
    c7fe68332bc6        myTestService               "/nodejs/bin/npm star"   27 seconds ago      Created                                     sleepy_hopper
    

    So, the container is there but not running (just “created”) and the port mapping is empty.

    So what’s going wrong here?

    Thank you

One solution collected from the web for “docker swarm 1.2.0 reschedule with port mapping”

    I think this is the expected behaviour. If you shut the node down cleanly with something like shutdown -h now, the docker daemon running on that node is stopped cleanly as well. This means that the last state known to the swarm manager is that your containers are stopped, which is why they are not started on a new node.

    Try killing the docker daemon on the node with kill -9 (as would happen in a real failure). The containers will then be rescheduled and started on another node.

    Tested with swarm 1.2.1
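
    As a rough illustration of the kill -9 approach, here is a minimal sketch meant to be run on the worker node itself. It was not part of the test mentioned above. The helper names (daemon_pid, crash_daemon) and the DAEMON_NAME variable are hypothetical, and the daemon's process name is an assumption: depending on the Docker release it may be docker or dockerd, so check with ps first.

```shell
# Sketch only: run on the worker node, not on the swarm manager.
# DAEMON_NAME is an assumption; the daemon process may be called
# "docker" or "dockerd" depending on the Docker release.
DAEMON_NAME="${DAEMON_NAME:-docker}"

# Print the PID of the oldest process whose name matches exactly.
daemon_pid() {
    pgrep -o -x "$DAEMON_NAME"
}

# Send SIGKILL so the daemon gets no chance to stop its containers cleanly.
crash_daemon() {
    pid="$(daemon_pid)" || { echo "no $DAEMON_NAME process found" >&2; return 1; }
    kill -9 "$pid"
}
```

    After calling crash_daemon on node-1, running docker -H :4000 ps against the manager should, once the engine has been flagged as unhealthy, show the container on another node. If an init system restarts the daemon automatically, the node may rejoin before that happens.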
