How to set up an autoscaling RabbitMQ cluster on AWS

I’m trying to move from SQS to RabbitMQ for our messaging service. I want to build a stable, highly available queuing service. For now I’m going with a cluster.

Current implementation:
I have three EC2 machines launched from an AMI with RabbitMQ and the management plugin installed. I then log in to each machine explicitly and run:

    sudo rabbitmqctl join_cluster rabbit@<hostnameOfParentMachine>

The HA property is set to all, and synchronization works. A load balancer sits on top of the cluster with a DNS name assigned. So far this setup works.

Expected implementation: an autoscaling clustered environment in which machines that come up or go down join or leave the cluster dynamically. What is the best way to achieve this? Please help.

One solution collected from the web:

    I had a similar configuration 2 years ago.

    I decided to use Amazon VPC. By default my design had two RabbitMQ instances always running, configured as a cluster (called master-nodes).
    The RabbitMQ cluster was behind an internal Amazon load balancer.

    I created an AMI with RabbitMQ and management plug-in configured (called “master-AMI”), and then I configured the autoscaling rules.

    If an autoscaling alarm is raised, a new instance is launched from master-AMI.
    The instance executes the following script the first time it starts:

    #!/usr/bin/env python3
    import base64
    import json
    import urllib.request
    from subprocess import call

    if __name__ == '__main__':
        prefix = ''
        # A node has to be stopped and reset before it can join a cluster.
        call(["rabbitmqctl", "stop_app"])
        call(["rabbitmqctl", "reset"])
        try:
            # URL of the HTTP API endpoint that lists the nodes
            # (left empty in the original post).
            _url = ''
            print(prefix + 'Get json info from ..' + _url)
            request = urllib.request.Request(_url)
            base64string = base64.b64encode(b'guest:guest').decode('ascii')
            request.add_header("Authorization", "Basic %s" % base64string)
            data = json.load(urllib.request.urlopen(request))
            # If the script fails here, you can assume this is the first machine
            # and carry on without handling the error any further.
            # Remember to add the new machine to the load balancer.
            print(prefix + 'request ok... looking for a running node')
            for r in data:
                if r.get('running'):
                    print(prefix + 'found running node to bind..')
                    print(prefix + 'node name: ' + r.get('name') + ' - running: ' + str(r.get('running')))
                    call(["rabbitmqctl", "join_cluster", r.get('name')])
                    break  # joining through one running node is enough
        except Exception as e:
            print(prefix + 'error during add node: %s' % e)
        # Start the application again, whether or not the join succeeded.
        call(["rabbitmqctl", "start_app"])

    The script uses the HTTP API “” to find nodes, then chooses a running one and joins the new instance to the cluster.

    As the HA policy I decided to use this:

    rabbitmqctl set_policy ha-two "^two\." ^

    Well, the join is “quite” easy; the problem is deciding when you can remove a node from the cluster.

    You can’t remove a node based on an autoscaling rule alone, because the queues may still hold messages that have to be consumed.

    I decided to run a script periodically against the two master-node instances that:

    • checks the message count through the API http://node:15672/api/queues
    • if the message count for every queue is zero, removes the instance from the load balancer and then from the RabbitMQ cluster.
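
    The periodic check described above could look roughly like the sketch below. It is not the original script: the function names are hypothetical, the "messages" field is assumed to be present in each entry returned by /api/queues, and the rabbitmqctl calls in removal_commands (stop_app on the leaving node, forget_cluster_node on a node that stays) are an assumption about how the node was taken out of the cluster.

```python
import base64
import json
import urllib.request


def fetch_queues(api_url, user, password):
    """GET the queue list from the management API and return the parsed JSON."""
    request = urllib.request.Request(api_url)
    token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8")).decode("ascii")
    request.add_header("Authorization", "Basic %s" % token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def all_queues_empty(queues):
    """Return True when every queue reports zero messages.

    A queue without a "messages" field is treated as non-empty, so the node
    is kept rather than removed by mistake. An empty queue list counts as empty.
    """
    return all(q.get("messages") == 0 for q in queues)


def removal_commands(node_name):
    """Hypothetical helper: rabbitmqctl calls for taking a node out of the cluster.

    stop_app is meant to run on the node being removed; forget_cluster_node
    is meant to run on a node that stays in the cluster.
    """
    return {
        "on_leaving_node": ["rabbitmqctl", "stop_app"],
        "on_remaining_node": ["rabbitmqctl", "forget_cluster_node", node_name],
    }
```

    For example, all_queues_empty(fetch_queues("http://node:15672/api/queues", "guest", "guest")) would tell the periodic job whether the node is safe to remove. Taking the instance out of the load balancer is left out here because it depends on how the balancer is managed.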

    This is broadly what I did, hope it helps.
