Spark standalone cluster on docker in network “bridge”

My problem is the connection from the slaves on the other nodes to the master.
I have 3 nodes set up as follows:

  • 1 node with the master and 1 worker launched in the same Docker container
  • 2 nodes with 1 worker each, also in Docker

The docker-compose file opens these ports:

    version: '2'
    services:
      spark:
        image: xxxxxxxx/spark
        tty: true
        stdin_open: true
        container_name: spark
        volumes:
          - /var/data/dockerSpark/:/var/data
        ports:
          - "7077:7077"
          - ""
          - "7078:7078"
          - ""
          - ""
          - "4040:4040"
          - "18080:18080"
          - "6066:6066"
          - "9000:9000"

    The conf/ file is as follows:

     #export STANDALONE_SPARK_MASTER_HOST=172.xx.xx.xx #This is the Docker IP address on the node
     export SPARK_WORKER_MEMORY=7g
     export SPARK_WORKER_CORES=4
     export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.interval=86400 -Dspark.worker.cleanup.appDataTtl=86400"

    My problem is the connection from the slaves on the other nodes to the master, so I begin by starting the master with sbin/
    During my first attempts the first two lines were commented out and the master started at the address spark://c96____37fb:7077.
    I successfully connected the nodes using these commands:

    • sbin/ spark://c96____37fb:7077 --port 7078 for the collocated slave
    • sbin/ spark://masterNodeIP:7077 --port 7078 for the two other slaves

    All the ports listed above are forwarded from the master node to the corresponding Docker container.
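To make explicit which host ports are actually published, here is a minimal Python sketch. It assumes every mapping uses the plain "host:container" form, and the helper name `published_host_ports` and the probe port 40000 are hypothetical. The empty entries from the compose file are kept and skipped.

```python
def published_host_ports(mappings):
    """Return the set of host ports found in "host:container" mapping strings."""
    ports = set()
    for mapping in mappings:
        if not mapping:
            # Entries that are empty in the compose file are skipped, not guessed.
            continue
        host_port, _, _container_port = mapping.partition(":")
        ports.add(int(host_port))
    return ports


# Port entries copied from the compose file (empty entries kept as-is).
COMPOSE_PORTS = ["7077:7077", "", "7078:7078", "", "", "4040:4040",
                 "18080:18080", "6066:6066", "9000:9000"]

if __name__ == "__main__":
    published = published_host_ports(COMPOSE_PORTS)
    # 40000 is an arbitrary example of a port that is not in the list.
    for port in (7077, 7078, 40000):
        state = "published" if port in published else "not published"
        print(f"{port}: {state}")
```

In bridge mode, a port missing from this set should not be reachable from another node through the host. That may matter if Spark opens ports that were not pinned in advance.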

    The web UI showed that my cluster had 3 connected nodes. Unfortunately, when I ran an application, only the collocated node did any work; the two others continuously disconnected from and reconnected to the application without doing anything.

    Next I tried setting STANDALONE_SPARK_MASTER_HOST to (1) the nodeMasterIP, but the master did not start, and (2) 172.xx.xx.xx, which is the Docker IP address inside the masterNode. The second attempt worked, and the web UI showed the address spark://172.xx.xx.xx:7077.
    The slaves then connected successfully, but again the two external slaves showed no sign of activity.


    The question "Spark SPARK_PUBLIC_DNS and SPARK_LOCAL_IP on stand-alone cluster with docker containers" gives me part of the answer, but not the one I want. By adding network_mode: "host" to docker-compose.yml, I succeeded in building my cluster with STANDALONE_SPARK_MASTER_HOST=ipNodeMaster and connecting the slaves to it. Execution went fine until a collect operation, where it stopped with the error org.apache.spark.shuffle.FetchFailedException: Failed to connect to xxx/yy.yy.yy.yy:36801, which looks like a port issue.
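One way to check whether this really is a port issue is to try a plain TCP connection to the reported host and port from the node that raised the error. The following is a minimal sketch using only the Python standard library; the function name `can_connect` and the 3-second timeout are my own assumptions and not part of Spark.

```python
import socket
import sys


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check_port.py <host> <port>
    target_host, target_port = sys.argv[1], int(sys.argv[2])
    reachable = can_connect(target_host, target_port)
    print(f"{target_host}:{target_port} reachable: {reachable}")
```

If this returns False while the executor is still running, the port is probably blocked or not forwarded. If it returns True, the cause likely lies elsewhere, for example in the address the executor advertises.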

    But my real concern is that I don't want to run the Spark master container on the host network of the masterNode; I want it on its own Docker network ("bridge").

    Thank you for your advice!
