How to deploy Spark so that it achieves the highest resource utilization

I have 10 servers (16 GB memory, 8 cores each) and want to deploy Hadoop and Spark. Can you tell me which plan makes the maximum use of the resources?

  1. deploy directly on the physical servers;

  2. install OpenStack and deploy the environment into virtual machines;

  3. use Docker, such as Spark on Docker;

I know that resource utilization depends on the usage scenario; what I actually want to know is the advantages and disadvantages of the three plans above.

Thank you.

One solution (collected from the web):

    For the highest resource utilization, deploying a single resource manager for both Spark and Hadoop is the best way to go. There are two options for that:

    • Deploy a Hadoop cluster with YARN, since Spark can run on YARN (a sizing sketch follows this list).
    • Deploy an Apache Mesos cluster and run Hadoop jobs and Spark on it.
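
    To make the YARN option concrete, here is a minimal Scala sketch of a Spark application that runs on YARN, with executor sizing chosen for 16 GB / 8-core nodes. The application name, executor instance count, core count, and memory values are illustrative assumptions (common rules of thumb), not figures from the original post; in practice you would usually pass them to spark-submit rather than hard-coding them.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical sizing sketch for 10 worker nodes with 16 GB RAM and 8 cores each.
object SparkOnYarnSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spark-on-yarn-sizing-sketch")
      .master("yarn") // requires HADOOP_CONF_DIR / YARN_CONF_DIR to point at the cluster config
      // Two executors per node is an assumption, not a rule from the answer:
      // 2 executors x 3 cores = 6 cores, leaving ~2 cores per node for the OS and YARN daemons.
      .config("spark.executor.instances", "20")
      .config("spark.executor.cores", "3")
      // 5g heap plus ~10% memory overhead per executor keeps 2 executors under ~12 GB per node,
      // leaving headroom for the HDFS DataNode and YARN NodeManager processes.
      .config("spark.executor.memory", "5g")
      .getOrCreate()

    // Tiny job only to confirm the session schedules work through YARN.
    println(s"row count = ${spark.range(0L, 1000000L).count()}")
    spark.stop()
  }
}
```

    The same settings can also be supplied on the command line, e.g. `spark-submit --master yarn --num-executors 20 --executor-cores 3 --executor-memory 5g`, which keeps the sizing out of the application code.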

    Isolating the Spark cluster from the Hadoop cluster provides no advantage over this and will cause higher overhead and lower resource utilization.

    Docker is also a good open platform for developers and sysadmins to build, ship, and run distributed applications.