Best way to implement Spark + AWS + Caffe/CUDA?
I am looking to deploy an application that already has a trained caffemodel file, and I need to run it on a Spark cluster on AWS because of the GPU compute required (20K patches per image). From my research, it seems the best approach is to use Spark to launch an AWS cluster, which then uses a Docker image or an Amazon AMI to install the project dependencies automatically. Once everything is installed, the job can run on the cluster through Spark. What I am wondering is how to do this from start to finish.

I have seen several guides and have taken some online courses on Spark (BerkeleyX, Udemy) and Docker (Udemy); however, almost all the information I have found consists of examples that implement the simplest possible application, with little to no heavy software dependencies (CUDA drivers, cuDNN, Caffe, DIGITS). I have deployed Spark clusters on AWS and run simple examples that had no dependencies, but I have found little to no information on running an application that requires even a small dependency such as NumPy. I would like to ask the group whether anyone has experience with such an implementation and can point me in the right direction or offer some help/suggestions.
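To make the workload concrete, here is a rough sketch in plain Python of how I picture each worker handling the ~20K patches per image: split them into GPU-sized batches and score batch by batch. The model call is mocked out (`score_batch` is a placeholder, not a real Caffe API); in the real job, each Spark worker would load the trained caffemodel once and run its GPU forward pass per batch.

```python
# Sketch: split one image's patches into GPU-sized batches for scoring.
# score_batch() is a stand-in for a Caffe forward pass on the GPU.

def make_batches(patches, batch_size):
    """Yield successive fixed-size batches from a list of patches."""
    for i in range(0, len(patches), batch_size):
        yield patches[i:i + batch_size]

def score_batch(batch):
    """Placeholder for the model's forward pass; one score per patch."""
    return [0.0 for _ in batch]

def score_image(patches, batch_size=256):
    """Score all patches of one image, batch by batch."""
    scores = []
    for batch in make_batches(patches, batch_size):
        scores.extend(score_batch(batch))
    return scores

if __name__ == "__main__":
    patches = list(range(20000))   # stand-in for 20K image patches
    print(len(score_image(patches)))
```

In the Spark version, I imagine `score_image` being the function mapped over an RDD of images, so the batching stays local to each executor and only the per-image scores come back to the driver.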
Here are some things I have looked into:
bitfusion AMI: https://aws.amazon.com/marketplace/pp/B01DJ93C7Q/ref=sp_mpg_product_title?ie=UTF8&sr=0-13
My question is how to implement a small sample application from start to finish, with the Spark cluster being created automatically while the needed dependencies are installed alongside it through either Docker or an AMI like the one above.
Platform: Ubuntu 14.04
Dependencies: CUDA 7.5, caffenv, libcudnn4, NVIDIA Graphics Driver (346-352)
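For reference, this is roughly the kind of Dockerfile I imagine would capture the dependency stack above. It is only a sketch, not a tested build: the apt package list is the standard set of Caffe build dependencies, the CUDA/cuDNN lines are commented out because the exact package names depend on which NVIDIA repo you register, and a real GPU container also needs the host driver exposed to it.

```dockerfile
# Sketch only -- untested; CUDA/cuDNN package names depend on the
# NVIDIA repo you register, and the host GPU driver (346-352 range)
# must be made visible to the container.
FROM ubuntu:14.04

# Standard Caffe build dependencies, plus Python/NumPy for pycaffe.
RUN apt-get update && apt-get install -y \
    build-essential cmake git wget \
    libprotobuf-dev protobuf-compiler \
    libboost-all-dev libgflags-dev libgoogle-glog-dev \
    libhdf5-serial-dev libleveldb-dev liblmdb-dev \
    libsnappy-dev libopencv-dev \
    python-dev python-numpy python-pip

# CUDA 7.5 toolkit and cuDNN v4 from NVIDIA's own packages
# (register NVIDIA's apt repo first; see their install docs):
# RUN apt-get install -y cuda-toolkit-7-5 libcudnn4

# Build Caffe against CUDA/cuDNN (branch and Makefile flags depend
# on the fork/version in use):
RUN git clone https://github.com/BVLC/caffe.git /opt/caffe
# WORKDIR /opt/caffe
# RUN make all -j"$(nproc)" && make pycaffe
```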