Dockerized Apache Spark 2 that can run as a cluster.
- To build on a local system, build the spark-base image first, then run:
docker build -t univizor/spark-docker .
- Start master and expose web GUI to localhost on 8080:
docker run --rm -it -p 8080:8080 --name spark-master univizor/spark-docker start-master.sh
- Start slave and expose web GUI to localhost on 8081:
docker run --rm -it -p 8081:8081 --name spark-slave --link spark-master univizor/spark-docker start-slave.sh spark://spark-master:7077 -c 4 -m 4G
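Starting a second slave with the exact command above would clash on both the container name and host port 8081. The sketch below shows one way to generate a distinct command per slave. The function name `worker_cmd`, the `spark-slave-N` naming scheme, and the host port scheme (8080 + N) are assumptions made for illustration and are not part of the image. The function only prints the command and does not run anything.

```shell
#!/bin/sh
# Illustrative helper, not shipped with the image.
IMAGE="univizor/spark-docker"
MASTER="spark-master"

# Print the docker command for slave number $1 (1, 2, 3, ...).
# Each slave gets its own container name and host port; inside the
# container the slave web GUI stays on 8081, as in the command above.
worker_cmd() {
    n="$1"
    host_port=$((8080 + n))
    printf 'docker run --rm -it -p %s:8081 --name spark-slave-%s --link %s %s start-slave.sh spark://%s:7077 -c 4 -m 4G' \
        "$host_port" "$n" "$MASTER" "$IMAGE" "$MASTER"
}
```

For example, `worker_cmd 2` prints a command that publishes the second slave's GUI on localhost port 8082. Because the commands use -it, each one would be run in its own terminal, for instance with eval "$(worker_cmd 2)".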
- Run a test Pi calculation from the image on the newly deployed cluster. Watch localhost port 4040 for job details:
docker run --rm -it --link spark-master -p 4040:4040 univizor/spark-docker 'spark-submit --class org.apache.spark.examples.SparkPi --master spark://spark-master:7077 --executor-memory 1024MB --total-executor-cores 2 /home/spark/spark/examples/jars/spark-examples_2.11-2.1.0.jar 1000'
- Alternatively, open a shell in the master container:
docker exec -it spark-master /bin/bash
and run the Pi calculation from there:
spark@26ca5e514ff6:/srv/data$ spark-submit --class org.apache.spark.examples.SparkPi --master spark://$HOSTNAME:7077 --executor-memory 512MB --total-executor-cores 1 $SPARK_HOME/examples/jars/ 10
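The SparkPi submissions above differ mainly in executor memory, total executor cores, and the trailing argument, which SparkPi uses as its number of partitions ("slices"). As a rough sketch, the docker run variant can be parameterized as below. The function name `submit_pi_cmd` is hypothetical; the jar path is the one used in the docker run example, and the function only prints the command so it can be inspected before running.

```shell
#!/bin/sh
# Illustrative helper, not shipped with the image.
# $1 = executor memory (e.g. 1024MB)
# $2 = total executor cores
# $3 = number of slices passed to SparkPi
submit_pi_cmd() {
    jar="/home/spark/spark/examples/jars/spark-examples_2.11-2.1.0.jar"
    # The spark-submit invocation is kept inside single quotes, matching
    # the way the docker run example above passes it to the image.
    printf "docker run --rm -it --link spark-master -p 4040:4040 univizor/spark-docker 'spark-submit --class org.apache.spark.examples.SparkPi --master spark://spark-master:7077 --executor-memory %s --total-executor-cores %s %s %s'" \
        "$1" "$2" "$jar" "$3"
}
```

Calling `submit_pi_cmd 512MB 1 10` prints a smaller job than the 1000-slice example; the printed command can be run with eval "$(submit_pi_cmd 512MB 1 10)".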