Exercise: Running Spark Applications on Hadoop YARN
- Download Apache Hadoop
- Start a single-node YARN cluster
- spark-submit a Spark application to YARN
- Use the binary distribution of Spark with YARN support, i.e. “Pre-built for Apache Hadoop 2.7 and later”
- Use run-example SparkPi
- Ensure that HADOOP_CONF_DIR or YARN_CONF_DIR points to the directory with the (client-side) configuration files for the Hadoop cluster
- Use the YARN web UI to monitor the submitted application
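
The steps above might look roughly like this in a shell session. This is only a sketch under stated assumptions: HADOOP_HOME and SPARK_HOME are assumed to point at the unpacked Hadoop and Spark binary distributions, the client-side configuration is assumed to live in $HADOOP_HOME/etc/hadoop, and the helper names conf_dir_ok and run_sparkpi_on_yarn are hypothetical (they are not part of Spark or Hadoop).

```shell
# Hypothetical helper: succeeds only if the given directory contains
# a yarn-site.xml, i.e. looks like a Hadoop client configuration directory.
conf_dir_ok() {
  [ -n "$1" ] && [ -f "$1/yarn-site.xml" ]
}

# Assumed install locations; adjust to where the archives were unpacked.
HADOOP_HOME="${HADOOP_HOME:-$HOME/hadoop}"
SPARK_HOME="${SPARK_HOME:-$HOME/spark}"

# Spark reads the cluster location from this directory when --master is yarn.
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-$HADOOP_HOME/etc/hadoop}"

# Hypothetical wrapper: start YARN on this machine, then submit SparkPi to it.
run_sparkpi_on_yarn() {
  if ! conf_dir_ok "$HADOOP_CONF_DIR"; then
    echo "no yarn-site.xml found in '$HADOOP_CONF_DIR'" >&2
    return 1
  fi
  # Starts the ResourceManager and NodeManager daemons.
  "$HADOOP_HOME/sbin/start-yarn.sh"
  # run-example passes its options on to spark-submit; 10 is the number of partitions.
  "$SPARK_HOME/bin/run-example" --master yarn SparkPi 10
}
```

Calling run_sparkpi_on_yarn starts the YARN daemons and submits SparkPi. The application should then be listed in the ResourceManager web UI, which listens on port 8088 by default.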
Duration: 30 mins
Useful Links
- Running Spark on YARN