Day 2 / May 10 (Tue)

Introduction to Hadoop YARN

Read the following documents to get familiar with the basics.

  1. Architecture
  2. Commands
  3. Web Services REST API
  4. (optional) Writing YARN Applications
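
As a quick way to try the REST API from the reading list, here is a minimal sketch that lists applications through the ResourceManager's Cluster Applications endpoint (`/ws/v1/cluster/apps`). It assumes the ResourceManager web UI is reachable at `localhost:8088`, which is the usual default for a local single-node setup but may differ on your machine. The function names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def cluster_apps_url(rm_address, states=None):
    """Build the Cluster Applications API URL for a ResourceManager address."""
    url = f"http://{rm_address}/ws/v1/cluster/apps"
    if states:
        # The API accepts a comma-separated list of application states.
        url += "?" + urlencode({"states": ",".join(states)})
    return url


def parse_apps(payload):
    """Extract (id, name, state) tuples from the JSON response body."""
    body = json.loads(payload)
    # With no matching applications the "apps" field can be null.
    apps = (body.get("apps") or {}).get("app") or []
    return [(app["id"], app["name"], app["state"]) for app in apps]


def list_apps(rm_address="localhost:8088", states=None):
    """Fetch applications from a running ResourceManager (address is an assumption)."""
    request = Request(
        cluster_apps_url(rm_address, states),
        headers={"Accept": "application/json"},
    )
    with urlopen(request) as response:
        return parse_apps(response.read().decode("utf-8"))
```

With the cluster running, `list_apps(states=["RUNNING"])` should return the applications that are currently executing, which is handy for checking the Spark exercise below.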

Exercise: Spark on YARN

  1. Read Running Spark on YARN
  2. Use the Spark SQL application that you created yesterday (that loads CSV files from an HDFS directory) and deploy it to your local Hadoop YARN cluster
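
The deployment step can be sketched as a small helper that assembles the `spark-submit` command line. This is only an illustration: the jar path, main class, and HDFS directory below are placeholders, not the names from your project. `--master yarn` and `--deploy-mode` are standard `spark-submit` options; Spark also expects `HADOOP_CONF_DIR` or `YARN_CONF_DIR` to point at the client-side Hadoop configuration so it can find the ResourceManager.

```python
import shlex


def spark_submit_command(app_jar, main_class, csv_dir, deploy_mode="cluster"):
    """Return the argv list for submitting a Spark application to YARN."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError(f"unsupported deploy mode: {deploy_mode}")
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", deploy_mode,
        "--class", main_class,
        app_jar,
        csv_dir,  # passed to the application as its first argument
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your own jar, class, and directory.
    command = spark_submit_command(
        app_jar="target/scala-2.12/csv-loader.jar",
        main_class="CsvLoaderApp",
        csv_dir="hdfs:///user/me/csv",
    )
    print(shlex.join(command))
```

In `cluster` mode the driver runs inside the YARN ApplicationMaster, so the application's output ends up in the YARN container logs rather than in your terminal. In `client` mode the driver stays on the machine where you ran `spark-submit`.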

Code Review

  1. https://github.com/JKulczynski/Docker-CommandLine-App
  2. https://github.com/rafalkac02/directory-traverser

(optional) Exercise: Spark on YARN on Docker

  1. Read Launching Applications Using Docker Containers
  2. Deploy the Spark SQL application to the Hadoop YARN cluster on Docker
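
For the Docker variant, YARN selects the container runtime through environment variables set on the containers it launches. The sketch below builds the extra `--conf` arguments that pass `YARN_CONTAINER_RUNTIME_TYPE` and `YARN_CONTAINER_RUNTIME_DOCKER_IMAGE` to both the ApplicationMaster and the executors, using Spark's `spark.yarn.appMasterEnv.*` and `spark.executorEnv.*` settings. The image name is a placeholder, and the helper assumes the NodeManagers already have the Docker runtime enabled as described in the reading.

```python
def docker_runtime_conf(image):
    """Return spark-submit --conf arguments that request YARN's Docker runtime."""
    env = {
        "YARN_CONTAINER_RUNTIME_TYPE": "docker",
        "YARN_CONTAINER_RUNTIME_DOCKER_IMAGE": image,
    }
    args = []
    # Set the variables for the ApplicationMaster and for every executor.
    for prefix in ("spark.yarn.appMasterEnv", "spark.executorEnv"):
        for name, value in env.items():
            args += ["--conf", f"{prefix}.{name}={value}"]
    return args


if __name__ == "__main__":
    # Placeholder image; it must contain a Java runtime compatible with Spark.
    for arg in docker_runtime_conf("example/spark-runtime:latest"):
        print(arg)
```

These arguments go between `spark-submit` and the application jar, alongside `--master yarn`.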