DataSource API
Apache Spark 2 / Spark SQL
Agenda
- DataSource API Overview
- Input and Output
- File Formats (JSON, Text, CSV, Parquet, ORC)
- Hive Tables
- JDBC
- Using DataSource API in Scala
- Reading data from PostgreSQL (JDBC)
- Running an ML experiment with a JSON data source