Day 9 / Apr 14 (Thu)
Back to Scala with a bit of Spark SQL.
The following is a list of exercises to help you hone your skills in Scala (with some Spark SQL). You are expected to work through the exercises on your own. When you are done, push your projects to GitHub.
The exercises are in no particular order, and there is no need to do them all. Pick whichever ones you like.
Use Slack to ask questions. You can DM me directly or use the #scala-academy channel, whichever you prefer.
Enjoy!
Exercises
These exercises are about Spark SQL.
- Selecting the most important rows per assigned priority
- Exercise: Reverse-engineering Dataset.show Output
- Exercise: Specifying Table and SQL Query on Command Line
Scala Project: Node
Write a class Node that can have zero, one, or more Node children. The class should support adding a child Node, removing a child, and listing all children.
The most challenging part is the display method, which should display a Node together with all its children (which may in turn have Node children of their own that should be displayed, too).
A sample display could look like the following:
AdaptiveSparkPlan
+- Union
   :- HashAggregate_1
   :  +- Exchange
   :     +- HashAggregate
   :        +- Project
   :           +- Range
   +- HashAggregate_2
      +- Exchange
         +- HashAggregate
            +- Project
               +- Range
The above shows a Node (called AdaptiveSparkPlan) with one child (Union) that has two children, HashAggregate_1 and HashAggregate_2, each of which has a chain of descendants of its own.
Write unit tests.
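
If you want a starting point, here is a minimal sketch of one possible shape for the class, not a reference solution. Everything in it is an assumption made for illustration: the `name` field, the method names `add`, `remove`, `children` and `display`, the choice of a mutable `ArrayBuffer`, and the decision to have `display` return a `String` rather than print (which tends to make unit testing easier). The prefix rule (`+- ` for the last child, `:- ` for any other child, and three-character indents for ancestors) is inferred from the sample above.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical design: a mutable tree node identified by a name.
final class Node(val name: String) {
  private val kids = ArrayBuffer.empty[Node]

  // Appends a child and returns this node so that calls can be chained.
  def add(child: Node): Node = { kids += child; this }

  // Removes the first occurrence of the given child.
  // Node does not override equals, so the match is by reference.
  // Returns true if the child was present.
  def remove(child: Node): Boolean = {
    val i = kids.indexOf(child)
    if (i >= 0) { kids.remove(i); true } else false
  }

  // Immutable snapshot of the direct children, in insertion order.
  def children: List[Node] = kids.toList

  // Renders this node and all of its descendants, one node per line.
  def display: String = render(Vector.empty).mkString("\n")

  // flags holds one Boolean per level below the root on the path to this
  // node: true if the node on that level is the last child of its parent.
  private def render(flags: Vector[Boolean]): Vector[String] = {
    // Ancestors contribute an indent: blank if that ancestor was a last
    // child, a ":" rail if more of its siblings follow further down.
    val indent = flags.dropRight(1).map(last => if (last) "   " else ":  ").mkString
    // The node itself contributes the branch marker (none for the root).
    val branch = flags.lastOption.map(last => if (last) "+- " else ":- ").getOrElse("")
    val below = kids.zipWithIndex.flatMap { case (child, i) =>
      child.render(flags :+ (i == kids.size - 1))
    }
    (indent + branch + name) +: below.toVector
  }
}
```

With this sketch, a tree is built by chaining calls, for example `new Node("Plan").add(new Node("A").add(new Node("Range"))).add(new Node("B"))`, and `display` on the result yields four lines: `Plan`, `:- A`, `:  +- Range`, `+- B`. Comparing such strings against expected output is a natural first unit test; removing a child and displaying again is a good second one.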