Day 9 / Apr 14 (Thu)
Back to Scala with a bit of Spark SQL.
The following is a list of exercises to help you hone your skills in Scala (with some Spark SQL). You are supposed to do the exercises on your own. At the end, push your projects to GitHub.
The exercises are in no particular order and there is no need to do them all. Pick whichever ones you like.
Use Slack to ask questions. You can DM me directly or use the #scala-academy channel, whichever you prefer.
Enjoy!
Exercises
These exercises are about Spark SQL.
- Selecting the most important rows per assigned priority
- Exercise: Reverse-engineering Dataset.show Output
- Exercise: Specifying Table and SQL Query on Command Line
Scala Project: Node
Write a class Node that can have zero, one or more Node children. The class should support adding, removing and listing child Nodes.
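If you are not sure where to start, here is one possible shape for the class. It is only a sketch: the method names addChild, removeChild and children are my assumptions (the exercise does not prescribe an API), and it uses a mutable buffer for simplicity. An immutable design would work just as well.

```scala
import scala.collection.mutable.ListBuffer

// A minimal mutable tree node. The names addChild, removeChild and
// children are assumptions; the exercise does not prescribe an API.
final class Node(val name: String) {
  private val kids = ListBuffer.empty[Node]

  /** Appends a child and returns this node so calls can be chained. */
  def addChild(child: Node): Node = {
    kids += child
    this
  }

  /** Removes the given child (compared by reference); true if it was present. */
  def removeChild(child: Node): Boolean = {
    val idx = kids.indexWhere(_ eq child)
    if (idx >= 0) { kids.remove(idx); true } else false
  }

  /** Immutable snapshot of the children, in insertion order. */
  def children: List[Node] = kids.toList
}
```

Returning this from addChild lets you build a small tree in one expression, e.g. new Node("Union").addChild(new Node("HashAggregate_1")). Removal compares by reference so that two distinct nodes with the same name are not confused.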
The most challenging part is the display method, which should display a Node with all its children (which in turn may have Node children that are supposed to be displayed, too).
A sample display could look like the following:
AdaptiveSparkPlan
+- Union
   :- HashAggregate_1
   :  +- Exchange
   :     +- HashAggregate
   :        +- Project
   :           +- Range
   +- HashAggregate_2
      +- Exchange
         +- HashAggregate
            +- Project
               +- Range
The above shows a Node (called AdaptiveSparkPlan) with one child (Union) that has two children, HashAggregate_1 and HashAggregate_2, and so on.
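The trick to this kind of display is that every node passes an indentation prefix down to its children: a child that still has siblings below it contributes a ":" rail, while the last child contributes plain spaces. Below is a hedged sketch of that idea. TreeRenderer, render and the Plan case class are hypothetical names of mine, and the renderer is written against two functions (how to get a name, how to get the children) so it should fit whatever Node class you end up with.

```scala
// Hypothetical demo type; substitute your own Node class.
final case class Plan(name: String, children: List[Plan] = Nil)

object TreeRenderer {
  /** Renders `root` and all its descendants, one node per line. */
  def render[A](root: A)(name: A => String, children: A => Seq[A]): String = {
    val out = new StringBuilder

    // `prefix` is the indentation inherited from the ancestors of `node`.
    def loop(node: A, prefix: String, isLast: Boolean, isRoot: Boolean): Unit = {
      // The root has no connector; the last child gets "+- ", others ":- ".
      val connector = if (isRoot) "" else if (isLast) "+- " else ":- "
      out.append(prefix).append(connector).append(name(node)).append('\n')

      // Below a non-last child keep a ':' rail so later siblings stay connected.
      val childPrefix =
        if (isRoot) "" else prefix + (if (isLast) "   " else ":  ")

      val kids = children(node)
      kids.zipWithIndex.foreach { case (kid, i) =>
        loop(kid, childPrefix, isLast = i == kids.size - 1, isRoot = false)
      }
    }

    loop(root, prefix = "", isLast = true, isRoot = true)
    out.toString
  }
}
```

You would call it as TreeRenderer.render(plan)(_.name, _.children). Returning a String instead of printing directly is a deliberate choice: it makes the output easy to compare against an expected value in a unit test.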
Write unit tests.