Write a structured query that calculates total and average salaries by department and company-wide (using rollup multi-dimentional aggregation operator).
Module: Spark SQL
Duration: 15 mins
id,name,department,salary
1,Hunter Fields,IT,15
2,Leonard Lewis,Support,81
3,Jason Dawson,Support,90
4,Andre Grant,Support,25
5,Earl Walton,IT,40
6,Alan Hanson,IT,24
7,Clyde Matthews,Support,31
8,Josephine Leonard,Support,1
9,Owen Boone,HR,27
10,Max McBride,IT,75
val salaries = spark
.read
.option("header", true)
.option("inferSchema", true)
.csv("salaries.csv")
scala> salaries.show
+---+-----------------+----------+------+
| id| name|department|salary|
+---+-----------------+----------+------+
| 1| Hunter Fields| IT| 15|
| 2| Leonard Lewis| Support| 81|
| 3| Jason Dawson| Support| 90|
| 4| Andre Grant| Support| 25|
| 5| Earl Walton| IT| 40|
| 6| Alan Hanson| IT| 24|
| 7| Clyde Matthews| Support| 31|
| 8|Josephine Leonard| Support| 1|
| 9| Owen Boone| HR| 27|
| 10| Max McBride| IT| 75|
+---+-----------------+----------+------+
NOTE: Use Online Generate Test Data for more sophisticated datasets in CSV or JSON format.
+----------+---+----+
|department|sum| avg|
+----------+---+----+
| null|409|40.9|
| Support|228|45.6|
| IT|154|38.5|
| HR| 27|27.0|
+----------+---+----+