Write a structured query (using spark-shell or Databricks Community Edition) that computes multiple aggregations per group, i.e. the maximum and the minimum of the id column per group.
Module: Spark SQL
Duration: 15 mins
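// create a two-column input dataset; 'id % 2 puts even ids in group 0 and odd ids in group 1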
val nums = spark.range(5).withColumn("group", 'id % 2)
scala> nums.show
+---+-----+
| id|group|
+---+-----+
| 0| 0|
| 1| 1|
| 2| 0|
| 3| 1|
| 4| 0|
+---+-----+
The expected output is:

+-----+------+------+
|group|max_id|min_id|
+-----+------+------+
| 0| 4| 0|
| 1| 3| 1|
+-----+------+------+
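One possible solution sketch (not the only correct query): groupBy followed by agg accepts several aggregate expressions at once, so both aggregations are computed in a single pass. The value name solution and the aliases max_id and min_id are choices made here to match the expected output, not requirements of the API.

import org.apache.spark.sql.functions.{max, min}

// compute both aggregations per group and alias the columns
// so they match the expected output schema
val solution = nums
  .groupBy("group")
  .agg(max("id") as "max_id", min("id") as "min_id")
  .orderBy("group")

scala> solution.show
+-----+------+------+
|group|max_id|min_id|
+-----+------+------+
|    0|     4|     0|
|    1|     3|     1|
+-----+------+------+

An equivalent query can be written in SQL after registering the dataset as a temporary view (the view name nums is an assumption; the backticks guard against group being parsed as a keyword):

nums.createOrReplaceTempView("nums")
spark.sql("SELECT `group`, max(id) AS max_id, min(id) AS min_id FROM nums GROUP BY `group`").show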