Write a structured query that limits collect_set standard function.
Module: Spark SQL
Duration: 15 mins
val input = spark.range(50).withColumn("key", $"id" % 5)
scala> input.show
+---+---+
| id|key|
+---+---+
| 0| 0|
| 1| 1|
| 2| 2|
| 3| 3|
| 4| 4|
| 5| 0|
| 6| 1|
| 7| 2|
| 8| 3|
| 9| 4|
| 10| 0|
| 11| 1|
| 12| 2|
| 13| 3|
| 14| 4|
| 15| 0|
| 16| 1|
| 17| 2|
| 18| 3|
| 19| 4|
+---+---+
only showing top 20 rows
scala> solution.show(truncate = false)
+---+--------------------------------------+----------------+
|key|all |only_first_three|
+---+--------------------------------------+----------------+
|0 |[0, 15, 30, 45, 5, 20, 35, 10, 25, 40]|[0, 15, 30] |
|1 |[1, 16, 31, 46, 6, 21, 36, 11, 26, 41]|[1, 16, 31] |
|3 |[33, 48, 13, 38, 3, 18, 28, 43, 8, 23]|[33, 48, 13] |
|2 |[12, 27, 37, 2, 17, 32, 42, 7, 22, 47]|[12, 27, 37] |
|4 |[9, 19, 34, 49, 24, 39, 4, 14, 29, 44]|[9, 19, 34] |
+---+--------------------------------------+----------------+