Write a structured query (using spark-shell or Databricks Community Edition) that creates as many rows as there are elements in a given array column. The values of the new rows should be the elements of the array itself.
Protip™: Use the Dataset.flatMap operator
Module: Spark SQL
Duration: 30 mins
scala> val nums = Seq(Seq(1,2,3)).toDF("nums")
scala> nums.printSchema
root
|-- nums: array (nullable = true)
| |-- element: integer (containsNull = false)
scala> nums.show
+---------+
| nums|
+---------+
|[1, 2, 3]|
+---------+
Expected output:
+---------+---+
| nums|num|
+---------+---+
|[1, 2, 3]| 1|
|[1, 2, 3]| 2|
|[1, 2, 3]| 3|
+---------+---+
Please note that the expected output has two columns, not just one!
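One possible solution, following the protip, is sketched below. It views the single array column as a typed `Dataset[Seq[Int]]` and uses `Dataset.flatMap` to emit one `(array, element)` pair per element; the column names in `toDF` are chosen to match the expected output. This is a sketch, not the only valid answer.

```scala
// Assumes the nums DataFrame defined above and spark.implicits._
// (imported automatically in spark-shell).
val solution = nums
  .as[Seq[Int]]                        // typed view of the single array column
  .flatMap(ns => ns.map(n => (ns, n))) // one (whole array, element) pair per element
  .toDF("nums", "num")                 // names matching the expected output

solution.show
```

Alternatively, the built-in `explode` standard function gives the same result without leaving the untyped API, e.g. `nums.select($"nums", explode($"nums") as "num")`, but the point of this exercise is to practice `flatMap`.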