spark-workshop

Exercise: Converting Arrays of Strings to String

Based on "How to convert column of arrays of strings to strings?"

Module: Spark SQL

Input Dataset

val words = Seq(Array("hello", "world")).toDF("words")
scala> words.show
+--------------+
|         words|
+--------------+
|[hello, world]|
+--------------+

scala> words.printSchema
root
 |-- words: array (nullable = true)
 |    |-- element: string (containsNull = true)

Expected Dataset

scala> solution.show
+--------------+-----------+
|         words|   solution|
+--------------+-----------+
|[hello, world]|hello world|
+--------------+-----------+

scala> solution.printSchema
root
 |-- words: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- solution: string (nullable = false)

Duration: 15 mins

Protips

  1. Use the concat_ws function
    • Concatenates multiple input string columns into a single string column, using the given separator; columns of type array of strings are accepted as inputs too
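The tip above can be sketched as follows. This is one possible approach, not necessarily the reference solution. It assumes a local Spark session (in spark-shell, the existing `spark` session is reused by `getOrCreate`) and a single space as the separator; the application name is made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.concat_ws

// Assumption: a local session; inside spark-shell getOrCreate returns the existing one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("concat_ws-sketch") // hypothetical name, for illustration only
  .getOrCreate()
import spark.implicits._

// Same input dataset as in the exercise.
val words = Seq(Array("hello", "world")).toDF("words")

// concat_ws joins the elements of the array column with the given separator
// (assumed here to be a single space, to match the expected dataset).
val solution = words.withColumn("solution", concat_ws(" ", $"words"))

solution.show()
solution.printSchema()
```

Compare the output of `show` and `printSchema` with the Expected Dataset section to check the result.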