InternalRow — Abstract Binary Row Format
NOTE: InternalRow is also called a Catalyst row or a Spark SQL row.
NOTE: UnsafeRow is a concrete InternalRow.
[source, scala]
----
// The type of your business objects
case class Person(id: Long, name: String)

// The encoder for Person objects
import org.apache.spark.sql.Encoders
val personEncoder = Encoders.product[Person]

// The expression encoder for Person objects
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
val personExprEncoder = personEncoder.asInstanceOf[ExpressionEncoder[Person]]

// Convert Person objects to InternalRow
scala> val row = personExprEncoder.toRow(Person(0, "Jacek"))
row: org.apache.spark.sql.catalyst.InternalRow = [0,0,1800000005,6b6563614a]

// How many fields are available in Person's InternalRow?
scala> row.numFields
res0: Int = 2

// Are there any NULLs in this InternalRow?
scala> row.anyNull
res1: Boolean = false

// You can create your own InternalRow objects
import org.apache.spark.sql.catalyst.InternalRow

scala> val ir = InternalRow(5, "hello", (0, "nice"))
ir: org.apache.spark.sql.catalyst.InternalRow = [5,hello,(0,nice)]
----
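An ExpressionEncoder can also go the other way and decode an InternalRow back into a Person. The following is a minimal sketch that assumes the Spark 2.x encoder API (resolveAndBind and fromRow):

[source, scala]
----
// A minimal sketch (assuming the Spark 2.x ExpressionEncoder API)
// resolveAndBind resolves and binds the deserializer expressions
// to the encoder's own schema
val boundEncoder = personExprEncoder.resolveAndBind()

// fromRow is the counterpart of toRow and rebuilds the business object
val person = boundEncoder.fromRow(row)
// person: Person = Person(0,Jacek)
----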
You can also create InternalRow objects using the factory methods of the InternalRow object.
[source, scala]
----
import org.apache.spark.sql.catalyst.InternalRow

scala> InternalRow.empty
res0: org.apache.spark.sql.catalyst.InternalRow = [empty row]

scala> InternalRow(0, "string", (0, "pair"))
res1: org.apache.spark.sql.catalyst.InternalRow = [0,string,(0,pair)]

scala> InternalRow.fromSeq(Seq(0, "string", (0, "pair")))
res2: org.apache.spark.sql.catalyst.InternalRow = [0,string,(0,pair)]
----
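The factory methods above give you a generic, object-backed InternalRow. To get the binary UnsafeRow mentioned in the note at the top, you can run such a row through an UnsafeProjection. The following is a sketch only, with a made-up schema and field values for illustration:

[source, scala]
----
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.UnsafeProjection
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}
import org.apache.spark.unsafe.types.UTF8String

// A hypothetical schema matching the row below
val schema = StructType(
  StructField("id", LongType) ::
  StructField("name", StringType) :: Nil)

// String fields of an InternalRow are kept as UTF8String values
val row = InternalRow(0L, UTF8String.fromString("Jacek"))

// UnsafeProjection encodes any InternalRow into the binary UnsafeRow format
val toUnsafe = UnsafeProjection.create(schema)
val unsafeRow = toUnsafe(row)
// unsafeRow is an org.apache.spark.sql.catalyst.expressions.UnsafeRow
----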
=== [[getString]] getString Method
CAUTION: FIXME
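While the details are yet to be described, here is a minimal sketch of how getString is expected to behave, assuming the string field is stored as a UTF8String (which is how the encoders above store strings):

[source, scala]
----
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.unsafe.types.UTF8String

// A hypothetical two-field row with the name stored as UTF8String
val row = InternalRow(0L, UTF8String.fromString("Jacek"))

// getString reads the field at the given ordinal (position) as a Java String
val name = row.getString(1)
// name: String = Jacek
----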