
InternalRow — Abstract Binary Row Format

NOTE: InternalRow is also called Catalyst row or Spark SQL row.

NOTE: UnsafeRow is a concrete InternalRow.

[source, scala]
----
// The type of your business objects
case class Person(id: Long, name: String)

// The encoder for Person objects
import org.apache.spark.sql.Encoders
val personEncoder = Encoders.product[Person]

// The expression encoder for Person objects
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
val personExprEncoder = personEncoder.asInstanceOf[ExpressionEncoder[Person]]

// Convert Person objects to InternalRow
scala> val row = personExprEncoder.toRow(Person(0, "Jacek"))
row: org.apache.spark.sql.catalyst.InternalRow = [0,0,1800000005,6b6563614a]

// How many fields are available in Person's InternalRow?
scala> row.numFields
res0: Int = 2

// Are there any NULLs in this InternalRow?
scala> row.anyNull
res1: Boolean = false

// You can create your own InternalRow objects
import org.apache.spark.sql.catalyst.InternalRow

scala> val ir = InternalRow(5, "hello", (0, "nice"))
ir: org.apache.spark.sql.catalyst.InternalRow = [5,hello,(0,nice)]
----
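
The row that toRow gives back can also be decoded back to a Person. The following is a minimal sketch (assuming the Spark 2.x ExpressionEncoder API, where fromRow requires an encoder that has first been resolved and bound to the schema):

[source, scala]
----
import org.apache.spark.sql.catalyst.expressions.UnsafeRow

// The encoded row is backed by the binary (unsafe) format
val isUnsafe = row.isInstanceOf[UnsafeRow]

// Decoding requires an encoder that is resolved and bound to Person's schema
val boundEncoder = personExprEncoder.resolveAndBind()

// fromRow turns the InternalRow back into a Person, i.e. Person(0, "Jacek")
val person = boundEncoder.fromRow(row)
----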


You can also create InternalRow objects using the factory methods of the InternalRow object, e.g. empty, apply and fromSeq.

[source, scala]
----
import org.apache.spark.sql.catalyst.InternalRow

scala> InternalRow.empty
res0: org.apache.spark.sql.catalyst.InternalRow = [empty row]

scala> InternalRow(0, "string", (0, "pair"))
res1: org.apache.spark.sql.catalyst.InternalRow = [0,string,(0,pair)]

scala> InternalRow.fromSeq(Seq(0, "string", (0, "pair")))
res2: org.apache.spark.sql.catalyst.InternalRow = [0,string,(0,pair)]
----
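
Field values are read back with positional, typed getters (e.g. getInt, getLong, getUTF8String). Note that the getters expect values in Spark SQL's internal representation, so strings have to be stored as UTF8String rather than java.lang.String. A minimal sketch, assuming the UTF8String helper from org.apache.spark.unsafe.types:

[source, scala]
----
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.unsafe.types.UTF8String

// Store values using the internal representations (e.g. UTF8String for strings)
val row = InternalRow(5, UTF8String.fromString("hello"))

// Typed, positional getters read the fields back
val id = row.getInt(0)                   // 5
val text = row.getUTF8String(1).toString // "hello"
----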


=== [[getString]] getString Method

CAUTION: FIXME
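
As a sketch of the contract (not the internals), getString(ordinal) is expected to return the field at the given position as a java.lang.String, converting from the stored UTF8String. Assuming a row built with internal values as in the earlier example:

[source, scala]
----
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.unsafe.types.UTF8String

val row = InternalRow(5, UTF8String.fromString("hello"))

// getString reads the field at the given ordinal as a java.lang.String
val s = row.getString(1) // "hello"
----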

