InMemoryFileIndex

InMemoryFileIndex is a PartitioningAwareFileIndex.

Creating Instance

InMemoryFileIndex takes the following to be created:

  • SparkSession
  • Root Paths (as Hadoop Paths)
  • Parameters (Map[String, String])
  • User-Defined Schema (Option[StructType])
  • FileStatusCache (default: NoopCache)
  • User-Defined Partition Spec (default: undefined)
  • metadataOpsTimeNs (Option[Long], default: undefined)

While being created, InMemoryFileIndex runs refresh0.

InMemoryFileIndex is created when:

Refreshing Cached File Listings

refresh(): Unit

refresh requests the FileStatusCache to invalidateAll and then runs refresh0.

refresh is part of the FileIndex abstraction.
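
The invalidate-then-relist order can be illustrated with a minimal, Spark-free sketch. ToyFileStatusCache, ToyFileIndex, listFiles and allFiles are hypothetical names made up for this example, not Spark classes. The body of refresh0 below (re-list every root path, consulting the cache first) is an assumption for illustration only, since what refresh0 actually does is not described here.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a file status cache: root path -> listed file names.
final class ToyFileStatusCache {
  private val entries = mutable.Map.empty[String, Seq[String]]
  def get(path: String): Option[Seq[String]] = entries.get(path)
  def put(path: String, files: Seq[String]): Unit = entries.update(path, files)
  def invalidateAll(): Unit = entries.clear()
}

// Hypothetical index; `listFiles` plays the role of a file-system listing call.
final class ToyFileIndex(
    rootPaths: Seq[String],
    cache: ToyFileStatusCache,
    listFiles: String => Seq[String]) {

  private var cachedListing: Map[String, Seq[String]] = Map.empty

  // The index lists its files once while being created.
  refresh0()

  // Drop every cached entry first, then rebuild the listing.
  def refresh(): Unit = {
    cache.invalidateAll()
    refresh0()
  }

  // ASSUMPTION: re-list each root path, reusing a cached listing when present.
  private def refresh0(): Unit = {
    cachedListing = rootPaths.map { path =>
      val files = cache.get(path).getOrElse {
        val listed = listFiles(path)
        cache.put(path, listed)
        listed
      }
      path -> files
    }.toMap
  }

  def allFiles: Seq[String] =
    rootPaths.flatMap(path => cachedListing.getOrElse(path, Seq.empty))
}
```

In this sketch, files that appear after the index was created stay invisible until refresh is called. Running refresh0 without invalidating first would return the same stale listing from the cache, which is why the invalidation has to come first.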

Refreshing Cached File Listings (Internal)

refresh0(): Unit

refresh0...FIXME

refresh0 is used when InMemoryFileIndex is created and requested to refresh.

Root Paths

rootPaths: Seq[Path]

The root paths with streaming metadata directories and files (e.g. _spark_metadata) filtered out.

rootPaths is part of the FileIndex abstraction.
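
The filtering can be sketched without Spark or Hadoop on the classpath. RootPathFilter, isUnderMetadataDir and filterRootPaths are hypothetical names, and java.nio.file.Path stands in for a Hadoop Path. The rule itself (drop a path when it or any of its ancestors is named _spark_metadata) is an assumption about what "filtered out" means and may differ from what Spark does. The example assumes Scala 2.13 or later for scala.jdk.CollectionConverters.

```scala
import java.nio.file.{Path, Paths}
import scala.jdk.CollectionConverters._

object RootPathFilter {
  // Directory name used for streaming metadata, as mentioned above.
  val MetadataDirName = "_spark_metadata"

  // True if the path itself or any ancestor directory is named _spark_metadata.
  // Path.iterator() walks the name elements from the root-most to the last one.
  def isUnderMetadataDir(path: Path): Boolean =
    path.iterator().asScala.exists(_.toString == MetadataDirName)

  // Keeps only the paths that are not part of a streaming metadata directory.
  def filterRootPaths(paths: Seq[Path]): Seq[Path] =
    paths.filterNot(isUnderMetadataDir)
}
```

For example, given /data/events, /data/events/_spark_metadata and /data/events/_spark_metadata/0, only /data/events is kept.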