Spark config parameters
spark.storage.unrollMemoryThreshold
In MemoryStore.scala:

```scala
// Initial memory to request before unrolling any block
private val unrollMemoryThreshold: Long =
  conf.getLong("spark.storage.unrollMemoryThreshold", 1024 * 1024)
```
This threshold is the initial amount of unroll memory requested when an iterator is put into the memory store, i.e. how much memory is reserved before the store starts consuming and materializing the iterator's values. It is used in functions such as `private[storage] def putIteratorAsBytes[T]` and `private[storage] def putIteratorAsValues[T]`.
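For reference, the 1 MB default above can be overridden like any other Spark configuration entry, for example when building the `SparkConf`. A minimal sketch (the 4 MB value and the app/master names are arbitrary placeholders, not recommendations):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: raise the initial unroll request from the 1 MB default to 4 MB.
// Spark configuration values are passed as strings.
val conf = new SparkConf()
  .setAppName("unroll-threshold-example")
  .setMaster("local[*]")
  .set("spark.storage.unrollMemoryThreshold", (4L * 1024 * 1024).toString)

val sc = new SparkContext(conf)
```

The doc comment of `putIteratorAsBytes` in MemoryStore.scala describes the unrolling behavior that this threshold bootstraps: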
```scala
/**
 * Attempt to put the given block in memory store as bytes.
 *
 * It's possible that the iterator is too large to materialize and store in memory. To avoid
 * OOM exceptions, this method will gradually unroll the iterator while periodically checking
 * whether there is enough free memory. If the block is successfully materialized, then the
 * temporary unroll memory used during the materialization is "transferred" to storage memory,
 * so we won't acquire more memory than is actually needed to store the block.
 *
 * @return in case of success, the estimated size of the stored data. In case of failure,
 *         return a handle which allows the caller to either finish the serialization by
 *         spilling to disk or to deserialize the partially-serialized block and reconstruct
 *         the original input iterator. The caller must either fully consume this result
 *         iterator or call `discard()` on it in order to free the storage memory consumed by the
 *         partially-unrolled block.
 */
private[storage] def putIteratorAsBytes[T](
```
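The doc comment describes a gradual-unroll loop: reserve `unrollMemoryThreshold` bytes up front, then keep consuming the iterator while periodically checking whether the materialized values still fit, requesting more memory as needed. The sketch below only illustrates that control flow and is not the actual MemoryStore code; `requestMemory`, `estimateSize`, the check period, and the growth factor are made-up placeholders:

```scala
// Illustrative sketch of the "gradual unroll" idea described above.
// NOT the real MemoryStore implementation; requestMemory, estimateSize,
// checkPeriod and growthFactor are placeholders for this example.
def unrollSketch[T](
    values: Iterator[T],
    initialRequest: Long,                  // e.g. spark.storage.unrollMemoryThreshold
    requestMemory: Long => Boolean,        // ask the memory manager for extra bytes
    estimateSize: Vector[T] => Long): Either[Vector[T], Long] = {

  var keepUnrolling = requestMemory(initialRequest) // initial reservation
  var reserved = initialRequest
  var buffer = Vector.empty[T]
  var count = 0L
  val checkPeriod = 16   // re-check memory every N elements
  val growthFactor = 1.5 // over-request to amortize the checks

  while (values.hasNext && keepUnrolling) {
    buffer = buffer :+ values.next()
    count += 1
    if (count % checkPeriod == 0) {
      val currentSize = estimateSize(buffer)
      if (currentSize >= reserved) {
        // Ask for more memory before materializing further values.
        val extra = (currentSize * growthFactor - reserved).toLong
        keepUnrolling = requestMemory(extra)
        if (keepUnrolling) reserved += extra
      }
    }
  }

  if (keepUnrolling && !values.hasNext) {
    Right(estimateSize(buffer)) // success: return the estimated size of the block
  } else {
    Left(buffer)                // failure: partially-unrolled values; the caller may spill to disk
  }
}
```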