Cache Vs Persist - ignacio-alorre/Spark GitHub Wiki

With cache() you can only use the default storage level:

MEMORY_ONLY for RDD
MEMORY_AND_DISK for Dataset
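
A minimal sketch of how these defaults could be checked, assuming a local SparkSession and a Spark version (2.1 or later) where `Dataset.storageLevel` is available. The object and method names (`CacheDefaults`, `defaultLevels`) are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object CacheDefaults {
  // Hypothetical helper: returns the storage levels Spark reports
  // after a plain cache() call on an RDD and on a Dataset.
  def defaultLevels(spark: SparkSession): (StorageLevel, StorageLevel) = {
    import spark.implicits._

    val rdd = spark.sparkContext.parallelize(1 to 100).cache()
    val ds = (1 to 100).toDS().cache()

    val levels = (rdd.getStorageLevel, ds.storageLevel)

    // Release the cached data once the levels have been read.
    rdd.unpersist()
    ds.unpersist()
    levels
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption for trying this out on one machine.
    val spark = SparkSession.builder()
      .appName("cache-defaults")
      .master("local[*]")
      .getOrCreate()

    val (rddLevel, dsLevel) = defaultLevels(spark)
    println(s"RDD level:     ${rddLevel.description}")
    println(s"Dataset level: ${dsLevel.description}")

    spark.stop()
  }
}
```

The returned levels can be compared against `StorageLevel.MEMORY_ONLY` and `StorageLevel.MEMORY_AND_DISK` to confirm the defaults listed above.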

With persist() you can specify which storage level you want for both RDD and Dataset, for example:

MEMORY_ONLY
MEMORY_ONLY_SER
MEMORY_AND_DISK
MEMORY_AND_DISK_SER
DISK_ONLY
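
As a rough illustration of choosing a level explicitly, the sketch below passes a `StorageLevel` to persist() on an RDD and on a Dataset. It assumes a local SparkSession; `PersistLevels` and `explicitLevels` are hypothetical names, and the two levels used are picked arbitrarily from the list above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object PersistLevels {
  // Hypothetical helper: persists an RDD and a Dataset with explicit
  // storage levels and returns the levels Spark reports for them.
  def explicitLevels(spark: SparkSession): (StorageLevel, StorageLevel) = {
    import spark.implicits._

    val rdd = spark.sparkContext.parallelize(1 to 100)
      .persist(StorageLevel.MEMORY_AND_DISK_SER)
    val ds = (1 to 100).toDS().persist(StorageLevel.DISK_ONLY)

    // Persistence is lazy: the data is stored when an action first runs.
    rdd.count()
    ds.count()

    val levels = (rdd.getStorageLevel, ds.storageLevel)

    // Free the stored data when it is no longer needed.
    rdd.unpersist()
    ds.unpersist()
    levels
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption for trying this out on one machine.
    val spark = SparkSession.builder()
      .appName("persist-levels")
      .master("local[*]")
      .getOrCreate()

    val (rddLevel, dsLevel) = explicitLevels(spark)
    println(s"RDD level:     ${rddLevel.description}")
    println(s"Dataset level: ${dsLevel.description}")

    spark.stop()
  }
}
```

Note that an RDD that already has a storage level cannot be given a different one; it has to be unpersisted first.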

Further reading:

https://sparkbyexamples.com/spark/spark-difference-between-cache-and-persist/