Spark Application Programming - salmanbaig8/imp GitHub Wiki

URL: https://courses.cognitiveclass.ai/courses/course-v1:BigDataUniversity+BD0211EN+2016/courseware/dd62d6844b714d279e5945bb7fe44459/641ba0e6e6c1452d955463f6043b5807/

Spark Context: The main entry point for Spark functionality. It represents the connection to a Spark cluster and is used to create RDDs, accumulators, and broadcast variables on that cluster.
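As a rough illustration of the Spark Context's role, here is a minimal PySpark sketch. The helper names `make_conf_settings` and `create_context`, the default master `local[2]`, and the application name are assumptions made for this example, not part of the course material. PySpark is imported inside the function so the settings helper can be used without Spark installed.

```python
def make_conf_settings(app_name, master="local[2]"):
    """Return the two basic Spark settings as plain key/value pairs (hypothetical helper)."""
    return {"spark.app.name": app_name, "spark.master": master}


def create_context(app_name, master="local[2]"):
    """Build a SparkContext, the connection to the cluster (hypothetical helper)."""
    # Imported lazily so make_conf_settings works without PySpark installed.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf()
    for key, value in make_conf_settings(app_name, master).items():
        conf.set(key, value)
    return SparkContext(conf=conf)
```

With PySpark available, `sc = create_context("notes-demo")` would give a context from which RDDs (`sc.parallelize(...)`, `sc.textFile(...)`), accumulators (`sc.accumulator(0)`), and broadcast variables (`sc.broadcast(...)`) can be created; call `sc.stop()` when done.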

Linking Spark with Java: Spark 1.1.1

Programming the business logic: Create an RDD from an external dataset or from an existing RDD. Apply transformations and actions to process the data. Use RDD persistence to improve performance. Use broadcast variables or accumulators for specific use cases.
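The steps above can be sketched in PySpark roughly as follows. This is an illustrative word count, not the course's own code: the function names `tokenize` and `count_words`, the stop-word list, and the blank-line counter are all assumptions chosen to show each step once. It expects an existing SparkContext `sc`.

```python
def tokenize(line):
    """Split a line into lowercase words (hypothetical helper)."""
    return line.lower().split()


def count_words(sc, lines, stop_words):
    """Count words in `lines`, skipping stop words; `sc` is an existing SparkContext."""
    # Create an RDD from an in-memory collection.
    # sc.textFile(path) would create one from an external dataset instead.
    rdd = sc.parallelize(lines)

    # Broadcast variable: a read-only value shipped once to each executor.
    stop = sc.broadcast(set(stop_words))
    # Accumulator: a counter that tasks can only add to; read it on the driver.
    blank_lines = sc.accumulator(0)

    def note_blank(line):
        if not line.strip():
            blank_lines.add(1)

    # Update the accumulator inside an action (foreach) rather than a
    # transformation, so re-computed partitions do not inflate the count.
    rdd.foreach(note_blank)

    # Transformation (lazy), persisted because two actions reuse the result.
    words = rdd.flatMap(
        lambda line: [w for w in tokenize(line) if w not in stop.value]
    ).persist()

    total = words.count()  # action
    counts = (
        words.map(lambda w: (w, 1))
        .reduceByKey(lambda a, b: a + b)
        .collectAsMap()  # action
    )
    words.unpersist()
    return counts, total, blank_lines.value
```

Persisting `words` avoids recomputing the `flatMap` for the second action; without it, each action would re-run the lineage from the source RDD.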