ICP 8 - PallaviArikatla/Big-Data-Programming GitHub Wiki

INTRODUCTION:

This ICP implements two Spark programs: a word count over text input and a secondary sort.

SOFTWARE REQUIRED:

  • IntelliJ.
  • Spark.

IMPLEMENTATION:

Question 1: Write a Spark program with an interesting use case that takes text data as input. The program should have at least two Spark transformations and two Spark actions. (Word Count)

  • The task uses two main transformations, map and flatMap.

  • The actions performed in this task are top() and count().

  • Write any content in the text file. The program analyses the text in this file and displays the word count using the count() action.

  • The input text I used is "Hi this is Pallavi. Pallavi Arikatla. Pallavi in bdp."

  • Obtained word count output is as follows:
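
For reference, the word count steps above can be sketched in PySpark roughly as follows. This is a minimal sketch, not the code used for this ICP: the tokenization rule (split on whitespace, strip punctuation), the reduceByKey step, the function names and the file name input.txt are all assumptions made for illustration.

```python
import string
from operator import add


def tokenize(line):
    """Split a line on whitespace and strip surrounding punctuation (an assumed rule)."""
    words = (w.strip(string.punctuation) for w in line.split())
    return [w for w in words if w]


def word_count(sc, path, n=3):
    """Return the n most frequent words and the number of distinct words."""
    lines = sc.textFile(path)
    # Transformations: flatMap splits lines into words, map pairs each word with 1.
    pairs = lines.flatMap(tokenize).map(lambda w: (w, 1))
    # Transformation: sum the 1s per word (assumed aggregation step).
    counts = pairs.reduceByKey(add)
    # Actions: top() fetches the n largest pairs by count, count() counts distinct words.
    top_words = counts.top(n, key=lambda kv: kv[1])
    distinct = counts.count()
    return top_words, distinct


def run_word_count(path="input.txt"):
    """Hypothetical driver; requires a local PySpark installation."""
    from pyspark import SparkContext
    sc = SparkContext("local[*]", "WordCount")
    try:
        return word_count(sc, path)
    finally:
        sc.stop()
```

Under this sketch's tokenization rule, the sample sentence above yields 7 distinct words, with "Pallavi" the most frequent at 3 occurrences. A different splitting rule (for example, splitting on single spaces and keeping punctuation) would give different counts.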

Question 2: Secondary Sorting in MapReduce: Take any input of your interest and perform secondary sorting on it.

  • Split each input record into its fields and frame it as a key-value pair using the map function.

  • Use groupByKey to group the records, then map the values (the temperatures) into an array and sort it.

  • Given input is as follows:

  • Obtained output is as follows:
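
For reference, the secondary sorting steps above can be sketched in PySpark roughly as follows. This is a sketch under stated assumptions, not the code used for this ICP: the record layout "year,month,day,temperature", the "year-month" key, integer temperatures, the final sortByKey step, and the function and file names are all hypothetical.

```python
def parse(line):
    """Turn 'year,month,day,temperature' (a hypothetical layout) into (key, temperature)."""
    year, month, _day, temp = line.strip().split(",")
    return (f"{year}-{month}", int(temp))


def secondary_sort(lines_rdd):
    """Group temperatures by key, sort the values within each group, then order the keys."""
    return (lines_rdd
            .map(parse)                                  # split and frame as (key, value)
            .groupByKey()                                # collect all temperatures per key
            .mapValues(lambda temps: sorted(temps))      # values become a sorted list
            .sortByKey())                                # assumed extra step: order the keys


def run_secondary_sort(path="temperatures.txt"):
    """Hypothetical driver; requires a local PySpark installation."""
    from pyspark import SparkContext
    sc = SparkContext("local[*]", "SecondarySort")
    try:
        return secondary_sort(sc.textFile(path)).collect()
    finally:
        sc.stop()
```

Sorting inside mapValues is simple but holds each key's values in memory at once. For very large groups, a composite key with repartitionAndSortWithinPartitions is a common alternative.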