ICP 8 - PallaviArikatla/Big-Data-Programming GitHub Wiki
INTRODUCTION:
Implementation of two Spark programs: a word count over text input and a secondary sort.
SOFTWARE REQUIRED:
- IntelliJ.
- Spark.
IMPLEMENTATION:
Question 1: Write a Spark program with an interesting use case using text data as the input. The program should have at least two Spark transformations and two Spark actions. (Word Count)
- The task uses two main transformations, i.e., "map" and "flatMap".
- The actions performed in this task are top() and count().
- Write any content in the input text file. The program analyses the text in this file and displays the word count using the count() function.
- My input text is "Hi this is Pallavi. Pallavi Arikatla. Pallavi in bdp."
- The obtained word count output is as follows:
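
The steps described for this question can be sketched roughly as follows. This is not the original program or its output: it is a plain-Python stand-in that mirrors the data flow locally, with the corresponding Spark RDD operation named in each comment. The helper `flat_map`, the regex tokenisation, and the reduceByKey-style aggregation step are assumptions made for illustration.

```python
import re
from collections import Counter

# Input text taken from this page.
text = "Hi this is Pallavi. Pallavi Arikatla. Pallavi in bdp."


def flat_map(func, items):
    """Hypothetical local stand-in for RDD.flatMap: apply func, flatten one level."""
    return [out for item in items for out in func(item)]


# Stand-in for reading the text file: one element per line.
lines = [text]

# Transformation 1 (flatMap): split every line into words.
# Assumption: words are runs of letters, so punctuation is dropped.
words = flat_map(lambda line: re.findall(r"[A-Za-z]+", line), lines)

# Transformation 2 (map): pair every word with the count 1.
pairs = [(word, 1) for word in words]

# Aggregation in the style of reduceByKey (assumed; not named on this page).
counts = Counter()
for word, n in pairs:
    counts[word] += n

# Action 1 (count): number of elements, here the total number of words.
total_words = len(words)

# Action 2 (top): the single most frequent (word, count) pair.
top1 = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:1]

print(total_words, top1)
```

With the input above, this sketch finds 9 words in total, and "Pallavi" is the most frequent word with 3 occurrences.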
Question 2: Secondary Sorting in MapReduce: Take any input of your interest and perform secondary sorting on it.
- Split each input record and frame it as a key-value pair using the map function.
- Use groupByKey, then map each key's grouped temperature values into an array and sort that array.
- The given input is as follows:
- The obtained output is as follows:
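
The secondary-sort steps for this question can be sketched roughly as follows. This is not the original program, input, or output: it is a plain-Python stand-in for the map, groupByKey, and sort steps, with the corresponding Spark operation named in each comment. The sample records and their "year-month,temperature" layout are hypothetical, since the original input is not reproduced on this page.

```python
from collections import defaultdict

# Hypothetical input records in "year-month,temperature" form.
records = [
    "2012-01,35",
    "2012-01,5",
    "2012-02,45",
    "2012-01,-10",
    "2012-02,20",
]

# map: split each record and frame it as a (key, temperature) pair.
pairs = []
for record in records:
    key, temperature = record.split(",")
    pairs.append((key, int(temperature)))

# groupByKey: collect all temperatures that share a key.
grouped = defaultdict(list)
for key, temperature in pairs:
    grouped[key].append(temperature)

# Map over the grouped values: turn each group into a sorted list.
result = {key: sorted(temps) for key, temps in grouped.items()}

# Print keys in order, each with its sorted temperatures.
for key in sorted(result):
    print(key, result[key])
```

Sorting inside each group like this works as long as the values for a single key fit in memory.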