SPARK ICP 1 - Apoorvag2597/BDP_Revised GitHub Wiki

Name: Apoorva Geetanjali Avadhanula

Class ID: 34

1. Write a Spark program with an interesting use case that takes text data as input. The program should have at least two Spark transformations and two Spark actions.

The transformations we used for this program are:

flatMap() - This is a one-to-many transformation: each input element can produce zero or more output elements. In our program it splits each line into words, so that each word is treated as an individual element.

map() - This is a one-to-one transformation: the given function is applied to every element and produces exactly one output element for each. In our program it is used as part of counting the number of times each word is repeated.
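The difference between the two transformations can be illustrated without a cluster. The following is a minimal plain-Python sketch, not the program used for this ICP. The helper names `flat_map` and `map_each` and the sample lines are hypothetical; they only imitate what the RDD methods `flatMap()` and `map()` do to the data. Pairing each word with 1 and summing per word is an assumption about how the count is built.

```python
from collections import Counter
from itertools import chain


def flat_map(func, items):
    """One-to-many: apply func to each item and flatten the results (imitates RDD.flatMap)."""
    return list(chain.from_iterable(func(item) for item in items))


def map_each(func, items):
    """One-to-one: apply func to each item, one output per input (imitates RDD.map)."""
    return [func(item) for item in items]


# Hypothetical sample input; the real program reads a text file.
lines = ["spark is fast", "spark is simple"]

# Each line becomes several words, so the result has more elements than the input.
words = flat_map(lambda line: line.split(" "), lines)

# Each word becomes exactly one (word, 1) pair.
pairs = map_each(lambda word: (word, 1), words)

# Assumed counting step: sum the 1s per word.
counts = Counter()
for word, one in pairs:
    counts[word] += one
```

Note that `words` has six elements for two input lines, while `pairs` has exactly as many elements as `words`; that is the one-to-many versus one-to-one distinction.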

The two actions we used for this program are:

foreach() - This is an action that applies a function to every element and does not return a value. Here it is used to store the output.

take() - This is an action that returns an array containing the first n elements. Here it is used to return the output.
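The behaviour of the two actions can be sketched in the same way. This is a plain-Python illustration under stated assumptions, not the ICP program: `take` and `for_each` are hypothetical helpers that imitate `RDD.take(n)` and `RDD.foreach(f)`, the word counts are made-up sample data, and the list `stored` merely stands in for an output file.

```python
def take(items, n):
    """Return the first n elements as a list (imitates RDD.take(n))."""
    return list(items)[:n]


def for_each(func, items):
    """Call func on every element for its side effect; returns nothing (imitates RDD.foreach)."""
    for item in items:
        func(item)


# Hypothetical word-count results.
word_counts = [("spark", 2), ("is", 2), ("fast", 1), ("simple", 1)]

# take(): bring back only the first two results.
first_two = take(word_counts, 2)

# foreach(): run a side effect per element; "stored" stands in for an output file.
stored = []
result = for_each(lambda pair: stored.append(f"{pair[0]}\t{pair[1]}"), word_counts)
```

On a real cluster `foreach()` runs on the executors, so a side effect such as appending to a local list would not be visible on the driver; the sketch only shows that the action returns nothing and visits every element.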

The input is a text file, and each line is split into words on the space character (" ").

Program:

Output -

2. Secondary Sorting

Secondary Sorting in MapReduce

Secondary sorting is used to sort the values that reach each reducer, in addition to the sort by key that the framework already performs.
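The idea can be sketched in plain Python. In MapReduce, secondary sorting is commonly implemented with a composite key (natural key plus value), so that the shuffle sorts on both parts while grouping still happens on the natural key alone. The sketch below only imitates that ordering locally; the year and reading records are hypothetical sample data, and this is not the job used for this ICP.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical (year, reading) records in arbitrary arrival order.
records = [("2015", 30), ("2014", 12), ("2015", 5), ("2014", 40), ("2015", 18)]

# Sort on the composite key (natural key, value): this is the "secondary" part.
sorted_records = sorted(records, key=lambda record: (record[0], record[1]))

# Group on the natural key only, as the reducer would see it:
# each key arrives once, with its values already in sorted order.
grouped = {
    key: [value for _, value in group]
    for key, group in groupby(sorted_records, key=itemgetter(0))
}
```

Here each key maps to its values in ascending order, which is what a reducer would receive without having to sort the values itself.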

Output-