Lab 2 Transformations and Actions - npdarsini/Real-Time-Assignments GitHub Wiki

Welcome to the Transformations and Actions wiki!

09/07/2016

Objective: to understand the concepts of transformations and actions in Apache Spark.

A sample program takes two input files and performs a join operation on them.

map() Transformation:

I applied the map() transformation to both input files, ratings.txt and books.txt, to extract the ID and ISBN from ratings.txt and the ISBN and book title from books.txt.
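The mapping step can be sketched in plain Python, without a Spark installation. This is only an illustration of what map() does (apply a function to every element); the sample lines, their field order, and the helper names `to_isbn_id` and `to_isbn_title` are assumptions, not the lab's actual data or code.

```python
# Plain-Python stand-in for rdd.map(f): apply f to every element.
# The sample lines and their field order are hypothetical.
ratings_lines = ["u1;111;5", "u2;222;3"]
books_lines = ["111;Book A;Author A", "222;Book B;Author B"]

def to_isbn_id(line):
    """Turn a ratings line into an (ISBN, ID) pair."""
    fields = line.split(";")
    return (fields[1], fields[0])

def to_isbn_title(line):
    """Turn a books line into an (ISBN, title) pair."""
    fields = line.split(";")
    return (fields[0], fields[1])

ratings_pairs = list(map(to_isbn_id, ratings_lines))
books_pairs = list(map(to_isbn_title, books_lines))

print(ratings_pairs)
print(books_pairs)
```

Putting the ISBN first in each pair makes it the key, which is what a later key-based join needs.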

split()

split() is a string method rather than a Spark action. It is called inside map() to split each line into fields on the ; delimiter.
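A minimal sketch of the splitting step, using a made-up line (the field layout is an assumption):

```python
# Split one semicolon-delimited line into its fields.
line = "111;Book A;Author A"  # hypothetical sample line
fields = line.split(";")

print(fields)       # the whole record as a list of strings
print(fields[0])    # first field, assumed here to be the ISBN
```

A field that itself contains a ; would be split as well, so real data may need a proper CSV parser instead.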

join()

The join() transformation is performed on the two RDDs, using ISBN as the join key.
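To show what a key-based inner join produces, here is a plain-Python sketch. It mirrors the (key, (left_value, right_value)) shape that Spark's join() returns for pair RDDs, but the function `inner_join` and the sample pairs are hypothetical, and Spark does not guarantee the output order shown here.

```python
from collections import defaultdict

def inner_join(left, right):
    """Return (key, (left_value, right_value)) for every pair of matching keys."""
    right_by_key = defaultdict(list)
    for key, value in right:
        right_by_key[key].append(value)
    return [
        (key, (left_value, right_value))
        for key, left_value in left
        for right_value in right_by_key.get(key, [])
    ]

# Hypothetical (ISBN, rating) and (ISBN, title) pairs.
ratings = [("111", "5"), ("222", "3"), ("333", "4")]
books = [("111", "Book A"), ("222", "Book B")]

joined = inner_join(ratings, books)
print(joined)
```

ISBN 333 has no matching book, so it is dropped, which is the behavior of an inner join.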

saveAsTextFile() Action:

The joined results (ISBN, rating, and book title) are saved to the Output directory.
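As a rough local analogue of saving results as text, the sketch below writes one record per line into a part file inside a new directory. Spark's saveAsTextFile() likewise writes part files into a directory and refuses to overwrite an existing one; the helper `save_as_text`, the single part file, and the sample records are assumptions for illustration.

```python
import os
import tempfile

def save_as_text(records, out_dir):
    """Write one record per line to out_dir/part-00000 and return that path."""
    os.makedirs(out_dir)  # raises FileExistsError if the directory already exists
    path = os.path.join(out_dir, "part-00000")
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(f"{record}\n")
    return path

# Hypothetical joined records.
records = [("111", ("5", "Book A")), ("222", ("3", "Book B"))]
out_dir = os.path.join(tempfile.mkdtemp(), "Output")
part_path = save_as_text(records, out_dir)

with open(part_path, encoding="utf-8") as f:
    saved_lines = f.read().splitlines()
print(saved_lines)
```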

take(n)

take(n) is an action that returns the first n rows of the RDD to the driver program, where they can be stored in a local variable.
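The idea behind take(n) can be sketched locally with `itertools.islice`, which stops reading after n elements instead of materializing the whole dataset. The function `take` and the generated rows are hypothetical stand-ins, not Spark code.

```python
from itertools import islice

def take(rows, n):
    """Return at most the first n elements of rows as a list."""
    return list(islice(rows, n))

# A lazy source of 1000 made-up rows; only the first three are consumed.
rows = (f"row-{i}" for i in range(1000))
first_three = take(rows, 3)
print(first_three)
```

If fewer than n rows exist, the result is simply shorter.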

foreach() Action:

Using foreach(), I displayed the 50 numbers on the console.
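foreach() applies a side-effecting function to every element and returns nothing. A plain-Python sketch follows, assuming the 50 values are simply the numbers 1 to 50; the helpers `foreach` and `show` are hypothetical.

```python
def foreach(items, action):
    """Call action on every item; nothing is returned."""
    for item in items:
        action(item)

printed = []

def show(item):
    """Print the item and remember it so the run can be checked."""
    print(item)
    printed.append(item)

foreach(range(1, 51), show)  # assumed data: the numbers 1..50
```

On a cluster, a foreach() called directly on an RDD runs on the executors, so printed output may not appear on the driver console. Collecting a small sample first with take(n) and looping over it locally avoids that.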

Map Reduce Paradigm:

Output: