Lab Assignment 2 - rashmitripathi/Big_Data_Analytics_And_Apps GitHub Wiki
1. Spark Programming:
## Write a Spark program with an interesting use case that takes text data as input. The program should use at least two Spark transformations and two Spark actions. Present your use case in the MapReduce paradigm, as shown below (for word count)
Transformation:
In Spark, transformations are lazy: each one returns a new RDD that refers to its parent instead of computing a result immediately. Commonly used transformations include map(), filter(), flatMap(), union(), intersection(), distinct(), groupByKey(), join(), cogroup(), and repartition().
Action:
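Actions trigger the computation and either return a value to the driver program or produce a side effect such as writing output. Commonly used actions include reduce(), collect(), count(), saveAsTextFile(), countByKey(), and foreach().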
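The difference between the two can be illustrated without Spark itself. The sketch below is a plain-Python toy, not the Spark API: `LazyDataset` and `shout` are hypothetical names invented for this illustration. It only records the work when a transformation is called and runs it when an action is called.

```python
class LazyDataset:
    """Toy stand-in for an RDD (hypothetical, not part of Spark)."""

    def __init__(self, source, steps=()):
        self._source = source        # a re-iterable input, e.g. a list
        self._steps = tuple(steps)   # recorded functions: iterator -> iterator

    # Transformations: return a new LazyDataset and compute nothing.
    def map(self, fn):
        step = lambda it: (fn(x) for x in it)
        return LazyDataset(self._source, self._steps + (step,))

    def filter(self, pred):
        step = lambda it: (x for x in it if pred(x))
        return LazyDataset(self._source, self._steps + (step,))

    # Helper: chain the recorded steps over the source.
    def _run(self):
        it = iter(self._source)
        for step in self._steps:
            it = step(it)
        return it

    # Actions: execute the recorded steps and return a value.
    def collect(self):
        return list(self._run())

    def count(self):
        return sum(1 for _ in self._run())


calls = []  # records every element that shout() actually processes


def shout(word):
    calls.append(word)
    return word.upper()


words = LazyDataset(["spark", "is", "lazy"])
upper = words.map(shout)        # transformation: shout() has not run yet
calls_before = list(calls)      # still empty at this point
result = upper.collect()        # action: shout() runs for every element
```

Here `calls_before` stays empty because `map()` only recorded the step; the function runs only once `collect()` asks for the values.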
Description:
In this program I first split the input text into words with flatMap(), converted each word to lowercase, and then mapped each word to its length.
I then kept only the distinct words and collected them as a sorted list.
Actions used: collect(), foreach()
Transformations used: map(), flatMap(), distinct(), sortBy()
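The steps described above can be sketched in plain Python, with each step labelled by the Spark operation it stands in for. This is not the program from the assignment: the function name `word_sizes`, the sample lines, the choice of (word, length) pairs, and the sort order (by length, then alphabetically) are all assumptions made for illustration.

```python
from itertools import chain


def word_sizes(lines):
    """Mimic the described pipeline on a list of text lines (illustrative only)."""
    # flatMap(): split every line into words and flatten into one sequence
    words = chain.from_iterable(line.split() for line in lines)
    # map(): convert each word to lowercase
    lowered = (w.lower() for w in words)
    # map(): pair each word with its length
    pairs = ((w, len(w)) for w in lowered)
    # distinct(): drop duplicate (word, length) pairs
    unique = set(pairs)
    # sortBy() followed by collect(): assumed order is length, then word
    return sorted(unique, key=lambda p: (p[1], p[0]))


# Hypothetical sample input, not the input used in the assignment.
sample = ["Spark makes big data simple", "Big data needs Spark"]
result = word_sizes(sample)

# foreach(): apply a side effect (printing) to every collected element
for word, size in result:
    print(word, size)
```

In Spark, the steps up to the sort would only build the lineage, and the job would run when collect() or foreach() is called.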
Sample Input:
Output:
MapReduce paradigm: