ICP 2 - a190884810/Big-Data-Programming GitHub Wiki

Lesson 2 Plan, MapReduce

Counting the frequency of words in the given input with the MapReduce algorithm

## The map function for Q1, Q2, and the bonus question

  • For Q1, each word is taken as a token by calling the nextToken API function, and the token is then passed as the key into the map.
  • For Q2, the words are filtered with the startsWith function to keep only those beginning with "a", and these are then passed into the map.
  • For Q3 (the bonus question), a self-defined function called isNum picks out all numeric tokens. In the reduce stage, an isPrime function filters for the prime numbers, and the result is then output.
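The three map variants above can be sketched in plain Java. This is only an illustration under stated assumptions, not the code from the repository: it does not use the Hadoop Mapper API, and the class name MapSketch and the method names mapAllWords, mapWordsStartingWithA, and mapNumbers are hypothetical. Only nextToken, startsWith, and the idea of an isNum helper come from the description. It also assumes isNum accepts unsigned decimal integers only, and that Q2 matches both "a" and "A".

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Plain-Java stand-in for the map stage; a real Hadoop job would call
// context.write(key, value) instead of collecting pairs in a list.
class MapSketch {

    // Assumed behaviour of isNum: true only for non-empty strings of ASCII digits.
    static boolean isNum(String token) {
        if (token.isEmpty()) return false;
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (c < '0' || c > '9') return false;
        }
        return true;
    }

    // Q1: emit (token, 1) for every whitespace-separated token.
    static List<Map.Entry<String, Integer>> mapAllWords(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            out.add(new SimpleEntry<>(tokens.nextToken(), 1));
        }
        return out;
    }

    // Q2: emit (token, 1) only for tokens that start with "a" or "A".
    static List<Map.Entry<String, Integer>> mapWordsStartingWithA(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            String token = tokens.nextToken();
            if (token.startsWith("a") || token.startsWith("A")) {
                out.add(new SimpleEntry<>(token, 1));
            }
        }
        return out;
    }

    // Bonus: emit (token, 1) only for numeric tokens; primes are filtered later.
    static List<Map.Entry<String, Integer>> mapNumbers(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            String token = tokens.nextToken();
            if (isNum(token)) {
                out.add(new SimpleEntry<>(token, 1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample line, used only to exercise the three variants.
        String line = "an apple Ate 7 pears and 10 plums";
        System.out.println(mapAllWords(line));
        System.out.println(mapWordsStartingWithA(line));
        System.out.println(mapNumbers(line));
    }
}
```

Each method emits one (key, 1) pair per accepted token, so the three questions differ only in the filter applied before emitting.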

The reduce stage for the program

  • Contains a normal reduce function, which sums up all occurrences of each key, and a modified function called isPrime, which is used to filter the prime numbers whose occurrences are added up.
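The reduce stage described above might look roughly like the following plain-Java sketch. It is not the repository's code and does not use the Hadoop Reducer API. The names ReduceSketch, sumCounts, and reducePrime are hypothetical; only isPrime is named in the description. The sketch assumes that "filtering" means keeping prime keys and dropping everything else, and it uses simple trial division for the primality test.

```java
import java.util.Arrays;
import java.util.OptionalInt;

// Plain-Java stand-in for the reduce stage of the word-count job.
class ReduceSketch {

    // Trial division: test 2 first, then odd divisors up to sqrt(n).
    static boolean isPrime(long n) {
        if (n < 2) return false;
        if (n % 2 == 0) return n == 2;
        for (long d = 3; d * d <= n; d += 2) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Normal reduce: sum every count received for one key.
    static int sumCounts(Iterable<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Modified reduce for the bonus question (assumed behaviour):
    // return the summed count when the key is a prime number, otherwise empty.
    static OptionalInt reducePrime(String key, Iterable<Integer> counts) {
        long value;
        try {
            value = Long.parseLong(key);
        } catch (NumberFormatException e) {
            return OptionalInt.empty(); // not a number that fits in a long, so nothing is emitted
        }
        return isPrime(value) ? OptionalInt.of(sumCounts(counts)) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Made-up keys and counts, used only to exercise the two reduce variants.
        System.out.println(sumCounts(Arrays.asList(1, 1, 1)));
        System.out.println(reducePrime("7", Arrays.asList(1, 1)));
        System.out.println(reducePrime("10", Arrays.asList(1)));
    }
}
```

Returning an empty OptionalInt for non-prime keys is a design choice of this sketch; in a Hadoop reducer the equivalent would simply be to skip the write for that key.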

Result for Q1

  • Outputs the count of every word.

Result for Q2

  • Outputs all words that start with "A" or "a". Matching both cases is a small change, but it does not matter.

Bonus question result output

  • I changed the input stream by adding three prime numbers and one non-prime number to test whether the program ran properly.