ICP 2 - a190884810/Big-Data-Programming GitHub Wiki
Lesson 2 Plan, MapReduce
Counting the frequency of words in the given input with the MapReduce algorithm.
## The map function for Q1, Q2, and the bonus question
- For Q1, each word is read as a token by calling the `nextToken` API function, and the token is then passed into the map as a key.
- For Q2, the words are filtered with the `startsWith` function to keep only those beginning with "a", and these are then passed into the map.
- For Q3 (the bonus question), a user-defined function called `isNum` selects all numeric tokens. In the reduce stage, an `isPrime` function filters for the prime numbers, and the result is then output.
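The three map variants above can be sketched in plain Java. This is a minimal illustration, not the code from the assignment: it leaves out the Hadoop `Mapper` class so it can run on its own, and the class name `MapStageSketch`, the method names, and the rule used inside `isNum` (a token is a number when it consists only of the digits 0-9) are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical stand-alone sketch of the map stage; names are illustrative only.
public class MapStageSketch {

    // Q1: emit every whitespace-separated token as a key.
    static List<String> mapAllWords(String line) {
        List<String> keys = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            keys.add(tokens.nextToken());
        }
        return keys;
    }

    // Q2: emit only the tokens that begin with "a" or "A".
    static List<String> mapWordsStartingWithA(String line) {
        List<String> keys = new ArrayList<>();
        for (String token : mapAllWords(line)) {
            if (token.startsWith("a") || token.startsWith("A")) {
                keys.add(token);
            }
        }
        return keys;
    }

    // Q3 helper (assumed rule): true when the token is non-empty
    // and every character is one of the digits 0-9.
    static boolean isNum(String token) {
        if (token.isEmpty()) {
            return false;
        }
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }

    // Q3: emit only the numeric tokens; prime filtering happens later, in reduce.
    static List<String> mapNumbers(String line) {
        List<String> keys = new ArrayList<>();
        for (String token : mapAllWords(line)) {
            if (isNum(token)) {
                keys.add(token);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        String line = "an apple 7 Banana 10 avocado"; // made-up sample input
        System.out.println(mapAllWords(line));
        System.out.println(mapWordsStartingWithA(line));
        System.out.println(mapNumbers(line));
    }
}
```

In a real Hadoop job each emitted key would be written to the context together with a count of 1, rather than collected in a list.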
## The reduce stage of the program
- Contains a standard reduce function that sums all occurrences of each key, plus an added function called `isPrime` that filters the keys so that only prime numbers are summed.
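The reduce stage described above can be sketched as follows. This is a stand-alone approximation under stated assumptions, not the assignment's Hadoop `Reducer`: grouping and summing are folded into one loop over the emitted keys, `isPrime` uses simple trial division, and the class name `ReduceStageSketch`, the method names, and the sample input are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone sketch of the reduce stage; names are illustrative only.
public class ReduceStageSketch {

    // Trial division: n is prime when n >= 2 and no d with d * d <= n divides it.
    static boolean isPrime(long n) {
        if (n < 2) {
            return false;
        }
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    // Standard word-count reduce: add 1 for every occurrence of each key.
    static Map<String, Integer> reduceCounts(List<String> keys) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : keys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    // Bonus variant: count only the keys whose numeric value is prime.
    // Assumes every key is a digit string that fits in a long
    // (that is, it already passed an isNum-style check in the map stage).
    static Map<String, Integer> reducePrimeCounts(List<String> numericKeys) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : numericKeys) {
            if (isPrime(Long.parseLong(key))) {
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up input: three primes and one non-prime.
        List<String> numbers = Arrays.asList("2", "3", "5", "4");
        System.out.println(reduceCounts(numbers));
        System.out.println(reducePrimeCounts(numbers));
    }
}
```

With an input of three primes and one non-prime, the second map should contain only the three prime keys, which is a quick way to check that the filter works.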
## Result for Q1
- Outputs the count of every word.
## Result for Q2
- Outputs all words that start with "A" or "a". Accepting both cases is a small change, but it does not matter.
## Result for the bonus question
- I changed the input, adding three prime numbers and one non-prime number, to test whether the program ran properly.