Code example by Boost - wuyichen24/boost GitHub Wiki
Assume we have already read a paragraph from an external text file, and each element in the BoostList is one line of that paragraph.
BoostList<String> list = new BoostList<String>();
list.add("apple tree car king");
list.add("open apple window car");
list.add("high apple tall king team");
First, the flatMap function splits each line into words, so each element in the new BoostList is one word.
BoostList<String> newList = list.flatMap(new FlatMapFunction<String, String>() {
    @Override
    public Iterable<String> call(String s) {
        // Split the line on single spaces; every piece becomes one element of newList.
        return Arrays.asList(s.split(" "));
    }
});
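For readers without the Boost classes on the classpath, the same flattening step can be approximated with the JDK stream API. This is only a sketch of the idea, not Boost's implementation; the class name FlatMapSketch and the method splitIntoWords are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for the flatMap step, built on java.util.stream only.
public class FlatMapSketch {
    // Splits every line on single spaces and flattens the pieces into one list of words.
    public static List<String> splitIntoWords(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "apple tree car king",
                "open apple window car",
                "high apple tall king team");
        // Three lines go in, one flat list of 13 words comes out.
        System.out.println(splitIntoWords(lines));
    }
}
```

The key point is that one input element may produce several output elements, which is what distinguishes flatMap from a plain map.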
Second, the mapToPair function marks each word as one occurrence. The BoostList is converted into a BoostMap in which each element is a {word, 1} pair.
BoostMap<String, Integer> newMap = newList.mapToPair(new PairFunction<String, String, Integer>() {
    @Override
    public BoostPair<String, Integer> call(String s) {
        // Pair every word with an initial count of 1.
        return new BoostPair<String, Integer>(s, 1);
    }
});
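The pairing step can be pictured with plain JDK types as well. The sketch below assumes that a {word, 1} pair behaves like a java.util.Map.Entry; MapToPairSketch and toPairs are hypothetical names, and nothing here reflects how BoostPair or BoostMap are actually implemented.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-in for the mapToPair step, using Map.Entry as the pair type.
public class MapToPairSketch {
    // Turns every word into a (word, 1) pair; duplicate words are kept as separate pairs.
    public static List<Map.Entry<String, Integer>> toPairs(List<String> words) {
        return words.stream()
                .<Map.Entry<String, Integer>>map(word -> new SimpleEntry<>(word, 1))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "tree", "car", "king", "open", "apple");
        for (Map.Entry<String, Integer> pair : toPairs(words)) {
            System.out.println("{" + pair.getKey() + ", " + pair.getValue() + "}");
        }
    }
}
```

Note that no counting happens yet: a word that appears three times simply yields three pairs with the value 1.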
Finally, the reduceByKey function sums the occurrences of each unique word: values that share the same key are combined two at a time by the supplied function.
BoostMap<String, Integer> resultMap = newMap.reduceByKey(new Function2<Integer, Integer, Integer>() {
    @Override
    public Integer call(Integer i1, Integer i2) {
        // Combine two counts that belong to the same word.
        return i1 + i2;
    }
});
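To make the reduction concrete, here is a minimal sketch of a reduce-by-key over a list of pairs, written against the JDK only. It relies on Map.merge, which stores the value for a new key and otherwise applies the reducer to the old and new values. ReduceByKeySketch and its reduceByKey method are hypothetical illustrations, not the Boost implementation.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical stand-in for the reduceByKey step.
public class ReduceByKeySketch {
    // Folds all values that share a key into one value using the given reducer.
    // LinkedHashMap keeps keys in first-seen order.
    public static <K, V> Map<K, V> reduceByKey(List<? extends Map.Entry<K, V>> pairs,
                                               BinaryOperator<V> reducer) {
        Map<K, V> result = new LinkedHashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            // First occurrence: store the value. Later occurrences: reducer(old, new).
            result.merge(pair.getKey(), pair.getValue(), reducer);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : "apple tree car king open apple window car".split(" ")) {
            pairs.add(new SimpleEntry<>(word, 1));
        }
        System.out.println(reduceByKey(pairs, Integer::sum));
    }
}
```

Because the reducer is passed in, the same helper could compute a maximum or a concatenation per key instead of a sum.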
Print the result.
for (Entry<String, Integer> entry : resultMap.entries()) {
    System.out.println(entry.getKey() + " - " + entry.getValue());
}
Result:
apple - 3
tree - 1
car - 2
king - 2
open - 1
window - 1
high - 1
tall - 1
team - 1
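As a cross-check of the result above, the whole word count can be reproduced in a single JDK stream pipeline. This is an equivalent sketch using java.util.stream rather than the Boost classes; WordCountSketch and countWords are hypothetical names. A LinkedHashMap is used so that words are printed in first-seen order, matching the listing above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical end-to-end word count: flatten lines, pair each word with 1, sum per word.
public class WordCountSketch {
    public static Map<String, Integer> countWords(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))   // flatMap step
                .collect(Collectors.toMap(
                        word -> word,        // key: the word itself
                        word -> 1,           // value: one occurrence (mapToPair step)
                        Integer::sum,        // merge counts for equal keys (reduceByKey step)
                        LinkedHashMap::new)); // keep first-seen order
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "apple tree car king",
                "open apple window car",
                "high apple tall king team");
        for (Map.Entry<String, Integer> entry : countWords(lines).entrySet()) {
            System.out.println(entry.getKey() + " - " + entry.getValue());
        }
    }
}
```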