ICP Assignment 2 - MadhuriSarode/BDP GitHub Wiki

Student ID 26 : Madhuri Sarode

Student ID 12 : Chennupati, Bhargavi Saipoojitha

Student ID Bhavana Deepti

MapReduce Programs

IntelliJ is installed, and the source code and sample input files are downloaded into a folder on the virtual machine.

  1. Running the MapReduce program to find the word count

The WordCount project is imported into the IntelliJ workspace as a Maven project.

The project is built using Maven commands from the working directory.

The input file is placed on HDFS in the following location

After the Maven install, the jar file is created in the target directory. The jar is then run with the hadoop command to execute the WordCount class.

The MapReduce job is executed, and its progress can be viewed at the URL printed during the execution of the job.

The output directory is created.

The word count is successfully completed by the MapReduce job.
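To make the counting logic concrete, here is a minimal sketch in plain Java. It does not use the Hadoop API and is not the project's WordCount class; the class name WordCountSketch, the method countWords, and the sample lines are hypothetical. It assumes words are separated by whitespace, and it folds the map step (emit a 1 for each word) and the reduce step (sum the 1s per word) into a single loop.

```java
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical stand-in for the word count logic; no Hadoop classes are used.
final class WordCountSketch {

    // "Map" step: split each line on whitespace and emit (word, 1).
    // "Reduce" step: sum the 1s for each word.
    // Both steps are folded into one loop via Map.merge.
    static Map<String, Integer> countWords(List<String> lines) {
        // TreeMap keeps the words in sorted order.
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed sample input, not the assignment's input file.
        List<String> sample = List.of("an apple a day", "a day at a time");
        countWords(sample).forEach(
                (word, count) -> System.out.println(word + "\t" + count));
    }
}
```

In a real job the two steps run in separate Mapper and Reducer classes, with the framework grouping the emitted pairs by word in between.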

  2. The second program counts only the words in the sample text file that start with 'a'.

The appropriate code changes are as follows

The input is the same sample text file in the same path. The Maven build and install are run with the following command.

mvn clean install

Once the new jar is created, it is run using the hadoop command. The output is a file that records counts only for the words starting with 'a'.
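One way such a filter could work is sketched below in plain Java, without the Hadoop API. This is an illustration under stated assumptions, not the project's modified code: the names StartsWithASketch, startsWithA, and countWordsStartingWithA are hypothetical, words are assumed to be whitespace-separated, and the match is assumed to be on lowercase 'a' only. The only change from an ordinary word count is the check made before a word is counted.

```java
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical sketch of the "starts with 'a'" variant; no Hadoop classes are used.
final class StartsWithASketch {

    // Assumption: only lowercase 'a' qualifies, so "Apple" is skipped.
    static boolean startsWithA(String word) {
        return word.startsWith("a");
    }

    static Map<String, Integer> countWordsStartingWithA(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                String word = tokens.nextToken();
                // Filter in the "map" step: words that fail the check are never emitted.
                if (startsWithA(word)) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed sample input, not the assignment's input file.
        List<String> sample = List.of("an apple a day", "Apple at a time");
        countWordsStartingWithA(sample).forEach(
                (word, count) -> System.out.println(word + "\t" + count));
    }
}
```

Filtering before the count, rather than after it, means the skipped words never reach the summing step, which in a real job would reduce the data passed from the mappers to the reducers.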