ICP 3 - PallaviArikatla/Big-Data-Programming GitHub Wiki

BIG DATA PROGRAMMING

INTRODUCTION:

Perform matrix multiplication using Hadoop MapReduce.

SOFTWARE REQUIRED:

VirtualBox, Cloudera VM

IMPLEMENTATION:

  1. Create a Java project named "MatrixMul" and import all the required external jars.
  2. For input, create a folder named input containing two files, M.txt and N.txt, each holding a matrix of order (2,2).
  3. Then create three Java classes, MatrixMul.java, Map.java and Reduce.java, and add the code.

Map Reduce:

  • The map function takes the two input matrices, M and N, as input and splits each line on ','.
  • The multiplication loop runs until it has gone through every element in the matrix.
  • The reducer takes the mapper output as input and puts the values of M and N into the corresponding HashMaps.
  • It then loops over each index up to the size of matrix N, looks up the values at that index, multiplies them, and adds the product to the running result.
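
The map and reduce steps described above can be sketched in plain Java without any Hadoop dependencies. This is only an illustration, not the code from Map.java or Reduce.java: the class name MatrixMulSketch, the input line format ("M,i,j,value" and "N,j,k,value"), the method names and the sample matrices are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the Map and Reduce classes; no Hadoop API is used.
class MatrixMulSketch {

    // Map step: split one line on ',' and emit (outputCell, taggedValue) pairs.
    // Assumed line format: "M,i,j,value" or "N,j,k,value".
    static List<String[]> map(String line, int rowsOfM, int colsOfN) {
        String[] f = line.split(",");
        List<String[]> out = new ArrayList<>();
        if (f[0].equals("M")) {
            // M[i][j] contributes to every output cell (i, k).
            for (int k = 0; k < colsOfN; k++) {
                out.add(new String[] { f[1] + "," + k, "M," + f[2] + "," + f[3] });
            }
        } else {
            // N[j][k] contributes to every output cell (i, k).
            for (int i = 0; i < rowsOfM; i++) {
                out.add(new String[] { i + "," + f[2], "N," + f[1] + "," + f[3] });
            }
        }
        return out;
    }

    // Reduce step for one output cell: put the M and N values into hash maps
    // keyed by the shared index j, then sum the products over j.
    static double reduce(List<String> values, int shared) {
        Map<Integer, Double> m = new HashMap<>();
        Map<Integer, Double> n = new HashMap<>();
        for (String v : values) {
            String[] f = v.split(",");
            int j = Integer.parseInt(f[1]);
            double x = Double.parseDouble(f[2]);
            if (f[0].equals("M")) {
                m.put(j, x);
            } else {
                n.put(j, x);
            }
        }
        double sum = 0.0;
        for (int j = 0; j < shared; j++) {
            sum += m.getOrDefault(j, 0.0) * n.getOrDefault(j, 0.0);
        }
        return sum;
    }

    // Simulates the shuffle: group mapper output by key, then reduce each group.
    static Map<String, Double> multiply(List<String> lines, int rowsOfM, int shared, int colsOfN) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (String[] kv : map(line, rowsOfM, colsOfN)) {
                grouped.computeIfAbsent(kv[0], key -> new ArrayList<>()).add(kv[1]);
            }
        }
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getValue(), shared));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data: M = [[1,2],[3,4]], N = [[5,6],[7,8]].
        List<String> lines = Arrays.asList(
            "M,0,0,1", "M,0,1,2", "M,1,0,3", "M,1,1,4",
            "N,0,0,5", "N,0,1,6", "N,1,0,7", "N,1,1,8");
        // Prints {0,0=19.0, 0,1=22.0, 1,0=43.0, 1,1=50.0}
        System.out.println(multiply(lines, 2, 2, 2));
    }
}
```

Each key is an output cell "row,column", so every reducer call sees exactly the row of M and the column of N it needs. In a real Hadoop job the grouping done here by multiply would be handled by the framework's shuffle phase.
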
  1. Now run MatrixMul.java in IntelliJ; the output file is created and contains the matrix multiplication result. The output in IntelliJ is shown as follows:

  2. For Hue visualization, first create the input folder in HDFS: hadoop fs -mkdir input

  3. Using the put command, copy the local input folder to HDFS: hadoop fs -put /home/cloudera/desktop/input /user/cloudera

  4. Now export the jar file and execute it using: hadoop jar /home/cloudera/MatrixMultiply.jar MatrixMul /user/cloudera/input /user/cloudera/output

Hue Inputs:

Matrix M:

Matrix N:

Hue Visualized Output: