ICP 3 - PallaviArikatla/Big-Data-Programming GitHub Wiki
BIG DATA PROGRAMMING
INTRODUCTION:
The goal is to perform matrix multiplication using Hadoop MapReduce.
SOFTWARE REQUIRED:
VirtualBox, Cloudera VM
IMPLEMENTATION:
- Create a Java project named "MatrixMul" and import all the required external JARs.
- For the input, create a folder named input containing two files, M.txt and N.txt, each holding a matrix of order (2,2).
- Then create three Java classes, MatrixMul.java, Map.java and Reduce.java, and add the code to each.
Map Reduce:
- The map function takes the two input matrices, M and N, as input, reads them line by line and splits each line on ','.
- Its loop runs until it has gone through every element in the matrix.
- The reducer takes the mapper output as its input and puts the values of M and N into their corresponding HashMaps.
- It then loops over the index values up to the size of matrix N, looks up the M and N values stored at each index, multiplies them, and adds the product to the running result.
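The map and reduce steps above can be sketched in plain Java without any Hadoop dependency. This is only an illustration under stated assumptions, not the code in Map.java or Reduce.java: it assumes each input line has the form name,row,column,value (for example M,0,1,2), the class name MatrixMulSketch and its method names are hypothetical, and a TreeMap stands in for Hadoop's shuffle phase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone sketch; no Hadoop classes are used.
public class MatrixMulSketch {

    // Map step: split one line on ',' and emit the entry once for every
    // output cell (i,k) that it contributes to.
    // Assumed line format: name,row,column,value
    public static void map(String line, int rowsM, int colsN,
                           Map<String, List<String>> shuffle) {
        String[] f = line.split(",");
        String name = f[0].trim();
        int row = Integer.parseInt(f[1].trim());
        int col = Integer.parseInt(f[2].trim());
        String value = f[3].trim();
        if (name.equals("M")) {
            // M[row][col] is needed by every output cell in that row.
            for (int k = 0; k < colsN; k++) {
                shuffle.computeIfAbsent(row + "," + k, x -> new ArrayList<>())
                       .add("M," + col + "," + value);
            }
        } else {
            // N[row][col] is needed by every output cell in that column.
            for (int i = 0; i < rowsM; i++) {
                shuffle.computeIfAbsent(i + "," + col, x -> new ArrayList<>())
                       .add("N," + row + "," + value);
            }
        }
    }

    // Reduce step: store the M and N values in two HashMaps keyed by the
    // shared index j, then sum the products M[i][j] * N[j][k].
    public static float reduce(List<String> values, int inner) {
        Map<Integer, Float> fromM = new HashMap<>();
        Map<Integer, Float> fromN = new HashMap<>();
        for (String v : values) {
            String[] f = v.split(",");
            Map<Integer, Float> target = f[0].equals("M") ? fromM : fromN;
            target.put(Integer.parseInt(f[1]), Float.parseFloat(f[2]));
        }
        float sum = 0f;
        for (int j = 0; j < inner; j++) {
            sum += fromM.getOrDefault(j, 0f) * fromN.getOrDefault(j, 0f);
        }
        return sum;
    }

    // Runs map over every line, groups by key, then reduces each group.
    public static Map<String, Float> multiply(List<String> lines,
                                              int rowsM, int inner, int colsN) {
        Map<String, List<String>> shuffle = new TreeMap<>();
        for (String line : lines) {
            map(line, rowsM, colsN, shuffle);
        }
        Map<String, Float> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : shuffle.entrySet()) {
            result.put(e.getKey(), reduce(e.getValue(), inner));
        }
        return result;
    }

    public static void main(String[] args) {
        // Example data, not the contents of the original M.txt and N.txt.
        List<String> lines = Arrays.asList(
            "M,0,0,1", "M,0,1,2", "M,1,0,3", "M,1,1,4",
            "N,0,0,5", "N,0,1,6", "N,1,0,7", "N,1,1,8");
        multiply(lines, 2, 2, 2)
            .forEach((cell, value) -> System.out.println(cell + "," + value));
    }
}
```

Keying each emitted value by its output cell means every reducer call receives exactly the row of M and the column of N it needs, which is why two HashMaps indexed by the shared dimension are enough to compute one cell of the product.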
- Run MatrixMul.java in IntelliJ. The output file is created and contains the result of the matrix multiplication. The output in IntelliJ is shown as follows:
- For Hue visualization, first create an input folder in HDFS: hadoop fs -mkdir input
- Use the put command to copy the local input folder to HDFS: hadoop fs -put /home/cloudera/desktop/input /user/cloudera
- Next, export the project as a jar file and run it: hadoop jar /home/cloudera/MatrixMultiply.jar MatrixMul /user/cloudera/input /user/cloudera/output
Hue Inputs:
Matrix M:
Matrix N: