Big_Data_Programming_ICP_3 - kusamdinesh/Big-Data-and-Hadoop GitHub Wiki

Lesson 3: Advanced MapReduce Algorithm

In today's class, we learned how to implement the matrix multiplication algorithm using MapReduce. We wrote a MapReduce Java program that multiplies two 2 by 2 matrices. A text file holds the entries of the input matrices M and N, and hashmaps are used to build the key-value pairs.
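As a reminder of what the job has to compute, each entry of the product can be written as a sum over a shared index. The symbol P for the product matrix is notation chosen here for illustration; i, k, and j match the (i,k) key and the index j used in the mapper description. For 2 by 2 matrices, j takes two values, so each sum has two terms.

```latex
P = M N, \qquad P_{ik} = \sum_{j} M_{ij} \, N_{jk}
```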

INPUT

The input file is given in the format below, where M and N are the two matrices. It is copied into the Hadoop file system with the command "hadoop fs -put /home/cloudera/matrix.txt input/" and then displayed in HUE.

MAPPER

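The mapper splits each line of the input file into (key, value) pairs. The key is (i,k), the position of a cell in the output matrix, and the value is (matrix name, j, value), where j is the index shared by the two matrices.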
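The map step described above can be sketched without any Hadoop classes, so it can be run on its own. This is only an illustration under stated assumptions: the class name MatrixMapSketch, the line format "matrixName,row,column,value" with 0-based indices, and the tab-separated output strings are all hypothetical and not taken from MatrixMultiplex.java. In the real job this logic would sit inside the Mapper's map method.

```java
import java.util.ArrayList;
import java.util.List;

/** Framework-free sketch of the map step; names and input format are assumptions. */
class MatrixMapSketch {
    // Both matrices are 2 by 2, as in the lesson.
    static final int SIZE = 2;

    /**
     * Turns one input line into "key<TAB>value" strings.
     * Assumed line format: matrixName,row,column,value (0-based indices).
     */
    static List<String> map(String line) {
        String[] parts = line.trim().split(",");
        List<String> out = new ArrayList<>();
        if (parts[0].equals("M")) {
            // M[i][j] is needed by every output cell (i,k) in row i.
            String i = parts[1], j = parts[2], v = parts[3];
            for (int k = 0; k < SIZE; k++) {
                out.add(i + "," + k + "\t" + "M," + j + "," + v);
            }
        } else {
            // N[j][k] is needed by every output cell (i,k) in column k.
            String j = parts[1], k = parts[2], v = parts[3];
            for (int i = 0; i < SIZE; i++) {
                out.add(i + "," + k + "\t" + "N," + j + "," + v);
            }
        }
        return out;
    }
}
```

Each matrix entry is emitted once per output cell that depends on it, so every key (i,k) ends up with all the M and N values needed for that cell.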

REDUCER

The reducer receives the output of the mapper function, grouped by key, and converts it into an output (key, value) pair for each cell of the product matrix.
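A minimal sketch of the reduce step for a single key, again without Hadoop classes. It assumes each value is a string of the form "matrixName,j,value" with integer entries, mirroring the (matrix name, j, value) triples produced by the mapper. The class name MatrixReduceSketch and this string encoding are hypothetical. The two hashmaps pair up the M and N entries that share the same j.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Framework-free sketch of the reduce step; names and value format are assumptions. */
class MatrixReduceSketch {
    /**
     * Computes one cell of the product from all values grouped under its key (i,k).
     * Assumed value format: matrixName,j,value with integer entries.
     */
    static int reduce(List<String> values) {
        Map<Integer, Integer> fromM = new HashMap<>(); // j -> M[i][j]
        Map<Integer, Integer> fromN = new HashMap<>(); // j -> N[j][k]
        for (String value : values) {
            String[] parts = value.split(",");
            int j = Integer.parseInt(parts[1]);
            int v = Integer.parseInt(parts[2]);
            if (parts[0].equals("M")) {
                fromM.put(j, v);
            } else {
                fromN.put(j, v);
            }
        }
        // Multiply the entries that share the same j and add up the products.
        int sum = 0;
        for (Map.Entry<Integer, Integer> e : fromM.entrySet()) {
            Integer n = fromN.get(e.getKey());
            if (n != null) {
                sum += e.getValue() * n;
            }
        }
        return sum;
    }
}
```

For example, with the illustrative values "M,0,1", "M,1,2", "N,0,5", "N,1,7" grouped under the key (0,0), the method returns 1*5 + 2*7 = 19.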

MAIN FUNCTION

EXECUTION

The Hadoop jar files are imported into the Java package. MatrixMultiplex.java contains the Mapper, Reducer, and Main classes. The package is then exported to a Java jar file, which is used for execution. The command "hadoop jar MatrixMultiplex.jar MatrixMultiplex input/matrix.txt MatrixOP1" runs the MapReduce job.

OUTPUT

The output matrix is then displayed in HUE.