ICP 3 - manaswinivedula/Big-Data-Programming GitHub Wiki

Matrix multiplication using Hadoop MapReduce

Manaswini Vedula ClassID-6

The two input matrices, M and N, are placed in the input folder on Cloudera.

Matrix 1:

Input is given in the format (M, i, j, M_ij), where i is the row, j is the column, and M_ij is the value in the ith row and jth column.

Matrix 2:

Input is given in the format (N, j, k, N_jk), where j is the row, k is the column, and N_jk is the value in the jth row and kth column.

Mapper class:

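The map function generates key-value pairs keyed by the output cell (i,k). For each element M_ij it emits the key (i,k) with the value (M, j, M_ij) for every column k of N, and for each element N_jk it emits the key (i,k) with the value (N, j, N_jk) for every row i of M. Each key (i,k) therefore receives the M and N values for all values of j.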
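The mapping step can be sketched in plain Java without the Hadoop classes. This is a minimal illustration, not the code in MatrixMultiply.jar: the class name `MatrixMapSketch`, the comma-separated line format (for example `M,0,1,5`), and the parameters `rowsOfM` and `colsOfN` are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class MatrixMapSketch {
    /** One emitted pair: key is "i,k", value is "M,j,v" or "N,j,v". */
    public static final class Pair {
        public final String key;
        public final String value;
        public Pair(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Maps one input line such as "M,0,1,5" or "N,1,0,7".
     * rowsOfM and colsOfN (assumed to be known up front) give the
     * dimensions of the product matrix.
     */
    public static List<Pair> map(String line, int rowsOfM, int colsOfN) {
        String[] f = line.trim().split("\\s*,\\s*");
        List<Pair> out = new ArrayList<>();
        if (f[0].equals("M")) {
            // M_ij contributes to every output cell in row i.
            String i = f[1], j = f[2], v = f[3];
            for (int k = 0; k < colsOfN; k++) {
                out.add(new Pair(i + "," + k, "M," + j + "," + v));
            }
        } else {
            // N_jk contributes to every output cell in column k.
            String j = f[1], k = f[2], v = f[3];
            for (int i = 0; i < rowsOfM; i++) {
                out.add(new Pair(i + "," + k, "N," + j + "," + v));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Emits one pair per column of N for the element M_01 = 5.
        for (Pair p : map("M,0,1,5", 2, 2)) {
            System.out.println(p.key + "\t" + p.value);
        }
    }
}
```

In a real Hadoop mapper the two dimensions would typically be read from the job configuration, and each pair would be written to the context instead of collected in a list.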

Reducer class:

For each (i,k) key, the reduce function separates the values tagged M (from matrix 1) from the values tagged N (from matrix 2). It then multiplies M_ij by N_jk for each j and sums the products, which gives the (i,k) entry of the result.
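The reduce step for a single (i,k) key can be sketched in plain Java as follows. This is an illustration under assumptions, not the code in MatrixMultiply.jar: the class name `MatrixReduceSketch` and the value encoding `"M,j,v"` / `"N,j,v"` are hypothetical, and the values are indexed by j in two hash maps so that matching entries can be paired.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MatrixReduceSketch {
    /**
     * Reduces all values for one output cell (i,k).
     * Each value is "M,j,v" or "N,j,v"; the result is the sum over j
     * of M_ij * N_jk.
     */
    public static double reduce(List<String> values) {
        Map<Integer, Double> fromM = new HashMap<>();
        Map<Integer, Double> fromN = new HashMap<>();
        for (String value : values) {
            String[] f = value.split(",");
            int j = Integer.parseInt(f[1].trim());
            double v = Double.parseDouble(f[2].trim());
            if (f[0].trim().equals("M")) {
                fromM.put(j, v);
            } else {
                fromN.put(j, v);
            }
        }
        double sum = 0.0;
        for (Map.Entry<Integer, Double> e : fromM.entrySet()) {
            Double n = fromN.get(e.getKey());
            if (n != null) {
                sum += e.getValue() * n; // a missing N_jk is treated as zero
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Values for cell (0,0) of [[1,2],[3,4]] x [[5,6],[7,8]]: 1*5 + 2*7
        System.out.println(reduce(Arrays.asList("M,0,1", "M,1,2", "N,0,5", "N,1,7")));
    }
}
```

Keying both maps by j means the order in which values arrive does not matter, which suits MapReduce because the framework does not guarantee an order for the values of a key.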

Main class:

It controls the flow of the job and runs the mapper and reducer classes.
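To show the control flow end to end, the sketch below runs the map, shuffle (group by key), and reduce phases in memory on two small 2x2 matrices. It is only a stand-in for what a Hadoop driver launches on the cluster: the class name `MatrixPipelineSketch`, the comma-separated input lines, and the sample matrices are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MatrixPipelineSketch {
    /** Runs map, shuffle, and reduce in memory; returns "i,k" -> product cell. */
    public static Map<String, Double> run(List<String> lines, int rowsOfM, int colsOfN) {
        // Map + shuffle: group every emitted value under its (i,k) key.
        Map<String, List<String[]>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] f = line.trim().split("\\s*,\\s*");
            boolean isM = f[0].equals("M");
            int copies = isM ? colsOfN : rowsOfM;
            for (int c = 0; c < copies; c++) {
                // M,i,j,v -> key (i,c); N,j,k,v -> key (c,k); j is the join index.
                String key = isM ? f[1] + "," + c : c + "," + f[2];
                String j = isM ? f[2] : f[1];
                grouped.computeIfAbsent(key, x -> new ArrayList<>())
                       .add(new String[] {f[0], j, f[3]});
            }
        }
        // Reduce: for each key, sum M_ij * N_jk over matching j.
        Map<String, Double> product = new TreeMap<>();
        for (Map.Entry<String, List<String[]>> e : grouped.entrySet()) {
            double sum = 0.0;
            for (String[] a : e.getValue()) {
                if (!a[0].equals("M")) continue;
                for (String[] b : e.getValue()) {
                    if (b[0].equals("N") && b[1].equals(a[1])) {
                        sum += Double.parseDouble(a[2]) * Double.parseDouble(b[2]);
                    }
                }
            }
            product.put(e.getKey(), sum);
        }
        return product;
    }

    public static void main(String[] args) {
        // M = [[1,2],[3,4]], N = [[5,6],[7,8]] in the (name, row, column, value) format.
        List<String> lines = Arrays.asList(
            "M,0,0,1", "M,0,1,2", "M,1,0,3", "M,1,1,4",
            "N,0,0,5", "N,0,1,6", "N,1,0,7", "N,1,1,8");
        run(lines, 2, 2).forEach((cell, v) -> System.out.println(cell + "\t" + v));
    }
}
```

On Hadoop, the grouping in the middle is done by the framework between the map and reduce phases, so a driver only needs to configure the job with the mapper class, the reducer class, and the input and output paths.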

Execution commands:

The MatrixMultiply.jar file is run with the paths of the input and output folders passed as arguments.

Output folder: This is the output folder in Hue.

Output: This is the product of the two matrices.

References:

https://lendap.wordpress.com/2015/02/16/matrix-multiplication-with-mapreduce/

https://github.com/autopear/Intellij-Hadoop/blob/master/README.md