ICP 3 WIKI - navyagonug/CS5590-BIG-DATA-PROGRAMMING-USING-HADOOP-AND-SPARK GitHub Wiki

PROBLEM STATEMENT

Create a Map-Reduce Program to perform the task of matrix multiplication.

FEATURES

The technologies used are IntelliJ IDEA (with Maven), Cloudera, VirtualBox, and Java. This in-class programming (ICP) assignment multiplies two matrices, each stored in its own file, and writes the result to a separate output file. The input and output file formats are shown in the screenshots below.

CONFIGURATIONS

The **pom.xml** file is modified in IntelliJ IDEA. The XML file is as follows.

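The file sets the model version to 4.0.0 and the project coordinates to groupId gid, artifactId aid, and version 1.0-SNAPSHOT. It declares a repository with the id apache at http://maven.apache.org, and two dependencies from the group org.apache.hadoop: hadoop-core 1.2.1 and hadoop-common 3.2.0.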

APPROACH

Two input files are given, named m.txt and n.txt. The screenshots for these are as follows.

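The matrices m and n are stored in the given input files. Matrix m is represented by tuples (i, j, mij) and matrix n by tuples (j, k, njk). In the Mapper phase, each m tuple is emitted as the key-value pair ((i,k), (m, j, mij)) once for every column k of n, and each n tuple is emitted as ((i,k), (n, j, njk)) once for every row i of m. These key-value pairs are passed to the Reducer phase, which processes one key (i,k) at a time. For each key, it divides the values into two separate lists, one for m and one for n, multiplies the mij and njk values that share the same j, and sums the products to obtain element (i,k) of the result.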
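The Mapper and Reducer logic described above can be sketched in plain Java, without the Hadoop API, to make the data flow easier to follow. This is only an illustrative simulation under stated assumptions, not the code used in this ICP: the class name `MatrixMultiplySketch`, the helper names `mapPhase`, `reduce`, and `multiply`, the zero-based indices, and the in-memory `int[]{row, col, value}` representation of the tuples are all hypothetical. The two per-key collections are kept as maps keyed by j rather than plain lists, so matching entries can be found by lookup.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MatrixMultiplySketch {

    // Intermediate value: source matrix tag ('M' or 'N'), shared index j, cell value.
    static final class Tagged {
        final char matrix;
        final int j;
        final int value;

        Tagged(char matrix, int j, int value) {
            this.matrix = matrix;
            this.j = j;
            this.value = value;
        }
    }

    // Map phase: send every tuple to each output cell (i,k) that needs it.
    // The returned map stands in for the shuffle step that groups values by key.
    static Map<String, List<Tagged>> mapPhase(int[][] m, int[][] n, int rowsM, int colsN) {
        Map<String, List<Tagged>> grouped = new TreeMap<>();
        for (int[] e : m) { // e = {i, j, mij}
            for (int k = 0; k < colsN; k++) {
                grouped.computeIfAbsent(e[0] + "," + k, key -> new ArrayList<>())
                       .add(new Tagged('M', e[1], e[2]));
            }
        }
        for (int[] e : n) { // e = {j, k, njk}
            for (int i = 0; i < rowsM; i++) {
                grouped.computeIfAbsent(i + "," + e[1], key -> new ArrayList<>())
                       .add(new Tagged('N', e[0], e[2]));
            }
        }
        return grouped;
    }

    // Reduce phase for one key (i,k): split the values by source matrix,
    // then sum mij * njk over every j present in both.
    static int reduce(List<Tagged> values) {
        Map<Integer, Integer> fromM = new HashMap<>();
        Map<Integer, Integer> fromN = new HashMap<>();
        for (Tagged t : values) {
            (t.matrix == 'M' ? fromM : fromN).put(t.j, t.value);
        }
        int sum = 0;
        for (Map.Entry<Integer, Integer> e : fromM.entrySet()) {
            Integer nValue = fromN.get(e.getKey());
            if (nValue != null) {
                sum += e.getValue() * nValue;
            }
        }
        return sum;
    }

    // Runs both phases and returns the product as "i,k" -> value.
    static Map<String, Integer> multiply(int[][] m, int[][] n, int rowsM, int colsN) {
        Map<String, Integer> product = new TreeMap<>();
        for (Map.Entry<String, List<Tagged>> group : mapPhase(m, n, rowsM, colsN).entrySet()) {
            product.put(group.getKey(), reduce(group.getValue()));
        }
        return product;
    }

    public static void main(String[] args) {
        int[][] m = {{0, 0, 1}, {0, 1, 2}, {1, 0, 3}, {1, 1, 4}}; // [[1,2],[3,4]]
        int[][] n = {{0, 0, 5}, {0, 1, 6}, {1, 0, 7}, {1, 1, 8}}; // [[5,6],[7,8]]
        System.out.println(multiply(m, n, 2, 2));
    }
}
```

For the two 2×2 matrices in `main`, the product cells are 19, 22, 43, and 50. In an actual Hadoop job the grouping done here by the `TreeMap` would be performed by the framework's shuffle, and the Mapper would need the dimensions of the other matrix (for example through the job configuration) to know how many keys to emit per tuple.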

The output file is as follows
