ICP 3 - Gnkhakimova/CS5590-BigData GitHub Wiki

ICP 3

Hadoop Matrix Multiplication and Hadoop Distributed File System (HDFS)

Source Code

Tasks

  1. Multiply two matrices using map and reduce functions
  2. Execute using HDFS/Hadoop

Configuration

  1. Oracle VirtualBox
  2. Cloudera
  3. IntelliJ IDEA

Features

For this task we created map and reduce functions that perform matrix multiplication and write the result to a separate file. The map function reads the input files, and the reduce function performs the multiplication.
1. Input files
Downloaded two input files containing 3x3 matrix values, uploaded them to the system, and ran the map and reduce functions on them.
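
The exact layout of the input files is not shown on this page. As a minimal sketch, assuming each line has the form `name,row,col,value` (for example `A,0,1,2`), the values could be parsed into a 3x3 array as follows. The class name `MatrixInputSketch`, the method `parse`, and the line format itself are hypothetical and used only for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: the "name,row,col,value" line format is an assumption,
// not the confirmed format of the input files used in this ICP.
class MatrixInputSketch {

    // Collects the cells of the matrix called `name` into a rows x cols array.
    static int[][] parse(List<String> lines, String name, int rows, int cols) {
        int[][] m = new int[rows][cols];
        for (String line : lines) {
            String[] parts = line.trim().split(",");
            // Skip blank or malformed lines and lines belonging to the other matrix.
            if (parts.length != 4 || !parts[0].equals(name)) {
                continue;
            }
            int row = Integer.parseInt(parts[1]);
            int col = Integer.parseInt(parts[2]);
            m[row][col] = Integer.parseInt(parts[3]);
        }
        return m;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("A,0,0,1", "A,0,1,2", "B,0,0,5");
        int[][] a = parse(lines, "A", 3, 3);
        System.out.println(Arrays.toString(a[0]));
    }
}
```

Cells that never appear in the input stay at 0, which suits sparse inputs but would hide missing lines in a dense 3x3 file.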


2. Implementation - Part 1
Ran the Matrix Multiply class, which reads the input files and passes the input values to the map function. The map function parses the values, and its output is then passed to the reduce function, which performs the multiplication.
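
To make the division of work between map and reduce concrete, here is a minimal in-memory sketch of the one-step approach described in the first reference. It is not the class used in this ICP and uses no Hadoop classes: the grouping that Hadoop performs during the shuffle is imitated with a `TreeMap`, and every name (`MatrixMapReduceSketch`, `mapPhase`, `reduce`, `multiply`) is hypothetical. It assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative simulation only: all names are hypothetical and no Hadoop API is used.
class MatrixMapReduceSketch {

    // "Map" phase: send each cell to every output position (i,k) it contributes to.
    // The TreeMap stands in for the grouping-by-key that the shuffle would do.
    static Map<String, List<String>> mapPhase(int[][] a, int[][] b) {
        int rowsA = a.length, shared = b.length, colsB = b[0].length;
        Map<String, List<String>> grouped = new TreeMap<>();
        for (int i = 0; i < rowsA; i++) {
            for (int j = 0; j < shared; j++) {
                // A[i][j] is needed by every output cell in row i.
                for (int k = 0; k < colsB; k++) {
                    grouped.computeIfAbsent(i + "," + k, key -> new ArrayList<>())
                           .add("A," + j + "," + a[i][j]);
                }
            }
        }
        for (int j = 0; j < shared; j++) {
            for (int k = 0; k < colsB; k++) {
                // B[j][k] is needed by every output cell in column k.
                for (int i = 0; i < rowsA; i++) {
                    grouped.computeIfAbsent(i + "," + k, key -> new ArrayList<>())
                           .add("B," + j + "," + b[j][k]);
                }
            }
        }
        return grouped;
    }

    // "Reduce" phase for one key (i,k): pair A and B values on the shared index j
    // and sum the products.
    static int reduce(List<String> values, int shared) {
        int[] fromA = new int[shared];
        int[] fromB = new int[shared];
        for (String v : values) {
            String[] parts = v.split(",");
            int j = Integer.parseInt(parts[1]);
            int val = Integer.parseInt(parts[2]);
            if (parts[0].equals("A")) {
                fromA[j] = val;
            } else {
                fromB[j] = val;
            }
        }
        int sum = 0;
        for (int j = 0; j < shared; j++) {
            sum += fromA[j] * fromB[j];
        }
        return sum;
    }

    // Runs the map phase, then one reduce call per output cell.
    static int[][] multiply(int[][] a, int[][] b) {
        int[][] c = new int[a.length][b[0].length];
        for (Map.Entry<String, List<String>> e : mapPhase(a, b).entrySet()) {
            String[] pos = e.getKey().split(",");
            c[Integer.parseInt(pos[0])][Integer.parseInt(pos[1])] = reduce(e.getValue(), b.length);
        }
        return c;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        int[][] identity = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}};
        for (int[] row : multiply(a, identity)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

In this scheme each input cell is emitted three times for 3x3 matrices, once per output cell it feeds, so the reducer for a given output position receives exactly the row of the first matrix and the column of the second that it needs.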

3. Output
The output file contains the product of the two matrices, which is also a 3x3 matrix.
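
One way to sanity-check the job's output is to compare it against a plain triple-loop multiplication of the same inputs. The sketch below is such a reference computation; the class and method names are hypothetical, and the sample values are illustrative rather than the matrices used in this ICP.

```java
import java.util.Arrays;

// Hypothetical reference implementation for checking a MapReduce result by hand.
class MatrixCheckSketch {

    // Standard definition: c[i][k] is the sum over j of a[i][j] * b[j][k].
    static int[][] multiplyDirect(int[][] a, int[][] b) {
        int[][] c = new int[a.length][b[0].length];
        for (int i = 0; i < a.length; i++) {
            for (int k = 0; k < b[0].length; k++) {
                for (int j = 0; j < b.length; j++) {
                    c[i][k] += a[i][j] * b[j][k];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        // Illustrative 3x3 inputs; a 3x3 times 3x3 product is again 3x3.
        int[][] a = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        int[][] b = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}};
        for (int[] row : multiplyDirect(a, b)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```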

Limitation

  1. Had an issue with the Java version, which conflicted with the IntelliJ IDEA IDE.

Reference

  1. https://lendap.wordpress.com/2015/02/16/matrix-multiplication-with-mapreduce/
  2. https://www.jetbrains.com/help/idea/creating-and-running-your-first-java-application.html