ICP2_2 - Hiresh12/Big-Data-Programming GitHub Wiki

Apache Spark II

Task:

Write map-reduce programs for merge sort and depth-first search (DFS) in Spark.

Features:

  • Spark
  • Map-Reduce
  • Scala
  • IntelliJ

Tasks:

  1. Merge Sort Algorithm

Merge sort is a divide-and-conquer algorithm. It divides the input array into two halves, calls itself on each half, and then merges the two sorted halves. The merge() function does the merging: merge(arr, l, m, r) is the key step, which assumes that arr[l..m] and arr[m+1..r] are already sorted and merges these two sorted sub-arrays into one.
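The description above can be sketched in plain Scala as follows. This is a minimal illustration, not necessarily the implementation used in this ICP: the object name `MergeSortSketch` and the helper `sorted` are hypothetical, and the sketch assumes integer input.

```scala
object MergeSortSketch {
  // Merges the sorted runs arr(l..m) and arr(m+1..r) back into arr, in place.
  def merge(arr: Array[Int], l: Int, m: Int, r: Int): Unit = {
    val left = arr.slice(l, m + 1)       // copy of arr(l..m)
    val right = arr.slice(m + 1, r + 1)  // copy of arr(m+1..r)
    var i = 0
    var j = 0
    var k = l
    // Repeatedly take the smaller head element of the two runs.
    while (i < left.length && j < right.length) {
      if (left(i) <= right(j)) { arr(k) = left(i); i += 1 }
      else { arr(k) = right(j); j += 1 }
      k += 1
    }
    // Copy whatever remains in either run.
    while (i < left.length) { arr(k) = left(i); i += 1; k += 1 }
    while (j < right.length) { arr(k) = right(j); j += 1; k += 1 }
  }

  // Sorts arr(l..r) by splitting at the midpoint, recursing, then merging.
  def mergeSort(arr: Array[Int], l: Int, r: Int): Unit =
    if (l < r) {
      val m = l + (r - l) / 2
      mergeSort(arr, l, m)
      mergeSort(arr, m + 1, r)
      merge(arr, l, m, r)
    }

  // Hypothetical convenience wrapper: returns a sorted copy of the input.
  def sorted(input: Seq[Int]): Array[Int] = {
    val arr = input.toArray
    mergeSort(arr, 0, arr.length - 1)
    arr
  }
}
```

In a Spark job, a function of this shape could be applied to each record inside a `map` call, assuming every record holds its own sequence of integers.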

Merge Sort Implementation (User Defined Function)

Calling Merge Sort Method from Map

Merge Sort Output:

Merge Sort Explanation

  2. Depth-first search (DFS)

Depth-first search (DFS) is an algorithm for traversing or searching tree or graph data structures. The algorithm starts at the root node (selecting some arbitrary node as the root node in the case of a graph) and explores as far as possible along each branch before backtracking.
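A minimal Scala sketch of this traversal is shown below. It is an illustration under stated assumptions rather than the code used in this ICP: the graph is assumed to be an adjacency list keyed by integer vertex ids, and the names `DfsSketch`, `Graph`, and `dfs` are hypothetical. An explicit stack stands in for recursion, and a visited set keeps cycles from being re-explored.

```scala
object DfsSketch {
  // Assumed representation: vertex id -> neighbours, in the order they should be explored.
  type Graph = Map[Int, List[Int]]

  // Returns the vertices reachable from `start` in depth-first visiting order.
  def dfs(graph: Graph, start: Int): List[Int] = {
    @annotation.tailrec
    def loop(stack: List[Int], visited: Set[Int], order: List[Int]): List[Int] =
      stack match {
        case Nil => order.reverse                  // nothing left to explore
        case v :: rest if visited(v) =>            // already seen: skip it
          loop(rest, visited, order)
        case v :: rest =>
          // Push v's neighbours on top of the stack so the current branch
          // is followed as deep as possible before backtracking to `rest`.
          val next = graph.getOrElse(v, Nil)
          loop(next ::: rest, visited + v, v :: order)
      }
    loop(List(start), Set.empty, Nil)
  }
}
```

For example, with the graph `Map(1 -> List(2, 3), 2 -> List(4), 3 -> List(4), 4 -> Nil)` and start vertex 1, the traversal follows 1, 2, 4 to the end of that branch and only then backtracks to visit 3.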

DFS Implementation

Calling DFS from Main method

DFS Input

DFS Output

DFS Explanation
