ICP 9 - PallaviArikatla/Big-Data-Programming GitHub Wiki
INTRODUCTION: This ICP implements a clustering algorithm, a sorting algorithm, and a graph traversal using Spark and Scala.
Software Required:
- Spark.
- IntelliJ with the Scala plugin installed.
 
IMPLEMENTATION:
Question 1: K-Means Clustering Algorithm.
K-Means clustering partitions a set of observations into K groups of similar observations. Each observation is assigned to the cluster whose mean (centroid) is nearest.
- First choose the number of clusters, K; here the value is picked arbitrarily.
- Take a dataset as input and select the range of input values to use.
- Remove the header row, then use K-Means to cluster the data into classes.
- Calculate the mean squared error and the centroid of each cluster.
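
The steps above can be sketched in plain Scala, without Spark, to show what the algorithm does at each stage. This is only an illustration under stated assumptions: the object `KMeansSketch`, its method names, the CSV-style input, and the choice of the first K rows as initial centroids (made so the run is reproducible) are all hypothetical and are not taken from the original code.

```scala
object KMeansSketch {
  type Point = Vector[Double]

  // Drop the header row and parse each remaining CSV line into a point.
  def parse(lines: Seq[String]): Vector[Point] =
    lines.drop(1).map(_.split(",").map(_.trim.toDouble).toVector).toVector

  // Squared Euclidean distance between two points of equal dimension.
  def sqDist(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Index of the centroid nearest to p.
  def nearest(p: Point, centroids: Vector[Point]): Int =
    centroids.indices.minBy(i => sqDist(p, centroids(i)))

  // Component-wise mean of a non-empty group of points.
  def mean(points: Seq[Point]): Point =
    points.transpose.map(col => col.sum / col.size).toVector

  // Assumption: the first k points serve as the initial centroids.
  def fit(data: Vector[Point], k: Int, iterations: Int): Vector[Point] = {
    var centroids = data.take(k)
    for (_ <- 1 to iterations) {
      // Assignment step: group every point under its nearest centroid.
      val groups = data.groupBy(p => nearest(p, centroids))
      // Update step: move each centroid to the mean of its group;
      // a centroid with no points keeps its previous position.
      centroids = centroids.indices
        .map(i => groups.get(i).map(mean).getOrElse(centroids(i)))
        .toVector
    }
    centroids
  }

  // Mean of the squared distances from each point to its nearest centroid.
  def meanSquaredError(data: Vector[Point], centroids: Vector[Point]): Double =
    data.map(p => sqDist(p, centroids(nearest(p, centroids)))).sum / data.size
}
```

For example, `KMeansSketch.fit(KMeansSketch.parse(lines), 2, 10)` returns two centroids for a small two-column input. In a Spark job the same steps would typically go through MLlib's `KMeans` on distributed data rather than a hand-written loop like this one.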
 
OUTPUT:
Question 2: Merge Sort.
- Write a merge sort method that takes a list as input.
- Compute the middle index by dividing the length of the list by 2.
- If the middle index is zero, the list has at most one element, so the method returns the same input list.
- Otherwise split the list at the middle index, call the merge sort method on each half, and merge the two sorted halves into a single list.
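
A minimal sketch of these steps, assuming a list of integers. The names `MergeSortSketch`, `mergeSort`, and `merge` are illustrative and may differ from the original code.

```scala
object MergeSortSketch {
  def mergeSort(xs: List[Int]): List[Int] = {
    val mid = xs.length / 2
    // A middle index of zero means the list has zero or one element.
    if (mid == 0) xs
    else {
      val (left, right) = xs.splitAt(mid)
      merge(mergeSort(left), mergeSort(right))
    }
  }

  // Combine two sorted lists into one sorted list.
  // Plain recursion keeps the sketch short; it is not stack-safe for very long lists.
  def merge(left: List[Int], right: List[Int]): List[Int] = (left, right) match {
    case (Nil, _) => right
    case (_, Nil) => left
    case (l :: ls, r :: rs) =>
      if (l <= r) l :: merge(ls, right) else r :: merge(left, rs)
  }
}
```

Calling `MergeSortSketch.mergeSort(List(5, 2, 9, 1, 5, 6))` yields `List(1, 2, 5, 5, 6, 9)`.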
 
OUTPUT:
Question 3: Depth First search.
- Depth-first search lets us determine whether a path exists between one node and another.
- The input graph is given as an adjacency list: 1 -> List(7,9), 7 -> List(1,8), 8 -> List(7,9), 9 -> List(1,8)
- The traversal starts at node 1, which is passed to the DFS method.
- The function keeps moving to a neighboring node as long as that node has not been visited yet.
- The resulting output is listed under OUTPUT.
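
One way to sketch this traversal is shown here, using the adjacency list from the steps above. The names `DfsSketch`, `dfs`, and `hasPath` are hypothetical, and returning the visit order as a list is an assumption about how the result is reported.

```scala
object DfsSketch {
  // Returns the nodes reachable from start, in the order they are first visited.
  def dfs(graph: Map[Int, List[Int]], start: Int): List[Int] = {
    // visited holds the nodes seen so far, most recent first.
    def visit(node: Int, visited: List[Int]): List[Int] =
      if (visited.contains(node)) visited // already seen: do not descend again
      else graph.getOrElse(node, Nil)
        .foldLeft(node :: visited)((acc, next) => visit(next, acc))

    visit(start, Nil).reverse
  }

  // A path exists from one node to another if the traversal reaches the target.
  def hasPath(graph: Map[Int, List[Int]], from: Int, to: Int): Boolean =
    dfs(graph, from).contains(to)
}

object DfsExample {
  val graph: Map[Int, List[Int]] =
    Map(1 -> List(7, 9), 7 -> List(1, 8), 8 -> List(7, 9), 9 -> List(1, 8))
}
```

With this graph, `DfsSketch.dfs(DfsExample.graph, 1)` visits the nodes in the order `List(1, 7, 8, 9)`: from 1 it goes to 7, then 8, then 9, and skips any neighbor it has already seen.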
 
OUTPUT: