ICP 9 - Gnkhakimova/CS5590-BigData GitHub Wiki
ICP 9
Tasks
- Perform Merge sort using Spark
- Perform DFS
Configuration
- Linux Mint
- IntelliJ
- Apache Spark
Features
In this ICP 9 we used the IntelliJ IDE to complete two tasks: we performed a merge sort on Spark by defining our own functions, and we implemented depth-first search (DFS) for a graph.
Merge Sort
For merge sort we created a list of unsorted integers and parallelized it into an RDD. We then wrote two functions, one that sorts a list and one that merges sorted lists, and called them through the RDD.
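The sort-and-merge idea can be sketched as follows. This is a minimal illustration in plain Python, not the code from this ICP: the names `merge`, `merge_sort`, `unsorted`, and `partitions` are hypothetical, the input values are made up, and the partitions of an RDD are stood in for by ordinary lists so the example runs without a Spark installation.

```python
from functools import reduce


def merge(left, right):
    """Merge two already-sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these slices is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(values):
    """Sort an iterable by splitting it in half, sorting each half, and merging."""
    values = list(values)
    if len(values) <= 1:
        return values
    mid = len(values) // 2
    return merge(merge_sort(values[:mid]), merge_sort(values[mid:]))


# Hypothetical input data (not the list used in the ICP).
unsorted = [38, 27, 43, 3, 9, 82, 10]

# Stand-in for RDD partitions: each sub-list plays the role of one partition.
partitions = [unsorted[0:3], unsorted[3:5], unsorted[5:7]]

# Sort every partition independently, then merge the sorted partitions.
sorted_parts = [merge_sort(p) for p in partitions]
result = reduce(merge, sorted_parts, [])
print(result)  # [3, 9, 10, 27, 38, 43, 82]

# Assumption: with PySpark and a SparkContext `sc`, the same two functions
# could be passed to the RDD roughly like this:
#   sc.parallelize(unsorted, 3) \
#     .mapPartitions(lambda it: [merge_sort(it)]) \
#     .reduce(merge)
```

Sorting each partition separately and merging afterwards keeps the per-partition work independent, which is what lets Spark run it in parallel.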
Output - Sorted List:
Depth First Search
For DFS we were given a graph and had to traverse it by visiting each node. DFS starts at the root node (for a graph, an arbitrarily chosen node) and explores as far as possible along each branch before backtracking.
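The traversal described above can be sketched as follows. This is a minimal illustration in plain Python under stated assumptions: the function `dfs`, the adjacency-list representation, and the sample `graph` are hypothetical and are not the graph or code used in this ICP.

```python
def dfs(graph, start):
    """Return the nodes reachable from `start` in depth-first visiting order."""
    visited = []      # nodes in the order they were first visited
    seen = set()      # fast membership check for visited nodes
    stack = [start]   # explicit stack in place of recursion
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        visited.append(node)
        # Push neighbors in reverse so the first listed neighbor is explored first.
        for neighbor in reversed(graph.get(node, [])):
            if neighbor not in seen:
                stack.append(neighbor)
    return visited


# Hypothetical directed graph as an adjacency list.
graph = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F"],
    "D": [],
    "E": ["F"],
    "F": [],
}

order = dfs(graph, "A")
print(order)  # ['A', 'B', 'D', 'E', 'F', 'C']
```

Starting from `A`, the search follows the `B` branch down to `D`, backtracks to `E` and `F`, and only then visits `C`, whose neighbor `F` has already been seen.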
Output - Visited Nodes and their order:
Limitations
- We had to do more research on RDDs and on how to pass a function to an RDD operation.