M2 ICP6 - bhargavi1411/BigDataProgramming GitHub Wiki
Name : Bhargavi Saipoojitha Chennupati
Class id : 4
Topic : GraphX and GraphFrames
1. Import the dataset as a CSV file, create data frames directly on import, then create a graph from the data frames.
We imported the datasets trip_data.csv and station_data.csv and created a data frame for each of them.
Edges Output :
A temporary view is created to store the edges, and its contents are displayed.
Vertices Output :
A temporary view is created to store the vertices, and its contents are displayed.
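As a rough illustration of the import step, the sketch below uses plain Python instead of Spark to turn two small CSV tables into vertex and edge records. The column names (`name`, `dockcount`, `Start Station`, `End Station`) and all rows are assumptions made up for this example, not values taken from the real files. The only convention carried over is that a GraphFrames graph expects an `id` column for vertices and `src` and `dst` columns for edges.

```python
import csv
import io

# Tiny inline stand-ins for station_data.csv and trip_data.csv.
# Column names and rows are made up for this sketch.
STATIONS_CSV = """name,dockcount
Japantown,15
Station B,19
Station C,11
"""
TRIPS_CSV = """Start Station,End Station
Japantown,Station B
Station B,Japantown
Station B,Station C
"""

def read_rows(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

# Vertices carry an "id"; edges carry "src" and "dst".
vertices = [
    {"id": row["name"], "dockcount": int(row["dockcount"])}
    for row in read_rows(STATIONS_CSV)
]
edges = [
    {"src": row["Start Station"], "dst": row["End Station"]}
    for row in read_rows(TRIPS_CSV)
]

print(len(vertices), "vertices,", len(edges), "edges")
```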
2. Triangle Count
Here we calculate the number of triangles passing through each vertex using the triangle count function.
Output :
It displays the id and triangle count of each vertex.
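To make clear what a per-vertex triangle count means, here is a minimal plain-Python sketch. It is not the library call used in the assignment. It assumes edges are treated as undirected and that self-loops and duplicate edges are ignored, and the vertex names are placeholders.

```python
from itertools import combinations

def triangle_count(edges):
    """Return {vertex: number of triangles passing through it}."""
    # Build an undirected neighbor set per vertex, ignoring self-loops.
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, set())
        neighbors.setdefault(dst, set())
        if src != dst:
            neighbors[src].add(dst)
            neighbors[dst].add(src)

    counts = {}
    for vertex, nbrs in neighbors.items():
        # A triangle through `vertex` is a pair of its neighbors
        # that are also connected to each other.
        counts[vertex] = sum(
            1 for a, b in combinations(sorted(nbrs), 2) if b in neighbors[a]
        )
    return counts

# Placeholder graph: A-B-C form a triangle, D hangs off C.
counts = triangle_count([("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")])
for vertex_id in sorted(counts):
    print(vertex_id, counts[vertex_id])
```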
3. Find Shortest Paths w.r.t. Landmarks
We selected Japantown and Santa Clara County Civic Centre as landmarks and found the shortest paths to them.
Output :
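To illustrate the shortest-paths step, the following plain-Python sketch computes, for every vertex, the number of hops to each landmark by running a breadth-first search backwards from the landmark. It assumes directed edges and unweighted hop counts. The edges and the names "Station B" and "Station C" are made up for the example.

```python
from collections import deque

def shortest_paths(edges, landmarks):
    """Return {vertex: {landmark: hop count}} for reachable landmarks."""
    reverse = {}      # dst -> set of vertices with an edge into dst
    vertices = set()
    for src, dst in edges:
        vertices.update((src, dst))
        reverse.setdefault(dst, set()).add(src)

    distances = {v: {} for v in vertices}
    for landmark in landmarks:
        if landmark not in vertices:
            continue
        distances[landmark][landmark] = 0
        queue = deque([landmark])
        while queue:
            current = queue.popleft()
            # Walk edges backwards: anything pointing at `current`
            # is one hop further away from the landmark.
            for prev in reverse.get(current, ()):
                if landmark not in distances[prev]:
                    distances[prev][landmark] = distances[current][landmark] + 1
                    queue.append(prev)
    return distances

# Made-up trips between stations.
trips = [
    ("Station B", "Japantown"),
    ("Station C", "Station B"),
    ("Japantown", "Santa Clara County Civic Centre"),
]
paths = shortest_paths(trips, ["Japantown", "Santa Clara County Civic Centre"])
for station in sorted(paths):
    print(station, paths[station])
```

A vertex that cannot reach a landmark simply has no entry for it in its distance map.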
4. Apply the PageRank algorithm to the dataset.
PageRank works by counting the number and quality of links to a page to produce a rough estimate of how important that page is. Applied to this graph, the vertices play the role of pages and the edges play the role of links.
The probability is reset from 0.15 to 0.01.
Output :
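The sketch below shows the textbook, normalized form of PageRank in plain Python, to make the role of the reset probability concrete. It is an illustration under assumptions, not the library implementation: the ranks sum to 1, rank from vertices without outgoing edges is spread evenly, and a fixed number of iterations is used. Library implementations may scale ranks differently. The graph and the parameter values are placeholders.

```python
def pagerank(edges, reset_probability=0.15, iterations=20):
    """Return {vertex: rank}, with ranks summing to 1."""
    vertices = sorted({v for edge in edges for v in edge})
    out_links = {v: [] for v in vertices}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(vertices)
    ranks = {v: 1.0 / n for v in vertices}
    for _ in range(iterations):
        # Rank held by vertices with no outgoing edges is shared by all.
        dangling = sum(ranks[v] for v in vertices if not out_links[v])
        base = reset_probability / n + (1 - reset_probability) * dangling / n
        new_ranks = {v: base for v in vertices}
        # Every vertex passes its rank evenly along its outgoing edges.
        for src in vertices:
            for dst in out_links[src]:
                share = ranks[src] / len(out_links[src])
                new_ranks[dst] += (1 - reset_probability) * share
        ranks = new_ranks
    return ranks

# Placeholder graph: a cycle A -> B -> C -> A, plus D -> A.
ranks = pagerank([("A", "B"), ("B", "C"), ("C", "A"), ("D", "A")])
for vertex_id in sorted(ranks, key=ranks.get, reverse=True):
    print(vertex_id, round(ranks[vertex_id], 4))
```

A vertex with no incoming edges, such as D here, keeps only the share handed out by the reset probability, which is why lowering that value pushes more of the rank onto well-connected vertices.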
5. Save the generated graphs to a file.
The vertices and edges of the generated graph are saved in separate vertices and edges folders, both inside the Graph1 folder.
Output :
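The folder layout described for this step can be sketched in plain Python as follows. This is only an illustration of the layout, not the Spark write call used in the assignment. The file name `part-0.csv`, the CSV format, and the sample records are assumptions, and the sketch writes into a temporary directory so it leaves nothing behind.

```python
import csv
import tempfile
from pathlib import Path

def save_graph(vertices, edges, root):
    """Write vertices and edges as CSV into root/vertices and root/edges."""
    root = Path(root)
    for name, rows in (("vertices", vertices), ("edges", edges)):
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        # "part-0.csv" is a hypothetical file name chosen for this sketch.
        with open(folder / "part-0.csv", "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
    return root

# Made-up sample records.
vertices = [{"id": "Japantown"}, {"id": "Station B"}]
edges = [{"src": "Japantown", "dst": "Station B"}]

with tempfile.TemporaryDirectory() as tmp:
    graph_dir = save_graph(vertices, edges, Path(tmp) / "Graph1")
    saved = sorted(
        path.relative_to(graph_dir).as_posix() for path in graph_dir.rglob("*.csv")
    )

print(saved)
```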
BONUS :
1. Apply the Label Propagation Algorithm
The algorithm is run for a maximum of 5 iterations, and the id and label columns are displayed.
Output:
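For readers unfamiliar with label propagation, here is a minimal plain-Python sketch of the idea: every vertex starts with its own label and repeatedly adopts the most common label among its neighbors. The sketch assumes undirected edges, synchronous updates, and ties broken by the smallest label. A library implementation may break ties differently, so its labels can differ. The graph is a placeholder.

```python
from collections import Counter

def label_propagation(edges, max_iter=5):
    """Return {vertex: community label} after at most max_iter rounds."""
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)

    # Every vertex starts in its own community.
    labels = {v: v for v in neighbors}
    for _ in range(max_iter):
        updated = {}
        for vertex, nbrs in neighbors.items():
            votes = Counter(labels[n] for n in nbrs)
            best = max(votes.values())
            # Tie-break on the smallest label to keep the result deterministic.
            updated[vertex] = min(l for l, c in votes.items() if c == best)
        if updated == labels:
            break
        labels = updated
    return labels

# Placeholder graph: two separate triangles.
labels = label_propagation(
    [("A", "B"), ("B", "C"), ("C", "A"), ("X", "Y"), ("Y", "Z"), ("Z", "X")],
    max_iter=5,
)
for vertex_id in sorted(labels):
    print(vertex_id, labels[vertex_id])
```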
2. Apply the BFS algorithm
BFS (breadth-first search) finds the shortest path, measured in number of edges, from one vertex to another vertex.
We run BFS from the vertex with id = Japantown to vertices whose dock count is less than 15.
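The search described above can be sketched in plain Python as follows. This is an illustration rather than the library call: it assumes directed edges and returns only the first shortest path found. The dock counts, the edges, and the names "Station B" and "Station C" are made up for the example.

```python
from collections import deque

def bfs(edges, attributes, start, is_target):
    """Return the shortest path from start to a vertex matching is_target."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    parents = {start: None}   # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if is_target(attributes[current]):
            # Walk the parent links back to the start to rebuild the path.
            path = []
            while current is not None:
                path.append(current)
                current = parents[current]
            return path[::-1]
        for nxt in adjacency.get(current, []):
            if nxt not in parents:
                parents[nxt] = current
                queue.append(nxt)
    return None

# Made-up dock counts and trips.
dock_counts = {"Japantown": 15, "Station B": 19, "Station C": 11}
trips = [("Japantown", "Station B"), ("Station B", "Station C")]

path = bfs(trips, dock_counts, "Japantown", lambda docks: docks < 15)
print(path)
```

If no vertex satisfies the condition, or none is reachable from the start, the function returns `None`.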