ICP 12 - bhargavi1411/BigDataProgramming GitHub Wiki

Name : Bhargavi Saipoojitha Chennupati

Class Id : 4

Topic : Graph Frames and GraphX

Part –1:

1.Import the dataset as a CSV file and create data frames directly on import, then create a graph out of the data frames created.
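
A minimal plain-Python sketch of the import step, not the code used in this ICP. The sample text and the column names `trip_id`, `start_station` and `end_station` are made up for illustration. With PySpark, the comparable call is `spark.read.csv(path, header=True)`.

```python
import csv
import io

# Toy stand-in for the trip CSV; the column names are hypothetical.
SAMPLE_CSV = """trip_id,start_station,end_station
1,A,B
2,A,C
3,B,C
"""

def load_rows(text):
    """Parse CSV text into a list of dict rows, one per trip."""
    return list(csv.DictReader(io.StringIO(text)))

rows = load_rows(SAMPLE_CSV)
```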

2.Concatenate the chunks into a list and convert it to a DataFrame
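
This step reads like the usual pandas pattern of `pd.read_csv(..., chunksize=...)` followed by `pd.concat`, but that is an assumption. The sketch below shows the same idea with the standard library only: read the file in fixed-size chunks, then flatten the chunks into one list of rows. The sample data and the helper name `read_in_chunks` are hypothetical.

```python
import csv
import io
from itertools import islice

def read_in_chunks(text, chunk_size):
    """Yield lists of at most chunk_size dict rows from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk

# Made-up data: five rows read two at a time.
SAMPLE = "src,dst\nA,B\nA,C\nB,C\nC,A\nA,B\n"
chunks = list(read_in_chunks(SAMPLE, 2))

# Concatenate the chunks back into a single list of rows.
all_rows = [row for chunk in chunks for row in chunk]
```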

Output :

3.Remove duplicates
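
A small sketch of duplicate removal on a list of dict rows, keeping the first occurrence of each row. The rows are made up. On a Spark DataFrame the comparable call is `dropDuplicates()`, and on a pandas DataFrame it is `drop_duplicates()`.

```python
def drop_duplicate_rows(rows):
    """Return rows with exact duplicates removed, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        # Dicts are not hashable, so use a sorted tuple of items as the key.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# Made-up rows; the third repeats the first.
rows = [
    {"src": "A", "dst": "B"},
    {"src": "A", "dst": "C"},
    {"src": "A", "dst": "B"},
]
unique_rows = drop_duplicate_rows(rows)
```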

Output :

4.Name Columns
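
GraphFrames expects the vertex DataFrame to have an `id` column and the edge DataFrame to have `src` and `dst` columns, so the original columns usually need renaming. The sketch below does this on plain dict rows. The original column names are hypothetical. On a Spark DataFrame the comparable call is `withColumnRenamed`.

```python
# Hypothetical mapping from dataset column names to GraphFrames names.
RENAMES = {"start_station": "src", "end_station": "dst"}

def rename_columns(rows, renames):
    """Return new rows with keys renamed; unknown keys are kept as-is."""
    return [
        {renames.get(key, key): value for key, value in row.items()}
        for row in rows
    ]

rows = [{"trip_id": "1", "start_station": "A", "end_station": "B"}]
edges = rename_columns(rows, RENAMES)
```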

Output :

5.Output DataFrame

Output :

6.Create vertices
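
One way to build the vertex set is to take every distinct value that appears as a source or a destination in the edges. This is a plain-Python sketch with made-up station ids. In GraphFrames the vertices and edges would then be passed to `GraphFrame(vertices, edges)`.

```python
def build_vertices(edges):
    """Collect the distinct endpoints of all edges as vertex rows."""
    ids = {edge["src"] for edge in edges} | {edge["dst"] for edge in edges}
    return [{"id": station} for station in sorted(ids)]

# Made-up edges between hypothetical stations.
edges = [
    {"src": "A", "dst": "B"},
    {"src": "B", "dst": "C"},
    {"src": "A", "dst": "C"},
]
vertices = build_vertices(edges)
```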

Output :

7.Show some vertices

Output :

8.Show some edges

Output :

9.Vertex in-Degree
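
The in-degree of a vertex is the number of edges that end at it. In GraphFrames, `g.inDegrees` returns a DataFrame with `id` and `inDegree` columns. The sketch below computes the same counts in plain Python on made-up edges.

```python
from collections import Counter

def in_degrees(edges):
    """Count incoming edges per vertex from (src, dst) pairs."""
    return Counter(dst for _src, dst in edges)

# Made-up trips between hypothetical stations.
edges = [("A", "B"), ("C", "B"), ("B", "C")]
in_deg = in_degrees(edges)
```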

Output :

10.Vertex out-Degree
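
The out-degree of a vertex is the number of edges that start from it. In GraphFrames, `g.outDegrees` returns a DataFrame with `id` and `outDegree` columns. A plain-Python sketch on made-up edges:

```python
from collections import Counter

def out_degrees(edges):
    """Count outgoing edges per vertex from (src, dst) pairs."""
    return Counter(src for src, _dst in edges)

# Made-up trips between hypothetical stations.
edges = [("A", "B"), ("A", "C"), ("B", "C")]
out_deg = out_degrees(edges)
```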

Output :

11.Apply motif finding.
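
Motif finding searches the graph for a structural pattern. In GraphFrames, a pattern such as `g.find("(a)-[e]->(b); (b)-[e2]->(a)")` looks for pairs of vertices connected in both directions. The motif actually used in this ICP is not shown, so the sketch below is only an illustration of that round-trip pattern in plain Python. It skips self-loops, which is a choice made for this sketch.

```python
def find_round_trips(edges):
    """Return (a, b) pairs where both a->b and b->a exist, a != b."""
    edge_set = set(edges)
    return sorted(
        (a, b) for (a, b) in edge_set if a != b and (b, a) in edge_set
    )

# Made-up edges: A and B are connected both ways, C has a self-loop.
edges = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "C")]
round_trips = find_round_trips(edges)
```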

Output :

Bonus :

1.Vertex degree
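
The degree of a vertex is its in-degree plus its out-degree. In GraphFrames this is available as `g.degrees`. A plain-Python sketch on made-up edges:

```python
from collections import Counter

def degrees(edges):
    """Count all edges touching each vertex (incoming plus outgoing)."""
    counts = Counter()
    for src, dst in edges:
        counts[src] += 1
        counts[dst] += 1
    return counts

# Made-up trips between hypothetical stations.
edges = [("A", "B"), ("A", "C"), ("B", "C")]
deg = degrees(edges)
```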

Output :

2.What are the most common destinations in the dataset from location to location?
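
One way to answer this is to count how often each (source, destination) pair occurs and sort by that count. With GraphFrames this could be written as `g.edges.groupBy("src", "dst").count()` followed by a descending sort. The sketch below does the same on made-up trips in plain Python.

```python
from collections import Counter

def top_routes(edges, n):
    """Return the n most frequent (src, dst) pairs with their counts."""
    return Counter(edges).most_common(n)

# Made-up trips: A->B occurs three times, B->C twice, C->A once.
edges = [
    ("A", "B"), ("A", "B"), ("A", "B"),
    ("B", "C"), ("B", "C"),
    ("C", "A"),
]
top = top_routes(edges, 2)
```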

Output :

3.What is the station with the highest ratio of in-degree to out-degree? That is, which station acts as almost a pure trip sink: a station where trips end but rarely start?
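
A sketch of the ratio computation, assuming the ratio is in-degree divided by out-degree. Stations that have no incoming or no outgoing trips are left out, which mirrors what an inner join of `g.inDegrees` and `g.outDegrees` on `id` would do. The station ids and trips are made up.

```python
from collections import Counter

def sink_ratios(edges):
    """Return (station, in/out ratio) pairs, highest ratio first."""
    in_deg = Counter(dst for _src, dst in edges)
    out_deg = Counter(src for src, _dst in edges)
    # Keep only stations that appear as both a source and a destination.
    both = set(in_deg) & set(out_deg)
    ratios = [(station, in_deg[station] / out_deg[station]) for station in both]
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)

# Made-up trips: most of them end at D, only one starts there.
edges = [
    ("A", "D"), ("B", "D"), ("C", "D"),
    ("D", "A"), ("A", "B"), ("B", "A"),
]
ranking = sink_ratios(edges)
```

The first entry of `ranking` is the station that behaves most like a trip sink.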

Output :

4.Save graphs generated to a file.
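
A GraphFrame is a pair of DataFrames, so saving it comes down to saving the vertices and the edges, for example with `g.vertices.write.parquet(path)` and `g.edges.write.parquet(path)`. The format used in this ICP is not shown. The sketch below writes made-up vertices and edges to two CSV files in a temporary directory, using hypothetical file names.

```python
import csv
import os
import tempfile

def save_graph(vertices, edges, directory):
    """Write vertices and edges to CSV files and return both paths."""
    v_path = os.path.join(directory, "vertices.csv")
    e_path = os.path.join(directory, "edges.csv")
    with open(v_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id"])
        writer.writerows([vertex] for vertex in vertices)
    with open(e_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["src", "dst"])
        writer.writerows(edges)
    return v_path, e_path

# Made-up graph written to a fresh temporary directory.
out_dir = tempfile.mkdtemp()
v_path, e_path = save_graph(["A", "B"], [("A", "B")], out_dir)
```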

Output :