ICP3 - GeoSnipes/Big-Data GitHub Wiki
Sub-Team Members
Class ID: 5-2 15, Naga Venkata Satya Pranoop Mutha
Class ID: 5-2 23, Geovanni West
This ICP introduces the concepts of linear regression, supervised and unsupervised learning, and data clustering. We take the 3D Road Network data set and fit a linear regression model to it. We then compare the training mean squared error with the test mean squared error.
Linear Regression:
Input Data:
Then we load and parse the data
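As a minimal sketch of the parsing step (not the code used for the results on this page), the snippet below assumes each input row is a comma-separated list of numbers. The helper names `parse_line` and `load_points`, the `label_index` parameter, and the sample rows are all hypothetical; which column serves as the regression label is an assumption, since it is not stated here.

```python
def parse_line(line, label_index=0):
    """Split one comma-separated row into (label, features).

    label_index is an assumption: the column used as the regression
    target is not stated in this write-up.
    """
    values = [float(v) for v in line.strip().split(",")]
    label = values[label_index]
    # Every column except the label becomes a feature.
    features = values[:label_index] + values[label_index + 1:]
    return label, features


def load_points(lines, label_index=0):
    """Parse every non-empty line, skipping blank rows."""
    return [parse_line(line, label_index) for line in lines if line.strip()]


# Made-up rows for illustration only (not taken from the real data set).
sample = ["1,10.0,57.0,20.0", "", "2,10.5,57.5,25.0"]
points = load_points(sample, label_index=3)
```

In practice `lines` would come from iterating over the opened data file instead of a hard-coded list.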
Then we build the model
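One way to illustrate the model-building step is a least-squares fit by batch gradient descent. This is a sketch under stated assumptions, not necessarily the optimizer or library used for the results here; `fit_linear_regression`, `predict`, `step_size`, and `iterations` are hypothetical names, and the toy data is generated from y = 2x + 1.

```python
def predict(weights, bias, features):
    """Linear model: bias + dot(weights, features)."""
    return bias + sum(w * x for w, x in zip(weights, features))


def fit_linear_regression(points, step_size=0.01, iterations=1000):
    """Fit weights and bias by batch gradient descent on squared error.

    points is a list of (label, features) pairs.
    """
    n = len(points)
    n_features = len(points[0][1])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(iterations):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for label, features in points:
            error = predict(weights, bias, features) - label
            grad_b += error
            for j, x in enumerate(features):
                grad_w[j] += error * x
        # Move against the averaged gradient.
        bias -= step_size * grad_b / n
        weights = [w - step_size * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


# Toy data generated from y = 2x + 1 (illustrative only).
toy = [(1.0, [0.0]), (3.0, [1.0]), (5.0, [2.0]), (7.0, [3.0])]
weights, bias = fit_linear_regression(toy, step_size=0.1, iterations=5000)
```

Features with very large magnitudes usually need to be scaled, or the step size reduced, before gradient descent converges.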
Now we evaluate the training mean squared error and the test mean squared error
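The mean squared error is the average of the squared differences between predicted and actual labels, computed once on the training set and once on the held-out test set. The sketch below assumes a random 80/20 split; the split ratio, the seed, and the names `train_test_split`, `mean_squared_error`, and `fitted_model` are hypothetical and are not taken from the original experiment.

```python
import random


def mean_squared_error(points, predict_fn):
    """Average of (prediction - label)^2 over (label, features) pairs."""
    return sum((predict_fn(f) - y) ** 2 for y, f in points) / len(points)


def train_test_split(points, train_fraction=0.8, seed=42):
    """Shuffle a copy of the data and cut it into two parts."""
    shuffled = points[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fitted_model(features):
    """Stand-in for an already fitted predictor (hypothetical)."""
    return 2.0 * features[0] + 1.0


# Toy data that the stand-in model fits exactly (illustrative only).
points = [(2.0 * x + 1.0, [float(x)]) for x in range(10)]
train, test = train_test_split(points)
train_mse = mean_squared_error(train, fitted_model)
test_mse = mean_squared_error(test, fitted_model)
```

A test error close to the training error, as in the results reported on this page, suggests the model is not overfitting, although it says nothing about whether the fit itself is good.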
Next, we save the result to a file and load it back
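A minimal sketch of the save-and-load round trip, assuming the "result" is the fitted weights and bias and that JSON is an acceptable file format. Both are assumptions; the file name `linear_model.json` and the helpers `save_model` and `load_model` are hypothetical.

```python
import json
import os
import tempfile


def save_model(path, weights, bias):
    """Write the model parameters to a JSON file."""
    with open(path, "w") as f:
        json.dump({"weights": weights, "bias": bias}, f)


def load_model(path):
    """Read the model parameters back from a JSON file."""
    with open(path) as f:
        data = json.load(f)
    return data["weights"], data["bias"]


# Round trip through a temporary directory with made-up parameters.
path = os.path.join(tempfile.mkdtemp(), "linear_model.json")
save_model(path, [2.0, -0.5], 1.0)
loaded_weights, loaded_bias = load_model(path)
```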
Results
Training Mean Squared Error = 1.95062302120918E16
Test Mean Squared Error = 1.9596847563029088E16
K-Means Clustering
Case 1: K = 3
Source Code:
Outlier Point:
- Within Set Sum of Squared Errors = 8.58246791862488E14
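The Within Set Sum of Squared Errors is the sum, over all points, of the squared distance from each point to its nearest cluster center. The following plain-Python sketch of K-means (Lloyd's algorithm with random initial centers) shows how that number can be computed. It is an illustration under stated assumptions rather than the code behind the figure above; `kmeans`, `wssse`, `closest`, and the toy points are hypothetical.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest(point, centers):
    """Index of the center nearest to point."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(point, centers[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: assign points to centers, then recompute means."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[closest(p, centers)].append(p)
        # An empty cluster keeps its previous center.
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def wssse(points, centers):
    """Within Set Sum of Squared Errors for the given centers."""
    return sum(squared_distance(p, centers[closest(p, centers)])
               for p in points)


# Toy 2D points forming three loose groups (illustrative only).
toy = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0],
       [10.0, 11.0], [20.0, 0.0], [20.0, 1.0]]
centers = kmeans(toy, k=3)
error = wssse(toy, centers)
```

Because the initial centers are random, different seeds can converge to different clusterings and therefore different error values.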
Case 2: K = 4
Source Code:
Outlier Point 1:
Outlier Point 2:
- Within Set Sum of Squared Errors = 2.031178299721398E14
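This page does not state how the outlier points were picked. One common heuristic, sketched here as an assumption, is to rank points by their distance to the nearest cluster center and treat the farthest ones as outlier candidates. The names `nearest_center_distance` and `top_outliers`, the centers, and the points are all hypothetical.

```python
import math


def nearest_center_distance(point, centers):
    """Euclidean distance from point to its closest center."""
    return min(math.dist(point, c) for c in centers)


def top_outliers(points, centers, count=1):
    """Return the `count` points farthest from any cluster center."""
    ranked = sorted(points,
                    key=lambda p: nearest_center_distance(p, centers),
                    reverse=True)
    return ranked[:count]


# Made-up centers and points for illustration only.
centers = [[0.0, 0.0], [10.0, 10.0]]
points = [[0.0, 1.0], [1.0, 0.0], [10.0, 9.0], [30.0, 30.0], [11.0, 10.0]]
outliers = top_outliers(points, centers, count=1)
```

Setting `count=2` would return the two farthest points, matching a case where two outlier points are reported.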