Benchmarks - DSE-capstone-sharknado/main GitHub Wiki
BPR Speed Tests
Goal: Test the BPR code to see where it scales best. The results are intended to inform our decision on which technology to use to scale our analysis.
Code Tested (Local): https://github.com/DSE-capstone-sharknado/bpr/blob/master/sampling_test.py
Commit: a7f0dbc
Test Date: 2017-02-20
Spark Notebook: https://dbc-f6057a15-2f8d.cloud.databricks.com/#notebook/102637
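The results below come in two formats: the real/user/sys lines look like output from the shell `time` command, and "1 loops, best of 3" looks like IPython `%timeit` output. As a rough sketch of how comparable numbers could be collected from plain Python, the snippet below takes a best-of-3 wall time and a wall vs. CPU time pair. `workload` is a hypothetical stand-in, not the actual sampling loop in sampling_test.py.

```python
import time
import timeit


def workload():
    # Hypothetical stand-in for the code under test (not the real BPR sampler).
    total = 0
    for i in range(100_000):
        total += i * i
    return total


def best_of(func, repeat=3):
    """Run func `repeat` times, one loop each, and return the fastest wall time in seconds."""
    return min(timeit.repeat(func, number=1, repeat=repeat))


def wall_and_cpu(func):
    """Return (wall seconds, CPU seconds) for one call.

    Wall time roughly corresponds to `real`; process CPU time roughly
    corresponds to `user` + `sys` for this process.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


if __name__ == "__main__":
    print("best of 3: %.4fs" % best_of(workload))
    print("wall %.4fs, cpu %.4fs" % wall_and_cpu(workload))
```

Taking the minimum of several runs filters out noise from other processes, while a single wall-clock run includes it, so the two kinds of numbers are not strictly comparable.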
Results:
Alex's MacBook Pro:
13 min
Julus' MacBook Pro:
real 15m54.309s
user 15m45.995s
sys 0m4.847s
2 Node Spark Cluster:
1 loops, best of 3: 14min 33s per loop
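Since the three results are reported in different formats, a small sketch like the one below can put them on one scale for comparison. The parsing helper `to_seconds` and the labels are illustrative assumptions; the durations are the ones listed above, using the `real` time for the second laptop.

```python
import re


def to_seconds(text):
    """Convert strings such as '13 Min', '15m54.309s', or '14min 33s' to seconds."""
    minutes = re.search(r"(\d+(?:\.\d+)?)\s*m(?:in)?", text, re.IGNORECASE)
    seconds = re.search(r"(\d+(?:\.\d+)?)\s*s\b", text, re.IGNORECASE)
    total = 0.0
    if minutes:
        total += float(minutes.group(1)) * 60
    if seconds:
        total += float(seconds.group(1))
    return total


# Durations copied from the results above; labels are for display only.
results = {
    "Alex's MBP": "13 Min",
    "Julus' Macbook Pro (real)": "15m54.309s",
    "2 Node Spark Cluster (best of 3)": "14min 33s",
}

timings = {name: to_seconds(raw) for name, raw in results.items()}
fastest = min(timings.values())

if __name__ == "__main__":
    for name, secs in sorted(timings.items(), key=lambda kv: kv[1]):
        print("%-35s %8.1f s  (%.2fx fastest)" % (name, secs, secs / fastest))
```

This works out to about 780 s, 873 s, and 954 s. The runs were measured in different ways (a rounded figure, a single wall-clock run, and a best of 3), so gaps of a minute or two should be read with caution.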
Cluster Setup
On Demand Driver (15 GB Memory, 8 Cores, 1 DBU) x2
Workers (30 GB Memory, 16 Cores, 2 DBU)
A Databricks Unit ("DBU") is a unit of Apache Spark processing capability per hour.