Apache Spark - AshokBhat/ml GitHub Wiki

About

Distributed general-purpose computing framework
Addresses the limitations of Hadoop MapReduce. Spark reads data into memory, performs the necessary operations, and writes the results back. This keeps processing fast, especially for iterative workloads, whereas MapReduce reads from and writes to disk on every iteration.
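The difference in disk traffic can be illustrated with a toy model in plain Python. This is only a sketch and does not use Spark or Hadoop APIs: the function names (`step`, `disk_per_iteration`, `in_memory`, `run`) and the JSON file standing in for distributed storage are assumptions made for illustration.

```python
import json
import os
import tempfile


def step(values):
    """One iteration of a toy job: halve every value."""
    return [v / 2 for v in values]


def disk_per_iteration(path, iterations):
    """MapReduce-like pattern: every iteration reads its input from disk
    and writes its output back to disk. Returns the number of disk operations."""
    disk_ops = 0
    for _ in range(iterations):
        with open(path) as f:
            values = json.load(f)
        disk_ops += 1
        values = step(values)
        with open(path, "w") as f:
            json.dump(values, f)
        disk_ops += 1
    return disk_ops


def in_memory(path, iterations):
    """Spark-like pattern: read once, keep the working set in memory across
    iterations, write once. Returns the number of disk operations."""
    with open(path) as f:
        values = json.load(f)
    disk_ops = 1
    for _ in range(iterations):
        values = step(values)
    with open(path, "w") as f:
        json.dump(values, f)
    disk_ops += 1
    return disk_ops


def run(strategy, iterations=5):
    """Run a strategy on a fresh temporary file; return (result, disk_ops)."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "data.json")
        with open(path, "w") as f:
            json.dump([8.0, 16.0, 32.0], f)  # hypothetical input data
        disk_ops = strategy(path, iterations)
        with open(path) as f:
            return json.load(f), disk_ops


if __name__ == "__main__":
    print(run(disk_per_iteration))
    print(run(in_memory))
```

Both strategies compute the same result, but in this model the per-iteration version touches the disk twice per iteration while the in-memory version touches it twice in total. The gap therefore grows with the number of iterations, which is why iterative workloads such as ML training benefit most.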

Architecture

ML on Spark

Machine learning frameworks on Spark: Apache Spark’s MLlib, H2O.ai’s Sparkling Water, ...
Deep learning frameworks on Spark: CERN’s Distributed Keras, Intel’s BigDL, Yahoo’s TensorFlowOnSpark, ...

See also

MLlib