2017_04_07_ _Using_Apache_Spark - VCityTeam/UD-SV GitHub Wiki

Spark SQL

  • Spark SQL is one of the modules constituting Apache Spark
  • Spark can "unify" heterogeneous data sources (heterogeneous in both technology and structure) by "projecting" their respective information models (Datasets) onto a single tabular representation (DataFrame). In doing so, Spark's objective was not so much to abstract information models as to allow for optimized usage (especially fast reads, while writes remain quite slow). Note that this "data framing" process flattens out any object (set of attributes) and entity/relationship data structures that may exist in the underlying database...
  • The DataFrame API is available in Scala, Java and Python (although the Python interface is less central and still quite unstable).
  • Spark is accessed locally on the host (through a programmatic API, as opposed to an HTTP-based REST interface).

Using Java to write the computational model

When using a relational database (as is the case for 3DCityDB, which uses PostgreSQL), data bindings can be realized through the Java Persistence API (JPA).
