2017_04_07_ _Using_Apache_Spark - VCityTeam/UD-SV GitHub Wiki

Spark SQL

  • Spark SQL is one of the modules constituting Apache Spark
  • Spark can "unify" heterogeneous data sources (heterogeneous in both technology and structure) by "projecting" their respective information models (Datasets) onto a single tabular representation (DataFrame). In doing so, Spark's objective was not so much to abstract information models as to allow for optimized usage (especially fast reads, while writes remain quite slow). Note that this "data framing" process flattens out any object (set of attributes) and entity/relationship data structures that may exist in the underlying database...
  • The DataFrame API is available in Scala, Java and Python (although the Python interface is less central and still quite unstable).
  • Spark is accessed locally on the host (through a programmatic API, as opposed to an HTTP-based REST interface).

Using Java to write the computational model

When using a relational database (as is the case for 3DCityDB, which uses PostgreSQL), data bindings can be realized through the Java Persistence API (JPA).
