Architectural thoughts initial - OpenData-tu/documentation GitHub Wiki

Version 0.2

Version Date Modified by Summary of changes
0.1 2017-05-05 Paul Wille Initial version
0.2 2017-05-19 Paul Wille ...

Use case / system that we want to implement

We agreed to focus on Berlin as our main target for gathering data. This includes collecting data sets that cover Germany, Europe, or the world, provided they are granular enough to contain Berlin as a data point.
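As a rough illustration of "granular enough to contain Berlin", the sketch below filters a wider data set down to points that fall inside an approximate bounding box around Berlin. The `DataPoint` shape, the function names, and the bounding-box values are assumptions made for this example; a real importer would more likely use an authoritative city boundary.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Approximate bounding box around Berlin in WGS84 degrees.
# These values are an assumption for illustration, rounded outward.
BERLIN_BBOX = (52.33, 13.08, 52.68, 13.77)  # (south, west, north, east)


@dataclass(frozen=True)
class DataPoint:
    """Hypothetical record shape: one measurement at one location."""
    lat: float
    lon: float
    value: float


def in_berlin(point: DataPoint) -> bool:
    """Return True if the point lies inside the approximate Berlin box."""
    south, west, north, east = BERLIN_BBOX
    return south <= point.lat <= north and west <= point.lon <= east


def berlin_subset(points: Iterable[DataPoint]) -> List[DataPoint]:
    """Keep only the points of a larger data set that fall within Berlin."""
    return [p for p in points if in_berlin(p)]
```

A Germany-wide source would then be usable if `berlin_subset` returns at least one point for it.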

Tools/Process

DevOps

Deployment

Architecture Components

The app can conceptually be divided into two main parts/functionalities.

  1. Data Collection / ETL / Importing
  2. Querying the data / analysing the data / building a crazy, sexy use case that uses our collected data
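The split into the two parts listed above could look roughly like the following sketch: an import side that extracts, transforms, and loads records, and a query side that only reads from the store. Every name here (`InMemoryStore`, `run_import`, the example rows) is hypothetical, and the in-memory list merely stands in for whatever database is eventually chosen.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]


class InMemoryStore:
    """Stand-in for the real database, shared by importer and query side."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def load(self, records: Iterable[Record]) -> int:
        """Part 1 writes here; returns the number of records stored."""
        batch = list(records)
        self._records.extend(batch)
        return len(batch)

    def query(self, predicate: Callable[[Record], bool]) -> List[Record]:
        """Part 2 reads here; returns all records matching the predicate."""
        return [r for r in self._records if predicate(r)]


def run_import(
    extract: Callable[[], Iterable[Record]],
    transform: Callable[[Record], Record],
    store: InMemoryStore,
) -> int:
    """Pull raw rows, normalise each one, and write them to the store."""
    return store.load(transform(row) for row in extract())


def extract_example() -> Iterator[Record]:
    # Hypothetical raw rows as a source might deliver them.
    yield {"station": "A", "temp_c": "21.5"}
    yield {"station": "B", "temp_c": "19.0"}


def transform_example(row: Record) -> Record:
    # Map the source-specific row onto a common (assumed) schema.
    return {
        "source": row["station"],
        "measurement": "temperature",
        "value": float(str(row["temp_c"])),
    }
```

Because the query side only depends on `query`, it can be scaled or replaced without touching the importers.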

Scalability

Concerning scalability, #2 should scale practically without limit, since future use cases could bring a nearly unbounded number of users to this part, while #1 does not need maximum speed or unlimited scalability. The import process should still be fast and scalable, but the service does not rely on it running infinitely fast.

So making the database and the API for accessing the data as highly available as possible should be one requirement for our service.

Frontend

Importing / ETL

Oliver, Andres, Paul

Databases

Probably goes hand in hand with deployment, as scalability and availability mostly affect the data-storing and data-accessing part.

API

The frontend

Deployment

Nico, Amar

This means:

  • Design the system so that it scales
  • Make sure it can be deployed in a way that scales (indefinitely) but also runs on a private cloud
  • Having a look at the whole toolchain
  • Having a look at the workchain / deploy chain / DevOps

To have a service that can run in both public and private clouds and can be transferred as easily as possible, we want to dockerize every service that we implement, which makes deploying it on different systems easy.
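One common way to keep a containerized service portable between private and public clouds is to read all environment-specific settings from environment variables, so the same image runs unchanged everywhere. The sketch below assumes two hypothetical settings, `DATABASE_URL` and `LISTEN_PORT`, with placeholder defaults; none of these names come from the project itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ServiceConfig:
    """Hypothetical settings a containerized service might need."""
    database_url: str
    listen_port: int


def load_config(env: Mapping[str, str] = os.environ) -> ServiceConfig:
    """Build the config from environment variables, with local defaults.

    The variable names and default values are assumptions for this sketch.
    """
    return ServiceConfig(
        database_url=env.get("DATABASE_URL", "http://localhost:9200"),
        listen_port=int(env.get("LISTEN_PORT", "8080")),
    )
```

The container runtime or orchestrator would then inject different values per environment, while the image itself stays the same.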

First Ideas

Using Elasticsearch, deployed with Kubernetes and Docker
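To give an idea of what importing into Elasticsearch might involve, the sketch below builds the newline-delimited JSON body that the `_bulk` endpoint expects: one action line followed by one document line per record, each terminated by a newline. The index name and document fields are made up for illustration, no request is actually sent, and older Elasticsearch versions additionally expect a `_type` in the action line.

```python
import json
from typing import Dict, Iterable


def build_bulk_body(index_name: str, docs: Iterable[Dict[str, object]]) -> str:
    """Return an NDJSON body that indexes each document into index_name."""
    lines = []
    for doc in docs:
        # Action line: tells Elasticsearch to index the document that follows.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)
```

An importer could send this string as the request body of a POST to the cluster's `_bulk` endpoint with the content type `application/x-ndjson`.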

We agree that we should develop for the private cloud first.

Moving a privately deployed system and its components to a cloud service later will probably be much easier.