Architectural thoughts initial - OpenData-tu/documentation GitHub Wiki
Version 0.2
Version | Date | Modified by | Summary of changes |
---|---|---|---|
0.1 | 2017-05-05 | Paul Wille | Initial version |
0.2 | 2017-05-19 | Paul Wille | ... |
Use case / system that we want to implement
We agreed to focus on Berlin as the main target for which to gather data. This includes collecting data sets that cover Germany, Europe, or the world, provided they are granular enough to contain Berlin as a data point.
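As a rough illustration of the "granular enough to contain Berlin" criterion, the sketch below keeps only data points that fall inside a bounding box around Berlin. The `DataPoint` type, the field names, and the bounding-box values are assumptions made for this example (the coordinates are approximate and rounded outward); a real importer would likely use proper city boundaries.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Approximate bounding box around Berlin (assumed, rounded outward):
# (min_lat, min_lon, max_lat, max_lon)
BERLIN_BBOX: Tuple[float, float, float, float] = (52.33, 13.08, 52.68, 13.77)


@dataclass(frozen=True)
class DataPoint:
    """Hypothetical shape of one imported measurement."""
    source: str
    lat: float
    lon: float
    value: float


def in_berlin(point: DataPoint, bbox=BERLIN_BBOX) -> bool:
    """Return True if the point lies inside the bounding box."""
    min_lat, min_lon, max_lat, max_lon = bbox
    return min_lat <= point.lat <= max_lat and min_lon <= point.lon <= max_lon


def keep_berlin(points: Iterable[DataPoint]) -> List[DataPoint]:
    """Filter a wider data set (Germany, Europe, world) down to Berlin."""
    return [p for p in points if in_berlin(p)]
```

A data set whose points never pass this filter would not be granular enough for our purposes.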
Tools/Process
DevOps
Deployment
Architecture Components
The app can be conceptually divided into two main parts:
- Data Collection / ETL / Importing
- Querying the data / analysing the data / building a compelling use case on top of our collected data
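The two parts can be sketched as independent pieces that only share a data store. This is a minimal sketch, not a proposed design: the function names, the record fields (`sensor`, `value`), and the in-memory list standing in for the database are all assumptions for illustration.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


def run_import(
    extract: Callable[[], Iterable[Record]],
    transform: Callable[[Record], Record],
    store: List[Record],
) -> int:
    """Part 1: pull raw records, normalise them, write them to the store."""
    count = 0
    for raw in extract():
        store.append(transform(raw))
        count += 1
    return count


def query(store: List[Record], sensor: str) -> List[Record]:
    """Part 2: read-only access to the collected data."""
    return [r for r in store if r["sensor"] == sensor]


# Hypothetical source and transformation, for illustration only.
def example_extract() -> Iterable[Record]:
    return [{"s": "temp", "v": "21.5"}, {"s": "no2", "v": "40"}]


def example_transform(raw: Record) -> Record:
    return {"sensor": raw["s"], "value": float(raw["v"])}
```

Because the query side never calls the importer, each part could be deployed and scaled on its own.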
Scalability
Concerning scalability, the querying part (the second part above) should scale as far as possible, since future use cases could bring a nearly unbounded number of users. The data collection part (the first part) does not need maximum speed or unlimited scalability. The import process should still be fast and scalable, but the service does not depend on it running arbitrarily fast.
So making the database and the API for accessing the data as highly available as possible should be one requirement for our service.
Frontend
Importing / ETL
Oliver, Andres, Paul
Databases
Probably goes hand in hand with deployment, as scalability and availability mostly affect the data-storing and data-accessing parts.
API
The frontend
Deployment
Nico, Amar
This means:
- Design the system so that it scales
- Make sure it can be deployed in a way that scales (indefinitely) but also runs on a private cloud
- Have a look at the whole toolchain
- Have a look at the workflow / deployment chain / DevOps
To have a service that can run in both public and private clouds and is as easy to transfer as possible, we want to dockerize every service we implement so that deploying it on different systems is easy.
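One common way to keep a containerized service portable between environments is to read all environment-specific settings from environment variables instead of hard-coding them. The sketch below shows the idea; the variable names (`DB_HOST`, `DB_PORT`, `API_PORT`) and their defaults are assumptions for this example, not decisions we have made.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServiceConfig:
    """Hypothetical settings a service might need at startup."""
    db_host: str
    db_port: int
    api_port: int


def load_config(env: Optional[Mapping[str, str]] = None) -> ServiceConfig:
    """Build the config from environment variables, with local defaults."""
    env = os.environ if env is None else env
    return ServiceConfig(
        db_host=env.get("DB_HOST", "localhost"),
        db_port=int(env.get("DB_PORT", "9200")),
        api_port=int(env.get("API_PORT", "8080")),
    )
```

The same image could then be started on a private or a public cloud, with only the injected variables differing.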
First Ideas
Using Elasticsearch, deployed with Kubernetes and Docker
We agree that we should target the private cloud first.
Moving a privately deployed system and its components to a public cloud service will probably be much easier than the reverse.
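If we go with Elasticsearch, the importer would most likely write through its bulk API, which accepts newline-delimited JSON: one action line followed by one document line per record, with a trailing newline. The sketch below only builds such a payload and does not send it; the index name `measurements` and the document fields are assumptions for illustration.

```python
import json
from typing import Dict, Iterable, Tuple


def to_bulk_payload(index: str, docs: Iterable[Tuple[str, Dict[str, object]]]) -> str:
    """Build an NDJSON body for the Elasticsearch bulk API.

    Each (doc_id, doc) pair becomes an "index" action line followed by
    the document itself. The body must end with a newline.
    """
    lines = []
    for doc_id, doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    return "\n".join(lines) + "\n"


# Hypothetical usage with one measurement.
payload = to_bulk_payload(
    "measurements",
    [("1", {"sensor": "temp", "value": 21.5, "lat": 52.52, "lon": 13.405})],
)
```

The resulting string would be sent to the `_bulk` endpoint with the content type `application/x-ndjson`.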