Wiki - delmarrerikaine/Scala-Course-Project GitHub Wiki

How to test the solution

sbt docker
docker-compose up
./create_topics

Please note that if streaming-app exits before the topics are created, restart the stack:

docker-compose down
docker-compose up

Design

Overview

The base functionality was implemented here along with the following additional parts:

  • Solar Plant Generator is created based on actors
  • Records are enriched with weather per location of a solar plant
  • Scaling on a staging environment

Solar Plant Generator

We model the solar plant generator as an Akka actor hierarchy. It has the following actors:

  • Sensor
  • Solar Panel
  • Solar Plant
  • Solar Plant Manager

The Solar Plant Manager has multiple Solar Plants as children, each of which has multiple Solar Panels as children, and so on. Retrieving data and transmitting it to Kafka is mainly the responsibility of the Solar Plant Manager. At a specified interval, it sends a data request down the tree to the leaves, which are our sensors.

The sensors then fill in all the relevant data and pass it back up the tree. At each actor level, the data record is enriched with additional relevant information until it reaches the root Solar Plant Manager actor, which pushes it to the Kafka topic via a dedicated DataWriter.

This approach gives clear encapsulation of details at each level: from the outside, we interact with a single Solar Plant Manager actor and configure the number of plants, sensors, etc. via global configs.
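The request-down, enrich-up flow described above can be illustrated with a minimal plain-Scala sketch. It is not the project's Akka code: direct method calls stand in for actor messages, and every name here (`Record`, `collect`, `tick`, `build`, the id format) is a hypothetical chosen for illustration.

```scala
// Plain-Scala sketch of the request-down / enrich-up flow.
// Direct method calls stand in for the Akka messages used by the project.
// All names and fields are hypothetical.
object PlantTreeSketch {
  // A record gains one field per level on its way up the tree.
  final case class Record(
      value: Double,
      sensorId: String,
      panelId: Option[String] = None,
      plantId: Option[String] = None
  )

  // Leaf: produces the raw measurement.
  final case class Sensor(id: String, read: () => Double) {
    def collect(): Record = Record(read(), id)
  }

  // Asks every child sensor, then tags each record with the panel id.
  final case class Panel(id: String, sensors: Seq[Sensor]) {
    def collect(): Seq[Record] =
      sensors.map(_.collect().copy(panelId = Some(id)))
  }

  // Asks every child panel, then tags each record with the plant id.
  final case class Plant(id: String, panels: Seq[Panel]) {
    def collect(): Seq[Record] =
      panels.flatMap(_.collect()).map(_.copy(plantId = Some(id)))
  }

  // Root and only entry point; `write` stands in for the DataWriter.
  final case class Manager(plants: Seq[Plant], write: Record => Unit) {
    def tick(): Unit = plants.flatMap(_.collect()).foreach(write)
  }

  // Builds a tree of the given shape with constant sensor readings.
  def build(plants: Int, panels: Int, sensors: Int, write: Record => Unit): Manager =
    Manager(
      (1 to plants).map { p =>
        Plant(s"plant-$p", (1 to panels).map { n =>
          Panel(s"plant-$p/panel-$n", (1 to sensors).map { s =>
            Sensor(s"plant-$p/panel-$n/sensor-$s", () => 1.0)
          })
        })
      },
      write
    )

  def main(args: Array[String]): Unit = {
    val out = scala.collection.mutable.ArrayBuffer.empty[Record]
    build(2, 3, 4, r => { out += r; () }).tick()
    println(s"records written in one tick: ${out.size}")
  }
}
```

In the sketch, as in the design above, the caller only ever touches the root: the tree shape is passed in once, and each `tick()` yields one fully enriched record per sensor.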

Weather Provider

Weather Provider is a service that makes scheduled requests to an external weather API based on location information. It requests the temperature, the humidity, and the timestamp at which the measurements were made. Once retrieved, the data is pushed to a separate topic so that it can be merged with other topics. The data is in JSON format, which is uniquely identifiable and easy to interpret. The external API used is OpenWeatherMap, perhaps the most popular one. It provides easy access to the current weather at any location, along with much additional information.
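As a rough illustration of what one weather message might look like, the sketch below models a record with the three values mentioned above and renders it as JSON, keyed by location. The field names, the hand-rolled serializer, and the choice of location as the message key are assumptions for this example; the project may use a JSON library and a different schema.

```scala
// Hypothetical shape of one weather message; field names are assumptions.
object WeatherRecordSketch {
  final case class Weather(
      location: String,
      temperature: Double,
      humidity: Int,
      timestamp: Long
  ) {
    // Hand-rolled JSON to keep the sketch dependency-free.
    // It does not escape special characters in `location`.
    def toJson: String =
      s"""{"location":"$location","temperature":$temperature,"humidity":$humidity,"timestamp":$timestamp}"""
  }

  // Key by location so a consumer can look up the latest weather per location.
  def toKeyValue(w: Weather): (String, String) = (w.location, w.toJson)

  def main(args: Array[String]): Unit = {
    val (key, value) = toKeyValue(Weather("Lviv", 21.5, 60, 1000L))
    println(s"$key -> $value")
  }
}
```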

Streaming Application

The streaming application produces enriched records based on streams from the weather and sensor topics. As mentioned above, sensor records are a continuous stream of measurements taken from solar plants at a specified time interval. Weather data, in contrast, is a stream of the current weather conditions at each geographical location, that is, a changelog of the current weather at a particular location.

We therefore read the weather stream as a KTable, so that the latest conditions for each location are always available. When a new measurement arrives on the sensor KStream, we join it with the current weather conditions at that location and write the resulting record to the enriched output topic.
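The stream-table join semantics can be illustrated without Kafka. The sketch below is a plain-Scala simulation, not Kafka Streams code: an immutable `Map` plays the role of the KTable, weather updates overwrite the entry for their location, and each sensor reading is joined with whatever weather is current at that moment. It assumes inner-join behaviour (a reading with no known weather is dropped); all type and field names are hypothetical.

```scala
// Plain-Scala simulation of a stream-table join; no Kafka involved.
// All types and field names are hypothetical.
object StreamTableJoinSketch {
  final case class Weather(temperature: Double, humidity: Int)
  final case class Measurement(location: String, sensorId: String, value: Double)
  final case class Enriched(measurement: Measurement, weather: Weather)

  sealed trait Event
  final case class WeatherUpdate(location: String, weather: Weather) extends Event
  final case class SensorReading(measurement: Measurement) extends Event

  // Replays events in arrival order and returns the enriched output.
  def run(events: Seq[Event]): Vector[Enriched] =
    events.foldLeft((Map.empty[String, Weather], Vector.empty[Enriched])) {
      // Changelog semantics: a newer update replaces the old value for that key.
      case ((table, out), WeatherUpdate(loc, w)) =>
        (table.updated(loc, w), out)
      // Join each reading with the weather known at arrival time.
      // Assumed inner join: readings without a match are dropped.
      case ((table, out), SensorReading(m)) =>
        (table, out ++ table.get(m.location).map(w => Enriched(m, w)))
    }._2

  def main(args: Array[String]): Unit = {
    val m = Measurement("Lviv", "sensor-1", 42.0)
    val result = run(Seq(
      SensorReading(m),                        // no weather yet, dropped
      WeatherUpdate("Lviv", Weather(20.0, 55)),
      SensorReading(m),                        // joined with the first update
      WeatherUpdate("Lviv", Weather(23.0, 50)),
      SensorReading(m)                         // joined with the second update
    ))
    result.foreach(println)
  }
}
```

The point of the table side is visible in `main`: the same sensor produces different enriched records as the weather for its location changes, while no history of older weather values needs to be kept.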

Limitations

The application complies with the requirements (50 solar panels with 4 sensors each on 20 plants, sending data every second). During performance testing, streaming-app was scaled 3 times. The solution was tested on GCP with an n1-standard-8 VM (8 vCPU and 30 GB RAM). Cost estimate: $204 for Compute Engine + network.
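For a sense of scale, the required load can be converted into a rough record rate. The calculation below rests on two assumptions that the requirement does not spell out: that each of the 20 plants has 50 panels, and that every sensor emits exactly one record per second.

```scala
// Back-of-the-envelope load estimate under the assumptions stated above.
object LoadEstimate {
  val plants = 20
  val panelsPerPlant = 50 // assumption: 50 panels on each plant
  val sensorsPerPanel = 4

  val sensors: Int = plants * panelsPerPlant * sensorsPerPanel

  // Assumption: one record per sensor per second.
  val recordsPerSecond: Int = sensors
  val recordsPerDay: Long = recordsPerSecond.toLong * 60 * 60 * 24

  def main(args: Array[String]): Unit = {
    println(s"sensors:            $sensors")
    println(s"records per second: $recordsPerSecond")
    println(s"records per day:    $recordsPerDay")
  }
}
```

Under these assumptions the generator produces 4,000 sensor records per second, or about 345.6 million per day, all of which pass through the streaming-app join.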