Wiki - delmarrerikaine/Scala-Course-Project GitHub Wiki
How to test solution
sbt docker
docker-compose run
./create_topics
Note: if streaming-app exits before the topics have been created, restart the stack:
docker-compose down
docker-compose up
Design
Overview
The base functionality was implemented here along with the following additional parts:
- Solar Plant Generator is created based on actors
- Records are enriched with weather per location of a solar plant
- Scaling on a staging environment
Solar Plant Generator
The solar plant generator is modeled as an Akka actor hierarchy. It has the following actors:
- Sensor
- Solar Panel
- Solar Plant
- Solar Plant Manager
The Solar Plant Manager has multiple Solar Plants as children, each Solar Plant has multiple Solar Panels as children, and so on down to the Sensors. Retrieving data and sending it to Kafka is mainly the responsibility of the Solar Plant Manager. Each time the configured timeout elapses, it sends a data request down the tree to the leaves, which are the sensors.
The sensors then fill in the relevant data and pass it back up the tree. At each actor level the data record is enriched with additional information until it reaches the root Solar Plant Manager actor, which pushes it to the Kafka topic via a dedicated DataWriter.
This approach gives clear encapsulation of details at each level: from the outside we interact with a single Solar Plant Manager actor, and the number of plants, sensors, etc. is configured via global configs.
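The request-down, enrich-up flow described above can be sketched in plain Scala without any Akka dependency. This is only an illustration of the idea, not the project's actor code: the object name `SolarPlantSketch`, the `Record` fields, the dummy sensor value and the `buildManager` helper are all assumptions made for this example, and direct method calls stand in for actor messages.

```scala
// Plain-Scala sketch of the hierarchy; method calls replace actor messages.
// All names and fields here are hypothetical, not taken from the project.
object SolarPlantSketch {
  final case class Record(
      sensorId: Int,
      value: Double,
      panelId: Option[Int] = None,
      plantId: Option[Int] = None
  )

  final class Sensor(id: Int) {
    // Leaf: fills in the raw measurement (a fixed dummy value here).
    def collect(): Record = Record(sensorId = id, value = 1.0)
  }

  final class SolarPanel(id: Int, sensors: Seq[Sensor]) {
    // Forwards the request to its sensors and tags each record with the panel id.
    def collect(): Seq[Record] = sensors.map(_.collect().copy(panelId = Some(id)))
  }

  final class SolarPlant(id: Int, panels: Seq[SolarPanel]) {
    // Forwards the request to its panels and tags each record with the plant id.
    def collect(): Seq[Record] =
      panels.flatMap(_.collect()).map(_.copy(plantId = Some(id)))
  }

  final class SolarPlantManager(plants: Seq[SolarPlant], write: Record => Unit) {
    // Root: runs one collection round and hands every record to the writer.
    def tick(): Unit = plants.flatMap(_.collect()).foreach(write)
  }

  // Builds the whole tree from three counts, mimicking the global configs.
  def buildManager(plants: Int, panelsPerPlant: Int, sensorsPerPanel: Int)(
      write: Record => Unit
  ): SolarPlantManager = {
    val plantSeq = (1 to plants).map { p =>
      val panels = (1 to panelsPerPlant).map { n =>
        new SolarPanel(n, (1 to sensorsPerPanel).map(new Sensor(_)))
      }
      new SolarPlant(p, panels)
    }
    new SolarPlantManager(plantSeq, write)
  }
}
```

Calling `SolarPlantSketch.buildManager(20, 50, 4)(println).tick()` would print one fully tagged record per sensor. In an actor-based version each `collect()` call would presumably become a request message and each return value a reply, with the `write` function played by the DataWriter.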
Weather Provider
Weather Provider is a service that makes scheduled requests to an external weather API based on location information. It requests the temperature, the humidity and the timestamp at which the measurements were taken. Once retrieved, the data is pushed to a separate topic so that it can be merged with other topics. The data is in JSON format: it is uniquely identifiable and easy to interpret. The external API used is OpenWeatherMap, perhaps the most popular one. It provides easy access to the current weather at any location, along with a lot of additional information.
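As a rough illustration of what such a JSON weather record might look like, the sketch below serializes the three values mentioned above plus a location key. The field names, the `WeatherRecordSketch` object and the sample values are assumptions for this example; the actual schema used by the project is not shown here.

```scala
// Hypothetical shape of a weather record; field names are assumptions.
object WeatherRecordSketch {
  final case class WeatherRecord(
      location: String,   // key used to match the record with a solar plant
      temperature: Double,
      humidity: Int,
      timestamp: Long     // time at which the measurement was taken
  )

  // Hand-rolled serialization to keep the sketch dependency-free.
  // It does not escape special characters in `location`; a real
  // implementation would normally use a JSON library instead.
  def toJson(w: WeatherRecord): String =
    s"""{"location":"${w.location}","temperature":${w.temperature},"humidity":${w.humidity},"timestamp":${w.timestamp}}"""
}
```

Keying the record by location is what lets the streaming side look up the latest weather for a given plant.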
Streaming Application
The streaming application produces enriched records from the weather and sensor topics. As mentioned above, sensor records form a continuous stream of measurements taken from the solar plants at a specified time interval. Weather data, in contrast, is a stream of the actual weather conditions at each geographical location, that is, a changelog of the current weather at a particular location.
We therefore read the weather stream as a KTable, so that it always holds the latest weather for each location. When a new measurement arrives on the sensor KStream, we join it with the current weather conditions at that location and write the resulting record to the enriched output topic.
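The stream-table join semantics described above can be simulated in plain Scala. Note that this does not use the Kafka Streams API: an immutable `Map` stands in for the KTable and a sequence of events for the two topics. All type names are hypothetical, and treating the join as an inner join (readings without known weather are dropped) is an assumption.

```scala
// Simulation of "latest weather per location" joined with sensor readings.
// Not Kafka Streams code; all names are hypothetical.
object EnrichmentSketch {
  final case class Weather(temperature: Double, humidity: Int)
  final case class Measurement(location: String, value: Double)
  final case class Enriched(location: String, value: Double, weather: Weather)

  sealed trait Event
  final case class WeatherUpdate(location: String, weather: Weather) extends Event
  final case class SensorReading(measurement: Measurement) extends Event

  // Processes events in arrival order and returns the enriched records.
  def enrich(events: Seq[Event]): Seq[Enriched] =
    events
      .foldLeft((Map.empty[String, Weather], Vector.empty[Enriched])) {
        // A weather update overwrites the previous value for that location,
        // which is the changelog behaviour of a table.
        case ((table, out), WeatherUpdate(loc, w)) =>
          (table.updated(loc, w), out)
        // A reading is joined with whatever weather is current at that moment;
        // if none is known yet, the reading is dropped (inner-join assumption).
        case ((table, out), SensorReading(m)) =>
          val next = table
            .get(m.location)
            .fold(out)(w => out :+ Enriched(m.location, m.value, w))
          (table, next)
      }
      ._2
}
```

The point of the sketch is that each reading sees only the weather value that was current when it arrived, so later weather updates do not change records that were already emitted.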
Limitations
The application meets the requirements (50 solar panels with 4 sensors each on 20 plants, sending data every second). During performance testing, streaming-app was scaled 3 times. The solution was tested on GCP on an n1-standard-8 VM (8 vCPU and 30 GB RAM). Cost estimate: $204 for Compute Engine + network.
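For a sense of the load these requirements imply, here is a back-of-the-envelope calculation. It assumes that each of the 20 plants has 50 panels and that every sensor reading becomes its own record; neither is stated explicitly above, and the object name is hypothetical.

```scala
// Rough load estimate under the assumptions stated in the lead-in.
object LoadEstimate {
  val plants = 20
  val panelsPerPlant = 50   // assumption: 50 panels on each plant
  val sensorsPerPanel = 4
  val sendsPerSecond = 1    // each sensor sends once per second

  // Total sensor records entering Kafka per second.
  val recordsPerSecond: Int =
    plants * panelsPerPlant * sensorsPerPanel * sendsPerSecond
}
```

Under these assumptions the generator emits 4000 sensor records per second.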