Micro Service Monitoring by IBM - UMKCNSF/UMKC--HACKATHON GitHub Wiki

Use case id: 3-IB-Microservice

Title: Micro Service Monitoring by IBM

Attendees: Paul-John, Mayank Raj, Srini Bhagavan

Presenter: Paul-John from IBM

Problem statement:

Cloud services are embracing the concept of microservices, which Wikipedia defines as "a variant of the service-oriented architecture (SOA) architectural style that structures an application as a collection of loosely coupled services." Many factors contribute to the success of a microservices architecture, and one of them is high availability. A paid service that hopes to retain its customers should maintain high availability numbers, and when downtime occurs, DevOps engineers need to be alerted so that service can be restored. Imagine the impact on customers every minute that Facebook is unavailable!

Use case

1. You are to design and implement a basic monitoring and notification system for a cluster of Docker containers. Use Docker Compose to build a cluster with the following containers:

  • nginx

  • apache-httpd server

  • postgres or mysql

The monitoring code can reside either in its own container or outside the cluster.

Requirements for monitoring application:

 a) 20% Poll the status of each Docker container every 5 minutes. Store each status in a time series database.

 b) 20% Maintain a history of the uptime data for at least 5 hours.

 c) 20% Create a watcher application that queries the most recent value in your time series database for each container. Whenever one is reported down, "page" (send an email to) the on-duty engineer (you provide the email address) with a notification that "application x went down at 17:59:59 on November 5, 2017. Please investigate". For this use case, hard-code the on-duty engineer's contact info.

d) 20% Your "alert" in (c) should include a link to a web service that will attempt to restart the Docker container that went down. This gives someone who is away from their computer a chance to self-correct the issue. If this "quick fix" does not work for some reason, the on-duty engineer would then have to fix the problem directly.

e) 10% A demo of your application. Show the collected data. Bring down a container and show that your monitoring application detects and reports the situation correctly.

f) 10% Documentation of your application. Document your workflow, design, and all external libraries or packages you used.
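
As an illustration of requirements (a) and (b), here is a minimal Python sketch of a poller. It is not a reference solution. It assumes the `docker` CLI is on the PATH, and it uses SQLite as a stand-in for a time series database; a dedicated time series store could be substituted. The table name `status`, its columns, and the function names are hypothetical.

```python
import sqlite3
import subprocess
import time

POLL_INTERVAL_SECONDS = 5 * 60    # requirement (a): poll every 5 minutes
RETENTION_SECONDS = 5 * 60 * 60   # requirement (b): keep 5 hours of history


def is_running(container):
    """Return True if `docker inspect` reports the container as running."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{.State.Running}}", container],
        capture_output=True, text=True,
    )
    # A non-zero exit code (for example, no such container) counts as down.
    return result.returncode == 0 and result.stdout.strip() == "true"


def open_store(path):
    """Open (or create) the SQLite table used here as a stand-in time series store."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS status ("
        "ts REAL NOT NULL, container TEXT NOT NULL, up INTEGER NOT NULL)"
    )
    return db


def record(db, container, up, ts):
    """Append one sample, then drop samples older than the retention window."""
    db.execute("INSERT INTO status VALUES (?, ?, ?)", (ts, container, int(up)))
    db.execute("DELETE FROM status WHERE ts < ?", (ts - RETENTION_SECONDS,))
    db.commit()


def poll_forever(db, containers):
    """Sample every container, sleep for the poll interval, and repeat."""
    while True:
        now = time.time()
        for name in containers:
            record(db, name, is_running(name), now)
        time.sleep(POLL_INTERVAL_SECONDS)
```

A caller would run something like `poll_forever(open_store("status.db"), ["nginx", "httpd", "postgres"])`, where the container names depend on how the compose project names them. Raising `RETENTION_SECONDS` leaves a margin above the 5-hour minimum.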

Your application should be written in Java, Python, C, C++, C#, or Go.
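
Python is one of the permitted languages, so requirement (c) might be sketched as follows. This is an assumption-laden outline, not a prescribed design: it presumes samples live in a SQLite table `status(ts, container, up)`, that an SMTP relay is reachable on `localhost`, and that the addresses shown are placeholders. All function and constant names are hypothetical.

```python
import smtplib
import sqlite3
from datetime import datetime
from email.message import EmailMessage

ON_DUTY_EMAIL = "oncall@example.com"  # placeholder; hard-coded per requirement (c)
SENDER_EMAIL = "monitor@example.com"  # placeholder sender address
SMTP_HOST = "localhost"               # assumption: an SMTP relay listens here


def latest_status(db):
    """Return {container: (ts, up)} from the most recent sample per container."""
    rows = db.execute(
        "SELECT s.container, s.ts, s.up FROM status s "
        "JOIN (SELECT container, MAX(ts) AS ts FROM status GROUP BY container) m "
        "ON s.container = m.container AND s.ts = m.ts"
    )
    return {name: (ts, bool(up)) for name, ts, up in rows}


def build_alert(container, went_down_at):
    """Build the page email in the wording the use case asks for."""
    when = (f"{went_down_at:%H:%M:%S} on {went_down_at:%B} "
            f"{went_down_at.day}, {went_down_at.year}")
    msg = EmailMessage()
    msg["To"] = ON_DUTY_EMAIL
    msg["From"] = SENDER_EMAIL
    msg["Subject"] = f"{container} is down"
    msg.set_content(
        f"application {container} went down at {when}. Please investigate"
    )
    return msg


def check_and_page(db, send=None):
    """Email an alert for every container whose latest sample is 'down'."""
    alerts = [
        build_alert(name, datetime.fromtimestamp(ts))
        for name, (ts, up) in latest_status(db).items()
        if not up
    ]
    for msg in alerts:
        if send is not None:
            send(msg)  # injected sender, useful for tests and demos
        else:
            with smtplib.SMTP(SMTP_HOST) as smtp:
                smtp.send_message(msg)
    return alerts
```

As written, the watcher pages on every run while a container stays down. Remembering the last state that was alerted on would be one way to avoid repeated emails.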

Your code should be deployed on Linux, Windows, or Mac OS X.

You are allowed to import third-party libraries or extensions provided you (1) use them under license and (2) document their use.
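
The restart link in requirement (d) needs no third-party packages. The sketch below uses only the Python standard library and assumes the `docker` CLI is available on the host running the service. The URL scheme `/restart/<name>`, the port, and the container names are hypothetical choices made for illustration.

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical container names; only these may be restarted through the link.
ALLOWED = {"nginx", "httpd", "postgres"}


def container_from_path(path):
    """Map '/restart/<name>' to <name>; return None for anything else."""
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[0] == "restart" and parts[1] in ALLOWED:
        return parts[1]
    return None


def restart_link(base_url, container):
    """Build the URL to embed in the alert email."""
    return f"{base_url.rstrip('/')}/restart/{container}"


class RestartHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        name = container_from_path(self.path)
        if name is None:
            self.send_error(404, "unknown container")
            return
        # The name comes from the allowlist, so it is safe to pass to the CLI.
        result = subprocess.run(
            ["docker", "restart", name], capture_output=True, text=True
        )
        ok = result.returncode == 0
        text = f"restarted {name}\n" if ok else f"restart of {name} failed\n"
        body = text.encode("utf-8")
        self.send_response(200 if ok else 500)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the restart service until interrupted."""
    HTTPServer(("0.0.0.0", port), RestartHandler).serve_forever()
```

Calling `serve()` starts the service, and `restart_link("http://monitor.example.com:8080", "nginx")` yields the URL for the email body. A plain GET keeps the link clickable from a phone, but this sketch has no authentication, so a real deployment would want some form of access control.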

Resource: https://en.wikipedia.org/wiki/Microservices

Note: 👍 Additional points for innovation.

Questions:

Please create an issue on GitHub with the use case id, or email [email protected], [email protected] and [email protected]