Monitoring and alerting overview
Our main internal monitoring is based on Datadog. It reports the details when machines are down or running out of disk, memory, or CPU. Datadog alerts are sent via the infrastructure email.
Pingdom runs HTTP checks on our public services. Pingdom also creates tickets in the helpdesk.
We have accounts in PagerDuty, which alerts whoever is on call. PagerDuty can be triggered by critical helpdesk tickets and by Pingdom alerts.
We also have a dashboard showing the status of our infrastructure, with data coming from Pingdom.
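
The alert flows described on this page can be summarised as a small lookup table. The following is only an illustrative sketch, not the actual configuration: the names `ROUTES` and `destinations`, and all the source and destination labels, are hypothetical and do not correspond to identifiers in Datadog, Pingdom, PagerDuty, or the helpdesk.

```python
# Hypothetical model of the alert flows described on this page.
# All labels below are illustrative assumptions, not real API identifiers.
ROUTES = {
    # Datadog: machine down, low disk/memory/CPU -> infrastructure email
    "datadog": ["infrastructure-email"],
    # Pingdom: HTTP checks -> helpdesk tickets, PagerDuty, status dashboard
    "pingdom": ["helpdesk", "pagerduty", "status-dashboard"],
    # Critical helpdesk tickets -> PagerDuty (whoever is on call)
    "helpdesk-critical": ["pagerduty"],
}


def destinations(source: str) -> list[str]:
    """Return the direct destinations for an alert source, or [] if unknown."""
    return list(ROUTES.get(source, []))


if __name__ == "__main__":
    for source in ROUTES:
        print(f"{source} -> {', '.join(destinations(source))}")
```

Only direct routes are modelled here. Whether a ticket created by Pingdom counts as critical, and so pages the on-call person a second time, is not stated on this page, so the sketch does not chain routes.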