OpsGenie Responder user guide - snowplow-archive/sauna GitHub Wiki
This responder has not yet been implemented.
See also: OpsGenie Responder setup guide
- 1. Overview
- 2. Responder actions
- 2.1 Create alert (real-time)
- 2.1.1 Overview
- 2.1.2 Command format
- 2.1.3 Example command
- 2.1.4 Response algorithm
- 2.1.5 Usage examples
- 2.1.5.1 Kinesis stream
This responder lets you interact with the [OpsGenie][opsgenie] incident management platform. So far, this responder just lets you create a new alert in OpsGenie.
Currently this responder supports one action:
Type | Identifier | Action performed in OpsGenie |
---|---|---|
Command | com.opsgenie.sauna.commands/create_alert | Creates a new alert in OpsGenie |
This responder command lets you create a new alert within OpsGenie to notify your Ops team about an incident.
Like all real-time responder actions, this action is triggered by a well-structured JSON command being received by a compatible observer.
The command must be a self-describing JSON which validates against this schema:
iglu:com.opsgenie.sauna.commands/create_alert/jsonschema/1-0-0
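As a rough illustration of how a consumer might recognise this schema reference, the sketch below splits an Iglu schema URI into its parts. It assumes the usual `iglu:{vendor}/{name}/{format}/{version}` layout; the names `SchemaKey`, `parse_schema_uri` and `is_create_alert` are hypothetical helpers, not part of Sauna.

```python
import re
from typing import NamedTuple

# Assumed layout of an Iglu schema URI: iglu:{vendor}/{name}/{format}/{version}
IGLU_URI = re.compile(
    r"^iglu:(?P<vendor>[a-zA-Z0-9_.-]+)/(?P<name>[a-zA-Z0-9_-]+)/"
    r"(?P<format>[a-zA-Z0-9_-]+)/(?P<version>\d+-\d+-\d+)$"
)

CREATE_ALERT = "iglu:com.opsgenie.sauna.commands/create_alert/jsonschema/1-0-0"


class SchemaKey(NamedTuple):
    """Hypothetical container for the four parts of a schema URI."""
    vendor: str
    name: str
    format: str
    version: str


def parse_schema_uri(uri: str) -> SchemaKey:
    """Split a schema URI into its parts, raising ValueError if malformed."""
    match = IGLU_URI.match(uri)
    if match is None:
        raise ValueError(f"not a valid Iglu schema URI: {uri!r}")
    return SchemaKey(**match.groupdict())


def is_create_alert(command: dict) -> bool:
    """True when a self-describing JSON names the create_alert schema."""
    return command.get("schema") == CREATE_ALERT
```

This only inspects the `schema` field; validating the `data` payload against the JSON Schema itself is a separate step.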
NOTE TO TREVOR: PLEASE ADD ALL OF THE PROPERTIES FROM https://www.opsgenie.com/docs/web-api/alert-api#createAlertRequest INTO THIS COMMAND'S JSON SCHEMA, MAKING SURE TO INCLUDE ALL OF THE LENGTH/SIZE LIMITS
A simple OpsGenie `create_alert` command will look as follows:
{
  "schema": "iglu:com.opsgenie.sauna.commands/create_alert/jsonschema/1-0-0",
  "data": {
    "message": "SsoServer3 is down",
    "alias": "SsoServer3-down",
    "teams": [
      "operations",
      "developers"
    ],
    "entity": "single-sign-on",
    "note": "Additional",
    "details": {
      "lastCpuPct": 40.5,
      "ec2InstanceId": "i-1be2beda"
    }
  }
}
Where:
- `message` is the alert text, no more than 130 characters
- `alias` is a user-defined identifier for the alert; only one open alert can exist with a given alias, which makes this useful for preventing duplicate alerts when creating alerts via Sauna
- `teams` is an array of the names of the teams responsible for the alert, up to 50 of them
- `entity` is the business or technical entity which the alert relates to
- `note` is an additional alert note
- `details` is a set of user-defined properties, specified as a JSON sub-object
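The limits listed above can be checked before a command is ever sent. The following is a minimal hand-rolled sketch, not the JSON Schema itself: `validate_create_alert` is a hypothetical helper, and treating `message` as required is an assumption that this guide does not state.

```python
MAX_MESSAGE_LENGTH = 130  # limit on "message" stated in this guide
MAX_TEAMS = 50            # limit on "teams" stated in this guide


def validate_create_alert(data: dict) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    errors = []

    # Assumption: "message" is required and must be a non-empty string.
    message = data.get("message")
    if not isinstance(message, str) or not message:
        errors.append("message must be a non-empty string")
    elif len(message) > MAX_MESSAGE_LENGTH:
        errors.append(f"message is longer than {MAX_MESSAGE_LENGTH} characters")

    teams = data.get("teams", [])
    if not isinstance(teams, list) or not all(isinstance(t, str) for t in teams):
        errors.append("teams must be an array of team names")
    elif len(teams) > MAX_TEAMS:
        errors.append(f"teams lists more than {MAX_TEAMS} names")

    if not isinstance(data.get("details", {}), dict):
        errors.append("details must be a JSON object")

    return errors
```

Collecting every problem rather than stopping at the first one makes the result easier to report to a logger in a single entry.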
As with all RT responders, Sauna will take each command and:
- Validate it as a valid OpsGenie `create_alert` command
- If it is not valid, this will be reported to any configured Sauna loggers
- If it is valid, Sauna will attempt to POST the alert to `https://api.opsgenie.com/v1/json/alert` using the command's data
- If the API reports success (200/201/etc), this will be reported to any configured Sauna loggers
- If the API reports failure (400/403/etc), this will be reported to any configured Sauna loggers but no retry will be attempted
In the case of failure, we do not attempt retry, even in the case of us temporarily exceeding API rate limits, because this could block other non-OpsGenie-related commands in the observed stream from executing.
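The validate, POST, report flow described above could be sketched roughly as follows. This is an illustration under stated assumptions rather than Sauna's implementation (the responder has not been implemented): `respond`, `http_post` and the logger name are hypothetical, the validator is passed in by the caller, and authentication is left out because this guide does not describe it.

```python
import json
import logging
import urllib.error
import urllib.request

ALERT_URL = "https://api.opsgenie.com/v1/json/alert"
logger = logging.getLogger("sauna.opsgenie")  # hypothetical logger name


def http_post(url: str, payload: dict) -> int:
    """POST payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses arrive as exceptions; surface the status code.
        return error.code


def respond(command: dict, is_valid, post=http_post) -> str:
    """Validate, POST once, and log the outcome. Never retries."""
    if not is_valid(command):
        logger.warning("invalid create_alert command: %s", command.get("schema"))
        return "invalid"

    status = post(ALERT_URL, command["data"])
    if 200 <= status < 300:
        logger.info("OpsGenie accepted the alert (HTTP %d)", status)
        return "success"

    # No retry, so other commands in the observed stream are not blocked.
    logger.error("OpsGenie rejected the alert (HTTP %d); not retrying", status)
    return "failure"
```

Passing `post` as a parameter keeps the single-attempt logic testable without touching the network, for example `respond(command, is_valid, post=lambda url, payload: 201)`.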
Assuming that the Amazon Kinesis Observer receives the following command:
{
  "schema": "iglu:com.snowplowanalytics.sauna.commands/command/jsonschema/1-0-0",
  "data": {
    "envelope": {
      "schema": "iglu:com.snowplowanalytics.sauna.commands/envelope/jsonschema/1-0-0",
      "data": {
        "commandId": "9dadfc92-9311-43c7-9cee-61ab590a6e81",
        "whenCreated": "2017-01-02T19:14:42Z",
        "execution": {
          "semantics": "AT_LEAST_ONCE",
          "timeToLive": 1200000
        },
        "tags": {}
      }
    },
    "command": {
      "schema": "iglu:com.opsgenie.sauna.commands/create_alert/jsonschema/1-0-0",
      "data": {
        xxx
      }
    }
  }
}
And assuming that the current time is within 20 minutes (1,200,000 ms) of 2017-01-02T19:14:42Z, then:
- Sauna will attempt to create a new alert in OpsGenie from the command's data.
- Whether or not the alert was successfully created will be reported to any configured Sauna loggers.
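The time-to-live condition above amounts to comparing the current time with `whenCreated` plus `timeToLive` milliseconds. A minimal sketch of that check, assuming the timestamp is always UTC in the format shown and that the boundary instant still counts as live (`is_within_ttl` is a hypothetical helper):

```python
from datetime import datetime, timedelta, timezone


def is_within_ttl(when_created: str, time_to_live_ms: int, now: datetime) -> bool:
    """True while `now` is no later than whenCreated + timeToLive."""
    created = datetime.strptime(when_created, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    return now <= created + timedelta(milliseconds=time_to_live_ms)


# Envelope values from the example command; `now` is 15 minutes 18 seconds later.
now = datetime(2017, 1, 2, 19, 30, 0, tzinfo=timezone.utc)
print(is_within_ttl("2017-01-02T19:14:42Z", 1200000, now))  # True
```

Taking `now` as an argument rather than reading the clock inside the function keeps the check deterministic.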