Amazon S3 Observer setup guide - snowplow-archive/sauna GitHub Wiki
- 1. Overview
- 2. Compatibility
- 3. Amazon S3 setup
- 4. Sauna setup
- 4.1 Avro Schema
- 4.2 Example
1. Overview
This observer monitors an Amazon S3 bucket (and, optionally, a sub-path within that bucket). When a new file is created in the bucket or sub-path, Sauna triggers a responder action.
2. Compatibility
This observer was released in Sauna version 0.1.0.
3. Amazon S3 setup
To use the S3 Observer you will need to prepare an S3 bucket and an SQS queue using the AWS CLI. If you already have an S3 bucket and an SQS queue receiving its notifications, feel free to skip this section.
At the end of this section you'll have an S3 bucket which pushes notifications to a configured AWS SQS queue. Note that the steps below configure the bucket and queue in the most restrictive manner; you may want to grant additional permissions if necessary.
First of all, create the bucket and queue:
$ aws s3api create-bucket --bucket sauna-landing-bucket
$ aws sqs create-queue --queue-name sauna-notifications-queue
Please note that S3 bucket names are globally unique (and SQS queue names must be unique within your account and region), so you will need to replace sauna-landing-bucket and sauna-notifications-queue with your actual bucket and queue names.
The next step is to allow S3 to push notifications into the SQS queue using a policy. Create a file called snowplow-sauna-observer-policy.json with the following content (replace 012345678912 with your 12-digit AWS account ID, us-east-1 with your region, and the S3 bucket and SQS queue names with your own):
{
"Policy": "{\"Version\":\"2008-10-17\",\"Id\":\"snowplow-sauna-observer-policy\",\"Statement\":[{\"Sid\":\"snowplowanalytics-sauna-queue-sid\",\"Effect\":\"Allow\",\"Principal\":{\"AWS\":\"*\"},\"Action\":\"SQS:SendMessage\",\"Resource\":\"arn:aws:sqs:us-east-1:012345678912:sauna-notifications-queue\",\"Condition\":{\"ArnLike\":{\"aws:SourceArn\":\"arn:aws:s3:*:*:sauna-landing-bucket\"}}}]}"
}
To apply this policy, run the following command:
$ aws sqs set-queue-attributes --queue-url https://queue.amazonaws.com/012345678912/sauna-notifications-queue --attributes file://snowplow-sauna-observer-policy.json
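Hand-escaping the nested policy string is easy to get wrong, so here is a minimal Python sketch that generates the same attributes document instead. The function name build_queue_policy_attributes and its parameters are our own illustration, not part of Sauna or the AWS CLI; the policy content simply mirrors the file shown above.

```python
import json


def build_queue_policy_attributes(account_id, region, queue_name, bucket_name):
    """Return the attributes document passed to `aws sqs set-queue-attributes`.

    Hypothetical helper: the policy mirrors the hand-written file above.
    """
    policy = {
        "Version": "2008-10-17",
        "Id": "snowplow-sauna-observer-policy",
        "Statement": [
            {
                "Sid": "snowplowanalytics-sauna-queue-sid",
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "SQS:SendMessage",
                "Resource": f"arn:aws:sqs:{region}:{account_id}:{queue_name}",
                "Condition": {
                    "ArnLike": {"aws:SourceArn": f"arn:aws:s3:*:*:{bucket_name}"}
                },
            }
        ],
    }
    # The "Policy" attribute must be a JSON *string*, so the policy is
    # serialized here and escaped again when the outer document is dumped.
    return {"Policy": json.dumps(policy, separators=(",", ":"))}


if __name__ == "__main__":
    # Keep the account ID as a string so a leading zero is preserved.
    attributes = build_queue_policy_attributes(
        "012345678912", "us-east-1", "sauna-notifications-queue", "sauna-landing-bucket"
    )
    print(json.dumps(attributes, indent=2))
```

Redirecting the script's output to snowplow-sauna-observer-policy.json should give a file equivalent to the one above, with your own account ID, region, and names filled in.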
The last step is to make S3 push notifications to the queue. Create another file called snowplow-sauna-notification.json with the following content:
{
  "QueueConfigurations": [
    {
      "Id": "snowplow-sauna-notification",
      "QueueArn": "arn:aws:sqs:us-east-1:012345678912:sauna-notifications-queue",
      "Events": [
        "s3:ObjectCreated:*"
      ]
    }
  ]
}
And run the command below to apply this configuration:
$ aws s3api put-bucket-notification-configuration --bucket sauna-landing-bucket --notification-configuration file://snowplow-sauna-notification.json
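If you only care about a sub-path of the bucket, S3 can filter notifications by key prefix before they reach the queue. The sketch below builds the notification configuration in Python with an optional prefix filter. The helper name and parameters are hypothetical, and whether Sauna itself relies on an S3-side filter for sub-paths is not stated in this guide, so treat the filter as an optional extra.

```python
import json


def build_notification_configuration(account_id, region, queue_name, prefix=None):
    """Return a document for `aws s3api put-bucket-notification-configuration`.

    Hypothetical helper; `prefix` is an optional key prefix such as "incoming/".
    """
    queue_configuration = {
        "Id": "snowplow-sauna-notification",
        "QueueArn": f"arn:aws:sqs:{region}:{account_id}:{queue_name}",
        "Events": ["s3:ObjectCreated:*"],
    }
    if prefix:
        # Only notify for objects whose key starts with the given prefix.
        queue_configuration["Filter"] = {
            "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
        }
    return {"QueueConfigurations": [queue_configuration]}


if __name__ == "__main__":
    configuration = build_notification_configuration(
        "012345678912", "us-east-1", "sauna-notifications-queue", prefix="incoming/"
    )
    print(json.dumps(configuration, indent=2))
```

Without a prefix argument the output matches the notification file shown above.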
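Here's an example of an IAM policy that allows listing, getting, putting, and deleting S3 objects, reading and updating the bucket's notification configuration, and receiving and deleting messages from the SQS queue: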
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1471525597004",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::sauna-landing-bucket/*"
      ]
    },
    {
      "Sid": "Stmt1471525597024",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketNotification",
        "s3:PutBucketNotification"
      ],
      "Resource": [
        "arn:aws:s3:::sauna-landing-bucket"
      ]
    },
    {
      "Sid": "Stmt1471525908000",
      "Effect": "Allow",
      "Action": [
        "sqs:ReceiveMessage",
        "sqs:DeleteMessage"
      ],
      "Resource": [
        "arn:aws:sqs:us-east-1:012345678912:sauna-notifications-queue"
      ]
    },
    {
      "Sid": "Stmt1471525908001",
      "Effect": "Allow",
      "Action": [
        "sqs:ListQueues"
      ],
      "Resource": [
        "arn:aws:sqs:us-east-1:012345678912:*"
      ]
    }
  ]
}
Don't forget to replace 012345678912 with your AWS account ID, and sauna-notifications-queue and sauna-landing-bucket with your SQS queue and S3 bucket names respectively.
Also, you can check out this detailed AWS documentation page for ways to get messages from your S3 bucket.
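To get a feel for what arrives on the queue, here is a small Python sketch that extracts the bucket and key of each newly created object from an S3 notification message body. It illustrates the S3 event format only and is not how Sauna is implemented; the function name created_objects is our own.

```python
import json
from urllib.parse import unquote_plus


def created_objects(message_body):
    """Yield (bucket, key) for every ObjectCreated record in a message body."""
    event = json.loads(message_body)
    # When notifications are first configured, S3 sends an "s3:TestEvent"
    # message that has no "Records" field, so default to an empty list.
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in notifications (spaces arrive as "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        yield bucket, key


if __name__ == "__main__":
    sample = json.dumps({
        "Records": [
            {
                "eventName": "ObjectCreated:Put",
                "s3": {
                    "bucket": {"name": "sauna-landing-bucket"},
                    "object": {"key": "incoming/my+file.csv"},
                },
            }
        ]
    })
    for bucket, key in created_objects(sample):
        print(bucket, key)
```

Real messages carry more fields (region, timestamps, object size, and so on); only the ones needed to locate the object are read here.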
4. Sauna setup
4.1 Avro Schema
The Amazon S3 Observer must be configured using a self-describing Avro record that validates against this schema:
iglu:com.snowplowanalytics.sauna.observers/AmazonS3Config/avro/1-0-0
4.2 Example
We can enable this observer by placing an Avro configuration like the following in the configuration directory (config files must use the .avro or .json extension):
{
  "schema": "iglu:com.snowplowanalytics.sauna.observers/AmazonS3Config/avro/1-0-0",
  "data": {
    "enabled": true,
    "id": "com.acme.MyBucketObserver",
    "parameters": {
      "awsRegion": "us-west-2",
      "awsAccessKeyId": "...",
      "awsSecretAccessKey": "...",
      "sqsQueueName": "..."
    }
  }
}
The observer's id must be unique among all configuration files.
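Since a duplicated id is easy to introduce when copying configuration files, a quick pre-flight check can help. The following Python sketch is a hypothetical helper, not a Sauna feature: it reads the .json files in a directory (it does not attempt to read .avro files) and reports any id that appears more than once.

```python
import json
from collections import Counter
from pathlib import Path


def load_json_configs(directory):
    """Load every *.json configuration file found directly in `directory`."""
    return [
        json.loads(path.read_text())
        for path in sorted(Path(directory).glob("*.json"))
    ]


def duplicate_observer_ids(configs):
    """Return the sorted list of `data.id` values that occur more than once."""
    counts = Counter(config["data"]["id"] for config in configs)
    return sorted(id_ for id_, count in counts.items() if count > 1)


if __name__ == "__main__":
    import sys

    # Usage: python check_ids.py <configuration directory>
    duplicates = duplicate_observer_ids(load_json_configs(sys.argv[1]))
    if duplicates:
        print("Duplicate ids:", ", ".join(duplicates))
```

An empty result means every id in the checked files is unique.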