Amazon S3 Observer setup guide

Overview

This observer monitors an Amazon S3 bucket (and, optionally, a sub-path within that bucket). When a new file is created in the bucket or sub-path, Sauna triggers a responder action.

Compatibility

This observer was released in Sauna version 0.1.0.

Amazon S3 setup

To use the S3 Observer you will need to prepare an S3 bucket and an SQS queue using the AWS CLI. If you already have an S3 bucket and an SQS queue receiving its notifications, feel free to skip this section.

At the end of this section you'll have an S3 bucket which pushes notifications to a configured AWS SQS queue. Note that the steps below assume that your bucket and queue are configured with the most restrictive permissions; grant them additional permissions if you need to.

First of all, create the bucket and queue:

$ aws s3api create-bucket --bucket sauna-landing-bucket

$ aws sqs create-queue --queue-name sauna-notifications-queue

Please note that S3 bucket names are globally unique and SQS queue names must be unique within your account and region, so you will need to replace sauna-landing-bucket and sauna-notifications-queue with your actual bucket and queue names.

The next step is to allow S3 to push notifications into the SQS queue using a policy. Create a file called snowplow-sauna-observer-policy.json with the following content (replace 012345678912 with your 12-digit AWS Account ID, us-east-1 with your queue's region if it differs, and the S3 bucket and SQS queue names with your own):

{
    "Policy": "{\"Version\":\"2008-10-17\",\"Id\":\"snowplow-sauna-observer-policy\",\"Statement\":[{\"Sid\":\"snowplowanalytics-sauna-queue-sid\",\"Effect\":\"Allow\",\"Principal\":{\"AWS\":\"*\"},\"Action\":\"SQS:SendMessage\",\"Resource\":\"arn:aws:sqs:us-east-1:012345678912:sauna-notifications-queue\",\"Condition\":{\"ArnLike\":{\"aws:SourceArn\":\"arn:aws:s3:*:*:sauna-landing-bucket\"}}}]}"
}
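
The Policy value above is a JSON document serialized as a string inside another JSON document, which is easy to break when editing by hand. As a minimal sketch, the file could instead be generated with Python's standard library. The function name build_queue_attributes and its parameters are hypothetical and not part of Sauna or the AWS CLI; the policy content itself is copied from the example above.

```python
import json


def build_queue_attributes(account_id, region, queue_name, bucket_name):
    """Return the attributes document passed to `aws sqs set-queue-attributes`.

    Hypothetical helper: it only rebuilds the example policy shown above
    with your own account ID, region, queue name and bucket name.
    """
    policy = {
        "Version": "2008-10-17",
        "Id": "snowplow-sauna-observer-policy",
        "Statement": [
            {
                "Sid": "snowplowanalytics-sauna-queue-sid",
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "SQS:SendMessage",
                "Resource": f"arn:aws:sqs:{region}:{account_id}:{queue_name}",
                "Condition": {
                    "ArnLike": {"aws:SourceArn": f"arn:aws:s3:*:*:{bucket_name}"}
                },
            }
        ],
    }
    # The Policy attribute must be a string, so the inner document is
    # serialized first; the outer json.dumps then escapes its quotes.
    return {"Policy": json.dumps(policy, separators=(",", ":"))}


if __name__ == "__main__":
    attributes = build_queue_attributes(
        account_id="012345678912",
        region="us-east-1",
        queue_name="sauna-notifications-queue",
        bucket_name="sauna-landing-bucket",
    )
    # Redirect this output to snowplow-sauna-observer-policy.json
    print(json.dumps(attributes, indent=4))
```

Running the script and redirecting its output to snowplow-sauna-observer-policy.json should give a file equivalent to the hand-written one.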

To apply this policy, run the following command:

$ aws sqs set-queue-attributes --queue-url https://queue.amazonaws.com/012345678912/sauna-notifications-queue --attributes file://snowplow-sauna-observer-policy.json

The last step is to make S3 push notifications to the queue. Create another file called snowplow-sauna-notification.json with the following content:

{
    "QueueConfigurations": [
        {
            "Id": "snowplow-sauna-notification",
            "QueueArn": "arn:aws:sqs:us-east-1:012345678912:sauna-notifications-queue",
            "Events": [
                "s3:ObjectCreated:*"
            ]
        }
    ]
}

And run the command below to apply this configuration:

$ aws s3api put-bucket-notification-configuration --bucket sauna-landing-bucket --notification-configuration file://snowplow-sauna-notification.json
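
Once this configuration is applied, each new object results in a JSON message on the queue. The sketch below is not Sauna's own implementation; it only illustrates, under the assumption that the message body follows the standard S3 event notification layout, how the bucket name and object key can be read from such a message. The function name created_objects and the sample message are hypothetical.

```python
import json
from urllib.parse import unquote_plus


def created_objects(message_body):
    """Yield (bucket, key) for each ObjectCreated record in an S3 notification."""
    event = json.loads(message_body)
    # S3 sends a test message without "Records" when the notification
    # configuration is first applied, so a missing key is not an error.
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        s3 = record["s3"]
        # Object keys are URL-encoded in notifications (a space becomes "+")
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


if __name__ == "__main__":
    # Hypothetical, heavily trimmed sample message for illustration only
    sample = json.dumps({
        "Records": [{
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "sauna-landing-bucket"},
                "object": {"key": "some/path/my+file.tsv"},
            },
        }]
    })
    for bucket, key in created_objects(sample):
        print(bucket, key)
```

Decoding the key matters if file names may contain spaces or other special characters, since the encoded key would not match the object's actual name in the bucket.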

Policy

Here's an example of an IAM policy that allows listing, reading, writing and deleting S3 objects and receiving and deleting messages from the AWS SQS queue:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1471525597004",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:PutObject",
                "s3:GetBucketNotification",
                "s3:PutBucketNotification"
            ],
            "Resource": [
                "arn:aws:s3:::sauna-landing-bucket/*"
            ]
        },
        {
            "Sid": "Stmt1471525597024",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::sauna-landing-bucket"
            ]
        },
        {
            "Sid": "Stmt1471525908000",
            "Effect": "Allow",
            "Action": [
                "sqs:ReceiveMessage",
                "sqs:DeleteMessage"
            ],
            "Resource": [
                "arn:aws:sqs:us-east-1:012345678912:sauna-notifications-queue"
            ]
        },
        {
            "Sid": "Stmt1471525908001",
            "Effect": "Allow",
            "Action": [
                "sqs:ListQueues"
            ],
            "Resource": [
                "arn:aws:sqs:us-east-1:012345678912:*"
            ]
        }
    ]
}

Don't forget to replace 012345678912 with your AWS Account ID, and sauna-notifications-queue and sauna-landing-bucket with your SQS queue and S3 bucket names respectively.

Also, you can check out the AWS documentation on Amazon S3 event notifications for ways to get messages from your S3 bucket.

Sauna setup

Avro Schema

The Amazon S3 Observer must be configured using a self-describing Avro record that validates against this schema:

iglu:com.snowplowanalytics.sauna.observers/AmazonS3Config/avro/1-0-0

Example

We can enable this observer by placing the following Avro configuration in the configuration directory (config files must use the .avro or .json extension):

{
    "schema": "iglu:com.snowplowanalytics.sauna.observers/AmazonS3Config/avro/1-0-0",

    "data": {
        "enabled": true,
        "id": "com.acme.MyBucketObserver",
        "parameters": {
            "awsRegion": "us-west-2",
            "awsAccessKeyId": "...",
            "awsSecretAccessKey": "...",
            "sqsQueueName": "..."
        }
    }
}

The observer's id must be unique among all configuration files.
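
Duplicate ids are easy to introduce when copying configuration files. As a minimal sketch, assuming the configurations are stored as .json files shaped like the example above, a pre-flight check could look like this. The function name duplicate_ids is hypothetical and not part of Sauna, and binary .avro files are not inspected.

```python
import json
from collections import defaultdict
from pathlib import Path


def duplicate_ids(config_dir):
    """Map each id used by more than one .json config file to those file names."""
    files_by_id = defaultdict(list)
    for path in sorted(Path(config_dir).glob("*.json")):
        config = json.loads(path.read_text())
        # Assumes every file has the {"schema": ..., "data": {"id": ...}} shape
        files_by_id[config["data"]["id"]].append(path.name)
    return {
        config_id: names
        for config_id, names in files_by_id.items()
        if len(names) > 1
    }
```

An empty result means no two .json files in the directory share an id.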
