Setting up an S3 Bucket - StanfordBioinformatics/pulsar_lims GitHub Wiki

Summary

This guide refers to the Pulsar LIMS instance you are deploying as lims; substitute the application name of your choosing.

In Pulsar LIMS, when a user uploads a file, whether a protocol document or an image, the file is stored either directly in the database or in an AWS S3 bucket. Currently, images associated with cloning_vectors and with PCR or immunoblot gels are uploaded to S3. Protocol documents may soon move from the database to S3 as well, and other object types that require the user to upload static assets may end up in the same S3 bucket. Thus, you must have an S3 bucket configured to be able to use Pulsar.

Objectives:

1. Create AWS IAM users 
2. Create an S3 bucket  
3. Configure your bucket  

Create IAM Users

AWS recommends creating Identity and Access Management (IAM) users for managing access to AWS resources. This is more secure than handing your AWS account credentials directly to clients, because you can specify in fine detail the privileges each IAM user has. Pulsar will use the AWS Access Key ID and AWS Secret Access Key of an IAM user both to upload files to and to read files from your dedicated S3 bucket.

Create an IAM admin group and IAM user with administrative privileges

Log into your AWS account and create an IAM user for yourself to use when working in the AWS Management Console, and give this user administrative privileges. You'll use this admin IAM user to create other IAM users (e.g. one for Pulsar to use), create an S3 bucket, and set user and bucket policies. You should create a group (e.g. "Admins") and add this new user to the group. Using a group is convenient here because other admins may need to be added now or in the future, and you can attach an administrative policy to the group, which each user in the group will inherit.

Once you have an admins group with your admin IAM user in it, attach the AdministratorAccess policy to the group, if you have not done so yet, using the instructions here. Next, you'll need to log out and then log back in with your admin IAM user account. Before logging out, select the "Dashboard" link in the left navbar of the AWS Management Console and copy the IAM user sign-in link underneath where it says "IAM users sign-in link:". Now you can sign out, point your web browser to the copied URL, and sign in with your admin IAM user credentials. Detailed instructions are at http://docs.aws.amazon.com/IAM/latest/UserGuide/getting-started_create-admin-group.html.

Create an IAM User for Pulsar

While signed into the AWS IAM Console as the admin IAM user you created in the section above, create a new user that Pulsar can use to access the S3 bucket that you'll create later. Name that user as you see fit; naming it after your app will work. Here we'll refer to that IAM user as "lims".

Create AWS S3 Bucket

While signed into the AWS Management Console as the admin IAM user, follow these instructions to create an S3 bucket. When selecting the "Region", choose the same region that Pulsar is deployed in or will be deployed in (if it is also deployed on AWS). At the time of this writing, in the US, Heroku deploys apps in the us-east-1 region; see the Heroku regions documentation for more details. Here we'll name the bucket after our app and suffix it with "-assets": lims-assets.
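The naming convention above (app name plus "-assets") can be sketched as a small helper that also rejects names S3 would refuse. This is only an illustration: the function name `assets_bucket_name` is hypothetical, and the check is a simplified subset of the S3 bucket naming rules (3 to 63 characters, lowercase letters, digits, and hyphens, starting and ending with a letter or digit).

```python
import re

# Simplified subset of the S3 bucket naming rules: 3-63 characters,
# lowercase letters, digits, and hyphens only, starting and ending with a
# letter or digit. (Dots are also legal in bucket names but are left out
# here to keep the check simple.)
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def assets_bucket_name(app_name: str) -> str:
    """Return '<app_name>-assets', raising ValueError if the result is not a valid name."""
    name = f"{app_name.lower()}-assets"
    if not _BUCKET_NAME.fullmatch(name):
        raise ValueError(f"not a valid S3 bucket name: {name!r}")
    return name


print(assets_bucket_name("lims"))  # lims-assets
```

Bucket names are also globally unique across AWS, so a name that passes this check may still be taken.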

Attach a bucket policy

Now you'll configure the bucket you just created to accept read and write actions carried out by the specified IAM User you created for Pulsar. Click on the bucket you just created, select the "Permissions" tab, select the "Bucket Policy" tab, then paste in the following bucket policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "bucket",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::167194893449:user/lims"
            },
            "Action": [
                "s3:GetBucketLocation",
                "s3:ListBucket"
            ],
            "Resource": "arn:aws:s3:::lims-assets"
        },
        {
            "Sid": "object",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::167194893449:user/lims"
            },
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:GetObjectAcl",
                "s3:PutObjectAcl"
            ],
            "Resource": "arn:aws:s3:::lims-assets/*"
        }
    ]
}

If you used a different name for your bucket, substitute your bucket name for the two occurrences of "lims-assets". You'll also have to use the Amazon Resource Name (ARN) of the IAM user you created for Pulsar: change the two occurrences of "arn:aws:iam::167194893449:user/lims" above to your user's ARN. You can find the ARN by navigating back to the AWS IAM Console, selecting Users, then selecting the user you created for Pulsar.
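To avoid editing the policy by hand in four places, the substitutions can be done programmatically. The following is a minimal sketch that reproduces the two-statement policy shown above for an arbitrary bucket and user; the function name `bucket_policy`, the bucket name "my-app-assets", and the account ID 123456789012 are placeholders, not values from a real deployment.

```python
import json


def bucket_policy(bucket: str, user_arn: str) -> dict:
    """Build the two-statement bucket policy for the given bucket and IAM user ARN."""
    principal = {"AWS": user_arn}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "bucket",
                "Effect": "Allow",
                "Principal": principal,
                "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
                # Bucket-level actions apply to the bucket ARN itself.
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Sid": "object",
                "Effect": "Allow",
                "Principal": principal,
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:DeleteObject",
                    "s3:GetObjectAcl",
                    "s3:PutObjectAcl",
                ],
                # Object-level actions apply to every key inside the bucket.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


# Placeholder bucket name and user ARN; replace with your own values.
policy = bucket_policy("my-app-assets", "arn:aws:iam::123456789012:user/my-app")
print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into the "Bucket Policy" tab in place of the hand-edited version.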

CORS configuration

Now that you have an S3 bucket with a bucket policy, there is one more thing to do before your Pulsar app can access the bucket: configure Cross-Origin Resource Sharing (CORS) on it. By default, web browsers prevent a web application from making requests for resources that reside outside of the web application's own domain. This is a good thing. For example, it prevents a malicious application you may unknowingly be signed into from making JavaScript requests to other web applications you are signed into, such as an online banking application. CORS relaxes this restriction in a controlled way: it lets you specify which resources in a particular web application may be accessed, how they may be accessed, and which domains may access them. Once the S3 bucket's CORS configuration allows read and write operations from the Pulsar application, web browsers will no longer block such traffic.
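The decision a CORS rule encodes can be modeled in a few lines. This is a deliberately simplified sketch, not how S3 or any browser implements it: the names `CorsRule` and `is_allowed` are hypothetical, only exact origin matches and a bare "*" are handled, and request headers are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorsRule:
    allowed_origins: tuple  # e.g. ("https://lims.herokuapp.com",)
    allowed_methods: tuple  # e.g. ("GET", "PUT", "POST")


def is_allowed(rule: CorsRule, origin: str, method: str) -> bool:
    """Return True if a request from `origin` using `method` matches the rule."""
    origin_ok = "*" in rule.allowed_origins or origin in rule.allowed_origins
    return origin_ok and method.upper() in rule.allowed_methods


# The origins and methods used in this guide's CORS configuration.
rule = CorsRule(
    allowed_origins=("http://localhost:5000", "https://lims.herokuapp.com"),
    allowed_methods=("GET", "PUT", "POST"),
)

print(is_allowed(rule, "https://lims.herokuapp.com", "PUT"))  # True
print(is_allowed(rule, "https://example.com", "GET"))         # False: origin not listed
print(is_allowed(rule, "http://localhost:5000", "DELETE"))    # False: method not listed
```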

In the AWS Management Console, navigate to your bucket, select the "Permissions" tab, select the "CORS configuration" tab, then paste in the following:

<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<CORSRule>
    <AllowedOrigin>http://localhost:5000</AllowedOrigin>
    <AllowedOrigin>https://lims.herokuapp.com</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedMethod>PUT</AllowedMethod>
    <AllowedMethod>POST</AllowedMethod>
    <MaxAgeSeconds>3000</MaxAgeSeconds>
    <AllowedHeader>*</AllowedHeader>
</CORSRule>
</CORSConfiguration>

The origin http://localhost:5000 is allowed because you'll probably want to run lims locally for testing purposes, hence localhost. Port 5000 is specified since that is the default port used by the Puma web server that Pulsar is configured to use in both development and production environments.

The second allowed origin https://lims.herokuapp.com is the URL of your deployed lims application.
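If your app has a different URL, or you need more than two origins, the CORS document can be generated rather than edited by hand. Below is a minimal sketch using Python's standard library; the function name `cors_configuration` and the example URL https://my-app.example.com are placeholders. It emits the same elements as the configuration above, without the XML declaration and without indentation.

```python
import xml.etree.ElementTree as ET

S3_NS = "http://s3.amazonaws.com/doc/2006-03-01/"


def cors_configuration(origins, methods=("GET", "PUT", "POST"), max_age=3000) -> str:
    """Return a single-rule CORSConfiguration document as a string."""
    root = ET.Element("CORSConfiguration", xmlns=S3_NS)
    rule = ET.SubElement(root, "CORSRule")
    for origin in origins:
        ET.SubElement(rule, "AllowedOrigin").text = origin
    for method in methods:
        ET.SubElement(rule, "AllowedMethod").text = method
    ET.SubElement(rule, "MaxAgeSeconds").text = str(max_age)
    # Allow any request header, as in the configuration above.
    ET.SubElement(rule, "AllowedHeader").text = "*"
    return ET.tostring(root, encoding="unicode")


# Placeholder origins: a local development server and a deployed app URL.
print(cors_configuration(["http://localhost:5000", "https://my-app.example.com"]))
```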
