Section 5 Starting our EC2 instance - Green-Biome-Institute/AWS GitHub Wiki

Section 5 - Starting our EC2 instance

First, let’s start our EC2 Instance just as we did in the previous AWS module:

Navigate to the EC2 dashboard and click on the orange button in the top right to Launch Instances.
Select the RandyBioinformaticsEC2 AMI snapshot.
Filter the instance types by “r5” and then select r5.large. In the future, this is the step where you can choose a larger EC2 instance type for computational loads heavier than this training workshop.
Select “Next: Add Storage”
Add 50 GB of EBS volume storage capacity and then press “Next: Add Tags”
Add two tags, each as a key : value pair (in the future you can adapt these tags to however you would like to organize your projects):

“user” : [your name]
“project” : “training workshop”

Select “Next: Configure Security Group”
Select “launch-wizard-1” and then “Review and Launch”
Press “Launch”
In the window that pops up, select “Create a new key pair”, name it in the box “Key pair name”, download the keypair file to a secure location on your computer, and continue to launch the EC2 instance.
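
For later reference, the console steps above can also be sketched with the AWS CLI. This is a minimal sketch under several assumptions, not part of the workshop itself: it assumes the AWS CLI is installed and configured, that a key pair already exists, and that you have looked up the IDs of the AMI and the security group. The IDs, the key pair name, the user tag value, and the root device name /dev/sda1 are all placeholders. By default the script only prints the command so you can review it.

```shell
# Placeholders (assumptions): replace these with values from your own AWS account.
AMI_ID="ami-xxxxxxxxxxxxxxxxx"   # ID of the RandyBioinformaticsEC2 AMI
SG_ID="sg-xxxxxxxxxxxxxxxxx"     # ID of the launch-wizard-1 security group
KEY_NAME="my-keypair"            # hypothetical name of an existing key pair

# 50 GB root volume; /dev/sda1 is an assumed root device name, check your AMI.
BLOCK_DEVICES='[{"DeviceName":"/dev/sda1","Ebs":{"VolumeSize":50}}]'

# The two tags from the steps above; "your-name" is a placeholder.
TAGS='[{"ResourceType":"instance","Tags":[{"Key":"user","Value":"your-name"},{"Key":"project","Value":"training workshop"}]}]'

# run CMD...: print the command, and execute it only when RUN=1 is set.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

run aws ec2 run-instances \
    --image-id "$AMI_ID" \
    --instance-type r5.large \
    --key-name "$KEY_NAME" \
    --security-group-ids "$SG_ID" \
    --block-device-mappings "$BLOCK_DEVICES" \
    --tag-specifications "$TAGS"
```

Running the script with RUN=1 set in the environment would actually launch the instance, which incurs AWS charges, so review the printed command first.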

Logging In & Uploading Data

Now that our EC2 instance is up and running, let’s log in and load some information onto it so we can work with it!

First, we need to make sure our keypair has the right permissions. Open up your terminal and navigate to the directory where you saved your keypair.pem file. Use the following command to give that file the correct permissions:

$ chmod 400 [your-keypair].pem
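
ssh will refuse to use a private key file that other users on your computer can read, so it can be worth confirming that the change took effect. The helper below is a small sketch (the function name key_is_private and the file name my-keypair.pem are made up for illustration) that checks that group and others have no access to the file:

```shell
# key_is_private FILE: succeed only if group and others have no permissions on FILE.
key_is_private() {
    # ls -ld prints a mode string such as "-r--------";
    # characters 5-10 hold the group and other permission bits.
    mode=$(ls -ld "$1" | cut -c5-10)
    [ "$mode" = "------" ]
}

# Example usage with a hypothetical file name; substitute your own keypair file.
KEY="my-keypair.pem"
if [ -f "$KEY" ]; then
    if key_is_private "$KEY"; then
        echo "$KEY: permissions OK"
    else
        echo "$KEY: permissions too open, re-run chmod"
    fi
fi
```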

Now, let’s log into the EC2 instance. To do this we need a couple pieces of information. We need:

The PATH to our new keypair that we saved above
The EC2 instance IP address

Put those together:

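$ ssh -i PATH/to/keypair.pem ubuntu@IP-ADDRESS

For example, with the keypair saved under ~/Desktop/GBI/AWS/AWS_keypairs (substitute your instance’s IP address for IP-ADDRESS):

$ ssh -i ~/Desktop/GBI/AWS/AWS_keypairs/r5large-gbi-keypair.pem ubuntu@IP-ADDRESS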

This will log us in!

If you list the contents of your home directory on the EC2 instance (for example with ls), you’ll see some directories for organization, but no data yet. Let’s fix that.

Open up a new terminal (don’t exit out of the one you are using to work on the EC2 instance) and create a new directory on your desktop called “test-data”.
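
If you would rather create the directory from that new terminal than through your file browser, one way to do it is sketched below. It assumes your desktop folder lives at $HOME/Desktop, which is typical on macOS and many Linux desktops but may differ on your system.

```shell
# Create the directory (and any missing parent directories) on the local machine.
mkdir -p "$HOME/Desktop/test-data"

# Confirm that it exists.
ls -ld "$HOME/Desktop/test-data"
```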

We will be uploading some Arabidopsis thaliana Illumina data that I have subsampled from the data set SRR1946554. To download this, scp from the currently running EC2 instance:

scp