How To Run Your Own Formhub Instances on Amazon Web Services - SEL-Columbia/formhub GitHub Wiki

Run Your Own Formhub Instances on Amazon Web Services (AWS)

This is a guide for people who wish to use the public Formhub Amazon Machine Image (AMI) to run their own instance(s) of Formhub on AWS with minimal fuss.

The public Formhub AMI is a pre-built server image which contains everything necessary to run Formhub in a stable, self-contained server environment.

The public Formhub AMI is based on the current master source code branch, and runs on Debian with PostgreSQL as its database.

Log in to AWS

  1. Create an AWS account, if you do not already have one.

  2. Log in to the AWS Management Console and choose EC2 Virtual Servers in the Cloud, second from the top on the left.

  3. Click on the blue Launch Instance button which appears in the middle of the EC2 Dashboard:

Launching an Instance

  1. Step 1: Choose an Amazon Machine Image (AMI)

Click on Community AMIs on the left, and type Formhub in the search window.

The Formhub AMI should appear as the lone choice (the current, latest version is Formhub-Self-Contained-AMI-v.0.0.1, ami-c421d3ac). Click on the blue Select button on the right to continue.

If you do not see the Formhub AMI appear, make sure your account region is set to US East (N. Virginia) by choosing it from the drop-down list in the upper right of the AWS Management Console, which appears in between your account name and the Help menu:

  1. Step 2: Choose an Instance Type

At the next screen, you need to choose an Instance Type.

What you pick here depends on your budget and data processing needs. As a rule of thumb, if you are doing fewer than one hundred form submissions per day, the default t1.micro choice should suffice.

When you have decided, click on the gray Next: Configure Instance Details button at the bottom right.

  1. Step 3: Configure Instance Details

Unless you have a special reason for not doing so, leave all the default settings here as-is, and click on the gray Next: Add Storage button at the bottom right.

  1. Step 4: Add Storage

By default, the Formhub AMI comes with eight gigabytes of disk space. If you think you need more, change the number in the Size field.

Click the gray Next: Tag Instance button at the bottom right to continue.

  1. Step 5: Tag Instance

This is optional, but useful, if you run multiple instances with your AWS account.

  1. Step 6: Configure Security Group

Choose Create a new security group and tap the gray Add Rule button at the bottom left.

Select HTTP from the list that appears.

Your security group definition should look like this:

If you want anyone to be able to post forms to your Formhub instance, you can ignore the open to the world warning; otherwise, change the IP address range to block access accordingly.
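To make the idea of an IP address range concrete, here is a minimal Python 3 sketch using the standard ipaddress module. The addresses are documentation placeholders (assumptions, not values from this guide). It only illustrates how the "open to the world" range 0.0.0.0/0 matches every address, while a narrower range admits only a slice of the internet; the actual filtering is done by AWS, not by this code.

```python
import ipaddress

def is_allowed(client_ip, cidr_range):
    """Return True if client_ip falls inside the CIDR range of a security group rule."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(cidr_range)

# 0.0.0.0/0 is the "open to the world" range: every IPv4 address matches.
print(is_allowed("198.51.100.7", "0.0.0.0/0"))       # True
# A /24 range only admits the 256 addresses sharing its first three octets.
print(is_allowed("198.51.100.7", "203.0.113.0/24"))  # False
print(is_allowed("203.0.113.42", "203.0.113.0/24"))  # True
```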

Strictly speaking, this is all you need, but if you are planning to run Formhub under your own domain and expect to receive incoming email and use a Secure Sockets Layer (SSL) Certificate, then you also need the SMTP and HTTPS rules, respectively:

Give your Security Group a name, then click the blue Review and Launch button.

  1. Step 7: Review Instance Launch

At the final screen, go back and make any changes if needed, otherwise click on the blue Launch button in the lower right.

In the very last step to launch, you will be prompted for a key pair.

Unless you already have an existing key pair for your account, choose Create a new key pair and type in a name, Foobar, for example.

You will be prompted to save the resulting file (e.g., Foobar.pem) to your computer.

Store it in a secure place, and make sure you do not lose this file, as it is the only way to access your running Formhub instance.

Post Launch Configuration

  1. Use an Elastic IP address for your instance

Unless you are using your Formhub instance only briefly, for very short ad-hoc experiments, it is a good idea to create an elastic IP address and associate it with your running instance.

Back in the EC2 Dashboard, click on Elastic IPs on the left, and tap the blue Allocate New Address button:

Leave EC2 as the selection in the list, and click on the blue Yes, Allocate button which appears.

Once an IP address is made available, click on it to select it (left-click), then, with the address highlighted in light blue, right-click your mouse for a secondary menu, like this (your actual IP address number will be different):

Click on Associate Address at the bottom, and input the name of your running instance at the next screen.

All your running instances will appear in the next input field as autocompletion choices; if you have more than one instance running in your account, make a note of the specific id (i-XXXXXXX) which is running Formhub.

  2. Create an S3 Bucket for your files

This is optional, but strongly recommended, since it allows you to store attachments and other media separate from your running instance.

From the AWS Console, select S3 Scalable Storage in the Cloud:

Then click on the blue Create Bucket button:

Type a name for your bucket, make a note of it, and click on the gray Create button:

  3. Make sure it's working

Copy the elastic IP address number you got in step 1, and type it into a web browser's address bar.

You should see Formhub beaming back at you, in all its glory:

Your instance has no data in it yet, which is why clicking on Forms (i.e., the shared, public forms found on Formhub.org) returns no results, and you cannot log in as anyone yet (because you need to sign up first).
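If you would rather confirm from a script that the instance answers on port 80, the following is a minimal Python 3 sketch using only the standard library. The function name and the fetch parameter are illustrative assumptions, and the address in the usage comment is the example elastic IP used later in this guide; substitute your own.

```python
from urllib.request import urlopen

def formhub_is_up(ip_address, fetch=urlopen, timeout=10):
    """Return True if the instance answers a plain HTTP request with status 200."""
    url = "http://%s/" % ip_address
    try:
        response = fetch(url, timeout=timeout)
        return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error responses.
        return False

# Usage (replace with your own elastic IP address):
# print(formhub_is_up("54.243.44.203"))
```

A False result usually points back at the security group: check that the HTTP rule from Step 6 is in place.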

  4. Configure Outgoing Email

So while everything looks ok, you will find that you cannot actually sign up yet.

If you try it now, this happens:

The reason is that Formhub tries to send a confirmation email to each new user, but AWS is very strict about allowing outgoing email from any of its instances, so you have to register with Amazon's Simple Email Service (SES) first.

Follow the SES instructions and come back to this guide when you are done (don't worry, we'll wait).

Make sure to request full production access to be able to both send and receive email.

While you are logged in to your AWS account, visit the Security Credentials page, and make a note of both your Access Key ID and Secret Access Key, since you will need both for the next (and final) step.

  5. Update the Django Settings file

For this very last step, we have to leave the comfortable world of the AWS Management Console behind, and gasp, type in a bunch of commands from a terminal.

It's not as daunting as it might sound, though, and you will only ever have to do this once per launched instance, so double-down on the chocolate milk (or fortifying drink of your choice) and follow along.

On Mac OSX or Linux, use Terminal.

On Windows, use PuTTY, and convert your pem file to ppk format.

Your ssh connection target is admin@[Your Elastic IP Address] so in Mac OSX or Linux, type this in a new terminal prompt:

ssh -i ~/.ssh/Foobar.pem admin@54.243.44.203

Replace 54.243.44.203 with your instance's actual IP address, and make sure the path to and name of your pem file is correct.
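If ssh rejects the key with an "UNPROTECTED PRIVATE KEY FILE" warning, the pem file is readable by other users on your computer. Below is a minimal sketch, assuming the key is stored at the example path ~/.ssh/Foobar.pem, that restricts the file to owner read-only (the same effect as chmod 400). The function name is illustrative.

```python
import os
import stat

def restrict_key_permissions(pem_path):
    """Make a private key file readable by its owner only (mode 0400)."""
    os.chmod(pem_path, stat.S_IRUSR)
    # Return the resulting permission bits so the caller can confirm them.
    return stat.S_IMODE(os.stat(pem_path).st_mode)

# Usage, with the example key name from this guide:
# restrict_key_permissions(os.path.expanduser("~/.ssh/Foobar.pem"))
```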

Windows/PuTTY users, type admin@54.243.44.203 in the Host Name (or IP Address) input dialog in the Session screen, and use the default port 22 for access. Using Pageant together with PuTTY is also recommended.

Troubleshooting: if you cannot log in, try the suggestions in the AWS Guide Troubleshooting Connecting to Your Instance.

Once you are logged in, the remaining commands are exactly the same.

If you get lost, or mistype something, just exit from the terminal/stop PuTTY and try it again.

You will be greeted by this text:

Linux ip-10-69-144-231 3.2.0-4-amd64 #1 SMP Debian 3.2.57-3+deb7u1 x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
admin@ip-10-69-144-231:~$ 

The series of hyphenated numbers after ip- will vary, but otherwise, your screen should be the same.

Now, let's change that settings file.

Become the root user by typing in this command, exactly like this, with spaces between the hyphen (dash) and the words, and hit Enter:

sudo su - root

Your terminal screen should look like this (and again, the hyphenated numbers after ip- will be different on your screen, but that doesn't matter):

admin@ip-10-69-144-231:~$ sudo su - root
root@ip-10-69-144-231:~# 

Now switch to the fhuser, which is the dedicated account running your instance of Formhub.

Type this command and hit Enter (again, remember the spaces):

su - fhuser

Your terminal screen should now look like this:

admin@ip-10-69-144-231:~$ sudo su - root
root@ip-10-69-144-231:~# su - fhuser
fhuser@ip-10-69-144-231:~$ 

Let's get to where the settings file is kept. Type this command at the fhuser prompt and hit Enter:

cd formhub/formhub/preset

Your terminal session so far should look like this:

admin@ip-10-69-144-231:~$ sudo su - root
root@ip-10-69-144-231:~# su - fhuser
fhuser@ip-10-69-144-231:~$ cd formhub/formhub/preset

Make a backup copy of the file before changing it. Type this command and hit Enter:

cp -ip default_settings.py default_settings.py-bak

Your terminal session should now look like this:

admin@ip-10-69-144-231:~$ sudo su - root
root@ip-10-69-144-231:~# su - fhuser
fhuser@ip-10-69-144-231:~$ cd formhub/formhub/preset
fhuser@ip-10-69-144-231:~/formhub/formhub/preset$ cp -ip default_settings.py default_settings.py-bak

We're finally ready to change the settings file, so that it uses the correct Access Key ID and Secret Access Key for your AWS account.

You should also know the exact email address you used to register with SES in the step before this one.

Type this command to edit the settings file (change nano to vi, etc., if you know your way around a linux command line and prefer a different editor):

nano default_settings.py

The screen will change, filling up with the entirety of the settings file.

The lines at the very top and bottom will appear in reverse-highlight text, but everything else will be in plain text, like this:

  GNU nano 2.2.6                                   File: default_settings.py                                                                            

# This system uses structured settings.py as defined in the
# second from last slide in this presentation:
# http://www.slideshare.net/jacobian/the-best-and-worst-of-django

# The basic idea is that a file like this, which is referenced when
# the django app runs, imports from ../settings.py, and over-rides
# and value there with a value specified here

# This file is checked into source control as an example, but
# your actual production settings, which contain database passwords
# and 3rd party private keys, etc., should perhaps be omitted using
# .gitignore

from formhub.settings import *

# For this example configuration, we are running the server in
# debug mode, but this should be changed to False for a server
# in production (changing the value of DEBUG also requires that
# ALLOWED_HOSTS, below, be defined as well)

DEBUG = True

# Hosts/domain names that are valid for this site
# This is required if DEBUG is False, otherwise the server
# will respond with 500 errors:
# https://docs.djangoproject.com/en/1.5/ref/settings/#allowed-hosts
#ALLOWED_HOSTS = ['.example.com']

# These are necessary for running on Amazon Web Services (AWS)
# because basic formhub/django functions which rely on email,
# such as new account registration, will fail

AWS_ACCESS_KEY_ID     = '' # find these in your AWS console
AWS_SECRET_ACCESS_KEY = ''
EMAIL_BACKEND = 'django_ses.SESBackend'
DEFAULT_FROM_EMAIL = '' # e.g., '[email protected]'
SERVER_EMAIL = DEFAULT_FROM_EMAIL

# Uncomment the following three lines if you are using
# an AWS S3 Bucket as the default file store, and define
# your bucket name in the AWS_STORAGE_BUCKET_NAME variable.
# This it is optional, but strongly recommended.

#DEFAULT_FILE_STORAGE = 'storages.backends.s3boto.S3BotoStorage'
#AWS_STORAGE_BUCKET_NAME = '' # use your S3 Bucket name here
#AWS_DEFAULT_ACL = 'private'


# In this example we are supplementing the django database
# definition found in the ../settings.py file with a password
# (normally we wouldn't check this into source control, but this
#  is here just for illustration, as an example of what's possible)

DATABASES['default']['PASSWORD'] = 'foo'
# an alternative to hard-coding the password string
# is to define the db password as an environment variable:
#DATABASES['default']['PASSWORD'] = os.environ['FORMHUB_DB_PWD']

# Examples of other over-rides you could do here:

DATABASE_ROUTERS = [] # turn off second database

# Make a unique unique key just for testing, and don't share it with anybody.
SECRET_KEY = 'mlfs33^s1l4xf6a36$0#j%dd*sisfoi&)&4s-v=91#^l01v)*j'
                                                                      [ Read 54 lines ]
^G Get Help               ^O WriteOut               ^R Read File              ^Y Prev Page              ^K Cut Text               ^C Cur Pos
^X Exit                   ^J Justify                ^W Where Is               ^V Next Page              ^U UnCut Text             ^T To Spell		

Use your keyboard's down-arrow key to move to line 33, the one which starts with AWS_ACCESS_KEY_ID.

Use the left-arrow key to move just past the first apostrophe, and type (or copy-and-paste) your Access Key ID in between the two apostrophes.

That line should be changed to something like this, where AWS Access Key ID is replaced with the actual key from your account.

AWS_ACCESS_KEY_ID     = 'AWS Access Key ID'

Do the same with line 34, which starts with AWS_SECRET_ACCESS_KEY, and use your Secret Access Key there.

Those two lines in the file should look like:

AWS_ACCESS_KEY_ID     = 'AWS Access Key ID'
AWS_SECRET_ACCESS_KEY = 'AWS Secret Access Key'

Move down to line 36, which begins DEFAULT_FROM_EMAIL, and put the email address you used to register with SES there.

The three changed lines should look like:

AWS_ACCESS_KEY_ID     = 'AWS Access Key ID'
AWS_SECRET_ACCESS_KEY = 'AWS Secret Access Key'
...
DEFAULT_FROM_EMAIL = '[email protected]'
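The settings file itself suggests, in a commented-out line, reading the database password from an environment variable instead of hard-coding it. The same idea can be applied to the AWS keys. The sketch below is one possible way to do that, not part of the stock Formhub configuration: the environment variable names are hypothetical, and it only helps if the process running Formhub actually receives those variables, which this guide does not cover.

```python
import os

def aws_settings_from_env(environ=os.environ):
    """Build the SES-related settings from environment variables.

    The variable names are hypothetical; a missing variable raises
    KeyError, so a misconfiguration is noticed immediately.
    """
    return {
        "AWS_ACCESS_KEY_ID": environ["FORMHUB_AWS_ACCESS_KEY_ID"],
        "AWS_SECRET_ACCESS_KEY": environ["FORMHUB_AWS_SECRET_ACCESS_KEY"],
        "DEFAULT_FROM_EMAIL": environ["FORMHUB_FROM_EMAIL"],
    }

# In default_settings.py the values could then be applied with:
# globals().update(aws_settings_from_env())
```

If you are unsure whether the variables reach the server process, typing the keys directly between the apostrophes, as described above, is the simpler route.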

If you also created an S3 bucket in step 2, uncomment lines 44 through 46 to enable it. Remove the leading # to reveal these three lines, like this:

DEFAULT_FILE_STORAGE = 'storages.backends.s3boto.S3BotoStorage'
AWS_STORAGE_BUCKET_NAME = 'myveryownformhub'
AWS_DEFAULT_ACL = 'private'

Replace myveryownformhub as the definition for AWS_STORAGE_BUCKET_NAME with whatever name you gave your S3 bucket earlier in step 2.

If you are not using S3, formhub will write all user-submitted data to the /home/fhuser/formhub folder. If you would prefer to use a different location, insert a line and define the MEDIA_ROOT variable, for example like this:

MEDIA_ROOT = '/opt/data/formhub/userdata'

You need to make sure that whatever you define here exists on the file system, and that the fhuser account has permission to write to it. Read more about MEDIA_ROOT and how managing stored files works in django.
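A quick way to check both conditions is to run a few lines of Python while logged in as fhuser. This is a minimal sketch using only the standard library; the function name is illustrative and the path is the example value shown above.

```python
import os

def media_root_is_usable(path):
    """Return True if path is an existing directory the current user can write to."""
    # Write permission alone is not enough to create files inside a
    # directory; the execute (search) bit is needed as well.
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)

print(media_root_is_usable("/opt/data/formhub/userdata"))
```

The check only reflects the account that runs it, so run it as fhuser rather than as admin or root.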

Hold down the Ctrl and X keys simultaneously to exit; when nano asks whether to save the modified buffer, press Y, then Enter to confirm the file name.

Phew, good job!

You should be back at the now-familiar prompt, looking at this:

admin@ip-10-69-144-231:~$ sudo su - root
root@ip-10-69-144-231:~# su - fhuser
fhuser@ip-10-69-144-231:~$ cd formhub/formhub/preset
fhuser@ip-10-69-144-231:~/formhub/formhub/preset$ cp -ip default_settings.py default_settings.py-bak
fhuser@ip-10-69-144-231:~/formhub/formhub/preset$ nano default_settings.py
fhuser@ip-10-69-144-231:~/formhub/formhub/preset$
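Before leaving the session, it can be worth confirming that the edited file is still valid Python, since a missing apostrophe would stop Formhub from starting after the reboot. The following is a minimal sketch (the function name is illustrative) that can be pasted into a python prompt started from the preset folder. It checks syntax only, not whether the keys themselves are correct.

```python
def settings_syntax_error(path):
    """Return None if the file compiles as Python, else a short error description."""
    with open(path) as handle:
        source = handle.read()
    try:
        compile(source, path, "exec")
    except SyntaxError as error:
        return "line %s: %s" % (error.lineno, error.msg)
    return None

# Usage, from the formhub/formhub/preset folder:
# print(settings_syntax_error("default_settings.py"))
```

If an error is reported, restore the backup with cp default_settings.py-bak default_settings.py and repeat the edit.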

Now type exit and press Enter, three times in a row, to close the whole session.

Each time you exit, the terminal will echo logout, to show that you have ended each session.

Here's what the complete terminal session will look like, start to finish:

admin@ip-10-69-144-231:~$ sudo su - root
root@ip-10-69-144-231:~# su - fhuser
fhuser@ip-10-69-144-231:~$ cd formhub/formhub/preset
fhuser@ip-10-69-144-231:~/formhub/formhub/preset$ cp -ip default_settings.py default_settings.py-bak
fhuser@ip-10-69-144-231:~/formhub/formhub/preset$ nano default_settings.py
fhuser@ip-10-69-144-231:~/formhub/formhub/preset$ exit
logout
root@ip-10-69-144-231:~# exit
logout
admin@ip-10-69-144-231:~$ exit
logout
Connection to 54.243.44.203 closed.

Finally, go back to the comforting warmth of the EC2 Dashboard and reboot your Formhub instance.

Click Instances on the left, find your running formhub instance in the list on the main panel, select it with your mouse (left-click), then, once you have the correct instance highlighted in light blue, right-click for the secondary menu and click Reboot:

  6. Changing the django Site object (optional)

By default, django uses example.com as the DNS name of your server.

Among other things, what this means is that the various automated emails formhub sends (e.g., after account registration, password reset, etc.) will point to an address on example.com instead of your site.

To change this, log in to your instance using ssh, and type the following terminal commands to get to a django shell:

admin@ip-10-69-144-231:~$ sudo su - root
root@ip-10-69-144-231:~# su - fhuser
fhuser@ip-10-69-144-231:~$ cd formhub
fhuser@ip-10-69-144-231:~/formhub$ python manage.py shell 

If successful, you should be greeted by this message, and the django shell prompt:

Your environment is:"formhub.preset.default_settings"
Python 2.7.3 (default, Mar 13 2014, 11:03:55)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>>

At the shell prompt, type these commands, and replace myrealdomain.com with the domain name or elastic ip address you are using:

>>> from django.contrib.sites.models import Site
>>> current_site = Site.objects.get_current()
>>> current_site.domain = 'myrealdomain.com'
>>> current_site.name = 'myrealdomain.com'
>>> current_site.save()

Hit control-d or type exit() to exit the django shell. You can now exit your ssh session by typing exit three times:

fhuser@ip-10-69-144-231:~/formhub$ exit
logout
root@ip-10-69-144-231:~# exit
logout
admin@ip-10-69-144-231:~$ exit
logout
Connection to 54.243.44.203 closed.

  7. Create an admin user (optional)

If you wish to use django's automatic admin interface for formhub, you will need to create an administration user account, or superuser.

Log in to your instance using ssh, and type the following terminal commands to get to the createsuperuser command:

admin@ip-10-69-144-231:~$ sudo su - root
root@ip-10-69-144-231:~# su - fhuser
fhuser@ip-10-69-144-231:~$ cd formhub
fhuser@ip-10-69-144-231:~/formhub$ python manage.py createsuperuser 

You will be greeted with the following interactive prompt. Just provide a username, email, and password when asked:

Username: ImAsuperUser
Email address: [email protected]
Password:
Password (again):
Superuser created successfully.

With the superuser account created successfully, you should be able to log in to the django admin interface at http://myrealdomain.com/admin (more about the django admin interface).

Signing up for the First Time

After the instance reboots, go back to your browser, and use the elastic IP address for your instance to access Formhub.

Because you have updated the settings, you should now be able to sign up as a new user.

After you complete the signup form, you will get an email that looks like this (unless you changed the django Site object in step 6, above, in which case you can skip this):

Django defaults to example.com as its server domain (while you can run Formhub as your own domain, that configuration is out of scope for this guide).

So clicking the activation link as-is will do nothing.

Instead, copy the link into a browser window, and replace example.com with your elastic IP address (again, your actual IP address will be different):
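If you need to do this for several accounts, the substitution can also be scripted. This is a minimal Python 3 sketch using the standard urllib.parse module; the function name is illustrative, and the link path is a made-up placeholder rather than a real activation key.

```python
from urllib.parse import urlsplit, urlunsplit

def point_link_at_instance(link, host):
    """Return link with its host replaced by host, keeping the path and query."""
    parts = urlsplit(link)
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))

# The path below is a placeholder, not a real activation key.
print(point_link_at_instance(
    "http://example.com/accounts/activate/0123abcd/", "54.243.44.203"))
# http://54.243.44.203/accounts/activate/0123abcd/
```

Opening the rewritten link in a browser completes the activation.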

Which should result in this acknowledgement screen:

Now you can log in and do all the regular Formhub-y things you want.

Enjoy!
