Deployment on Heroku - StanfordBioinformatics/pulsar_lims GitHub Wiki

This article covers deploying a production instance on Heroku. If you are only assessing Pulsar or want to familiarize yourself with it, you may prefer to test locally for free. The Heroku CLI can run the app locally via the heroku local command, which is the approach I recommend. See here for details.

Creating your own Pulsar instance on Heroku

Make sure to have the Heroku CLI installed on your local computer.

Next, clone the current repository to your local workstation:

git clone https://github.com/nathankw/pulsar_lims.git

You need a local checkout when you create your Heroku app, as described below.

A note regarding the web server that Pulsar is configured to use: This repo contains a Procfile in the top-level folder of the repo. Heroku looks for a Procfile in the app's root directory to determine what type of web server to use and how to start it up. If no Procfile is present, then the default web server used will be WEBrick, and you don't want that - it's a single-threaded process and can thus only handle one request at a time. Pulsar LIMS is configured to use the Puma web server by default, as it contains this in its Procfile:

web: bundle exec puma -t 5:5 -p ${PORT:-3000} -e ${RACK_ENV:-development}
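The ${PORT:-3000} and ${RACK_ENV:-development} forms in that Procfile line are ordinary shell default expansions: the value after ":-" is used only when the variable is unset or empty. The following is a minimal sketch of how they resolve; the helper name show_puma_args and the values 5000 and production are illustrative only.

```shell
# Show how the Procfile's ${VAR:-default} expansions resolve.
# show_puma_args is a hypothetical helper, not part of Pulsar or Heroku.
show_puma_args() {
  echo "-p ${PORT:-3000} -e ${RACK_ENV:-development}"
}

unset PORT RACK_ENV
show_puma_args          # nothing set, so the defaults are used

PORT=5000 RACK_ENV=production
show_puma_args          # values from the environment win over the defaults
```

On a Heroku dyno, PORT is supplied by the platform, so the 3000 fallback only matters when you run the app locally.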

Now, in your terminal, cd into the pulsar_lims directory that contains the code you checked out above. Then create an app space on Heroku using an app name of your liking (let's simply call it "lims" for the sake of this tutorial):

>heroku create lims  
Creating ⬢ lims... done

Creating the app doesn't yet put your codebase on the Heroku server. You simply have an app on Heroku called "lims" with an empty git repo, which you'll push your codebase to later on. This empty git repo is added to your local repo as a remote named heroku.

The web application runs in a dyno, which Heroku defines as a lightweight Linux container. The dyno runs on Ubuntu 16.04 (see Heroku stacks for details).

The default dyno type in which your web application runs is the free type. This is too limiting (in terms of physical resources), so go ahead and upgrade it. Below, we upgrade the dyno type to the next level up - the "hobby" type:

heroku dyno:type web=hobby

Above, "web" specifies the configuration of the dyno (there are web dynos that receive HTTP traffic, worker dynos, and one-off dynos; see Dyno Configurations). We only have one dyno so far, and that is a web dyno (specified in the Procfile), but we still need to be explicit about which dyno configuration we are scaling up.

Creating the Postgres database

New Rails apps on Heroku by default get a free Postgres add-on on the first code push. There are many different Postgres pricing plans, and the free one is pretty limited. Therefore, you'll want to scale it up. This should be done before pushing any code to Heroku - with Rails apps on Heroku, all add-ons must be specified before the first release is created.

>heroku addons:create heroku-postgresql:standard-0  
Creating heroku-postgresql:standard-0 on ⬢ pulsar-encode... $50/month  
The database should be available in 3-5 minutes.  
! CAUTION: The database will be empty. If upgrading, you can transfer  
!          data from another database with pg:copy.  
Use `heroku pg:wait` to track status.  
postgresql-concave-14404 is being created in the background. The app will restart when complete...  
Use heroku addons:info postgresql-concave-14404 to check creation progress  
Use heroku addons:docs heroku-postgresql to view documentation  

The database is empty right now, and that's normal.

Getting information about your Postgres database

Details on your Postgres database can be obtained with the heroku pg:info command, e.g.:

>heroku pg:info
=== DATABASE_URL
Plan:               Standard 0
Status:             Available
Data Size:          7.2 MB
Tables:             0
PG Version:         9.6.3
Connections:        2/120
Fork/Follow:        Available
Rollback:           earliest from 2017-06-05 00:37 UTC
Created:            2017-06-05 00:35 UTC
Region:             us
Maintenance:        not required
Maintenance window: Mondays 18:00 to 22:00 UTC
Add-on:             postgresql-concave-14404 

PostgreSQL interactive terminal

You can use the standard PostgreSQL interactive terminal, psql, with your database by using the following command:

heroku pg:psql

You must have PostgreSQL locally installed for this to work.

PostgreSQL backups

You can create one-off backups or scheduled backups; see examples of each below.

Manual backup:
heroku pg:backups:capture

starts an immediate backup.

Scheduling a backup:
heroku pg:backups:schedule -a pulsar-encode --at "12:00 America/Los_Angeles"

creates daily backups at noon.

There is a plan-based limit on the number of manual backups that an app can have at any one time, and likewise for scheduled backups. For example, the Standard plans limit manual backups to 25. For scheduled backups, the last seven days' worth are retained (regardless of plan), and there are also weekly backups whose number differs by plan (the last 4 for Standard plans).
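The --at value for a scheduled backup combines a time with a tz database zone name, and a misspelled zone name is easy to overlook. Below is a minimal sketch of a pre-flight check. The helpers tz_exists and schedule_backup are hypothetical (they are not part of the Heroku CLI), and the default directory /usr/share/zoneinfo is an assumption about your system.

```shell
# Hypothetical helper: succeeds only if the zone name exists as a file
# in the tz database directory (second argument, defaulting to
# /usr/share/zoneinfo, which is an assumption about the local system).
tz_exists() {
  [ -n "$1" ] && [ -f "${2:-/usr/share/zoneinfo}/$1" ]
}

# Hypothetical wrapper: schedule a daily noon backup only if the zone
# name passes the check above. Arguments: app name, zone name.
schedule_backup() {
  if tz_exists "$2"; then
    heroku pg:backups:schedule -a "$1" --at "12:00 $2"
  else
    echo "Unknown time zone: $2" >&2
    return 1
  fi
}

# Usage:
#   schedule_backup pulsar-encode America/Los_Angeles
```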

Adding the Pulsar source code

Run the following command to push the local master branch of your Pulsar checkout to the heroku remote that was just created for you when you created the Heroku app:

>git push heroku master  

Setting up required environment variables

Now you need to define some environment variables that Pulsar looks for. You set environment variables with the "heroku config:set" command, which sets them in your application running on Heroku. If you are testing your application by running it locally, see Run app locally with heroku, which includes details on how to set up your environment variables.

S3 bucket environment variables

You need these so that your Heroku app knows how to access your S3 bucket. If you haven't yet set up an S3 bucket for Pulsar to store static assets uploaded by users, now is the time to do so. Also, be sure to set up CORS on your bucket.

>heroku config:set S3_BUCKET=bucket-name AWS_ACCESS_KEY_ID=access-key AWS_SECRET_ACCESS_KEY=secret-key  

Other environment variables to add

  • ADDGENE_WEBSITE - Used in Biosample views, it's the URL to the addgene plasmid repository (https://www.addgene.org).
  • APP_EMAIL_ADDR - The "from" email address to use when the application sends out emails. This is normally an email address that users don't respond to. You can set this to the following format: appName@MailgunDomainName. Thus, for Pulsar ENCODE, the value is set as [email protected].
  • APP_NAME - Set this to the user friendly name of your app. Currently this is referenced in app/mailers/user_mailer.rb so that when new users get an email with a confirmation link, the user friendly app name is included in the message.
  • APP_HOST_NAME - Stores the value of the host domain name, i.e. pulsar-encode.herokuapp.com. This is used in the mailer configuration.
  • SUPPORT_EMAIL_ADDR - Set this to a listserv or the email address of whoever is to receive support questions related to the application. This is also currently referenced in app/mailers/user_mailer.rb.
  • SECRET_KEY_BASE - This is an encryption key used by Rails to encrypt session cookies. You should run rake secret to generate an encryption key and set that as the value. See wiki section on encrypted cookies for more details.
  • WIKI - A link to your site's documentation.
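With this many variables it is easy to forget one. The sketch below lists the names from this section that are absent from a KEY=value listing such as the one printed by heroku config --shell. The helper missing_vars and the REQUIRED_VARS list are illustrative assumptions, not part of Pulsar; adjust the list to match your instance.

```shell
# Names taken from the S3 section and the list above.
REQUIRED_VARS="S3_BUCKET AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY ADDGENE_WEBSITE APP_EMAIL_ADDR APP_NAME APP_HOST_NAME SUPPORT_EMAIL_ADDR SECRET_KEY_BASE WIKI"

# Hypothetical helper: read KEY=value lines on stdin and print every
# required name that does not appear as a key.
missing_vars() {
  config="$(cat)"
  for name in $REQUIRED_VARS; do
    printf '%s\n' "$config" | grep -q "^${name}=" || echo "$name"
  done
}

# Usage (the app name lims is the one used in this tutorial):
#   heroku config --shell -a lims | missing_vars
```

An empty result means every name in the list is set; the check says nothing about whether the values are correct.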

Inspecting environment variables

You can inspect the environment variables that are set in production with the "heroku config" command. Included in that output will be some that you didn't explicitly set, one of which is DATABASE_URL. This was set automatically when you enabled the Postgres addon, and it instructs your app where to find your database.

Loading the schema and seed data

Now that you have a Postgres database, you need to load the schema and any seed data you may have.

heroku run rake db:setup

The above command creates the schema for your database, using as input the schema file found in db/schema.rb. Then, the command loads your database with the seed data found in db/seeds.rb.

Opening your web app

heroku open

Email support

Pulsar uses email verification when new users sign up. Pulsar expects the Mailgun add-on to be used. Using a different one is a matter of changing some configuration variables. Follow along here to get set up with email support.

Using the Rails console or runner with your production database

By default, the command rails c starts the Rails console in your development environment. Likewise, a script run with rails runner connects to your development environment. Most often, that means connecting to the associated database on your local machine. This could be SQLite (the default database when starting a new app), or perhaps MySQL or PostgreSQL if you specified either of these in your database.yml file. Pulsar uses PostgreSQL in all environments, and once you deploy an instance of Pulsar you'll often need to use rails c or rails runner while connected to your production (deployed) database. Perhaps the easiest way to achieve that is to set the DATABASE_URL environment variable as follows:

  1. At the command-line, run heroku config and copy the value of the DATABASE_URL environment variable (this should already be set as a production config variable as described earlier).
  2. Run export DATABASE_URL=$paste-contents-here, where "$paste-contents-here" represents the value of what you copied above.
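The two copy-and-paste steps above can also be collapsed into one with heroku config:get, which prints the value of a single config variable. This is a minimal sketch; the function name use_production_db is hypothetical, and it must be defined in (or sourced into) the shell session where you will run rails c.

```shell
# Hypothetical helper: export DATABASE_URL from the given Heroku app's
# config into the current shell session. Argument: the app name.
use_production_db() {
  url="$(heroku config:get DATABASE_URL -a "$1")" || return 1
  if [ -z "$url" ]; then
    echo "DATABASE_URL is not set for $1" >&2
    return 1
  fi
  export DATABASE_URL="$url"
}

# Usage (lims is the app name used in this tutorial):
#   use_production_db lims && rails c
#   unset DATABASE_URL   # afterwards, to point back at your local database
```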

Now that you have an environment variable in your terminal session, rails c and rails runner will connect to your production database. For example, when you type rails c, you will see the typical output:

rails c
Loading development environment (Rails 4.2.8)

however, you will be connected to your production database, despite the output still mentioning the development environment. You can verify this with some queries. See https://devcenter.heroku.com/articles/heroku-postgresql#connecting-in-rails and http://guides.rubyonrails.org/configuring.html#configuring-a-database section 3.1.4 Configuring a Database for reference.

It is also worth mentioning that you can use heroku run bash to start a Bash terminal session that runs in the context of your deployed environment. This works by starting up a one-off dyno. A one-off dyno has its own ephemeral file system with your deployed code, so changes made there are never persisted. From here, you could easily run rails c and be connected to your production database. However, it's not always as convenient. For one, you'll be amassing charges for the extra dyno resources that you consume. Secondly, you can't copy files to a one-off dyno, such as a one-off script or an input file to such a script, since these dynos are not allowed to receive any HTTP traffic. For example, if you write a one-off script that makes use of some summary query, it's easier to connect to your production database using the first method and then run your script locally. Otherwise, you'll probably have to commit the one-off script to the version-controlled repository and push it to Heroku with git push heroku master so that you can access it in your one-off dyno.

Creating a staging app on Heroku

A production app should always be paired with a staging app in the same environment for experimenting with new features and testing bug fixes, and on Heroku this is simple to set up. You essentially just create a new app and rerun a few of the commands noted earlier. Below are the quick steps. Here, I give my staging app the same name as my production app, but prefixed with "staging".

  1. heroku create staging-pulsar-encode --remote staging
  2. heroku addons:create heroku-postgresql:standard-0 -a staging-pulsar-encode
  3. heroku addons:create foundelasticsearch:dachs-standard -a staging-pulsar-encode
  4. heroku pg:copy pulsar-encode::DATABASE DATABASE -a staging-pulsar-encode

Note 1: In step 1, a new app on Heroku is created and the git remote is named staging. By default, the remote is named "heroku". This matters when you push your code to your app. In this case, if I have new code to send to my staging app, the command would typically look like git push staging master, or use some local branch name instead of master. Furthermore, a handy command to change your remote name is git remote rename. For example, I used this to change my production app's git remote name from heroku to production with the command git remote rename heroku production.

Note 2: You don't need another mailgun addon that is specific to this staging app. You can reuse the one you already have, just be sure to copy all the MAILGUN environment variables over from your production app. Furthermore, you should copy over the APP_EMAIL_ADDR and change the value to reflect your staging app name (i.e. my production value is set to [email protected]; thus, I set my staging value to [email protected]).

Note 3: Step 4 demonstrates how you can copy your production database to your staging database to do a data dump. When you run this command on your own apps, the only things to change should be your production app name and your staging app name.

You should also set up remaining environment variables that were previously discussed. Basically, all the environment variables in production (output from heroku config) should be present in the staging app. Just take care to customize the values for certain variables that should be different between the two, such as APP_EMAIL_ADDR, APP_NAME, and APP_HOST_NAME. You should also set a new secret key as the value of the SECRET_KEY_BASE variable.
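Copying the shared variables by hand is tedious, so here is a minimal sketch that copies a chosen set of config variables from one app to another using heroku config:get and heroku config:set. The function name copy_config_vars is hypothetical, and which variables are safe to share between production and staging is an assumption you should review for your own setup.

```shell
# Hypothetical helper: copy selected config vars between Heroku apps.
# Usage: copy_config_vars SOURCE_APP TARGET_APP VAR [VAR...]
copy_config_vars() {
  src="$1"
  dst="$2"
  shift 2
  for name in "$@"; do
    # Read the value from the source app, then set it on the target app.
    value="$(heroku config:get "$name" -a "$src")" || return 1
    heroku config:set "$name=$value" -a "$dst" || return 1
  done
}

# Usage, with the app names from this article. Set APP_EMAIL_ADDR,
# APP_NAME, APP_HOST_NAME, and SECRET_KEY_BASE separately, since their
# values should differ in staging.
#   copy_config_vars pulsar-encode staging-pulsar-encode \
#     ADDGENE_WEBSITE SUPPORT_EMAIL_ADDR WIKI
```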

Restoring your production database to your local development database

If you have a local deployment running on your machine, sometimes it's useful to have your local database in sync, at least momentarily, with your production database. Here are the high level steps that can be used to achieve this:

  1. Create a database dump of production
  2. Remove your local development database if it already exists by the name you want to restore into.
  3. Create an empty database using the name that you want to restore into.
  4. Restore your production database into your local database.

And here are the actual commands:

  1. Stop your local rails app and close all connections to your database.

  2. heroku pg:backups:download. This will download the latest database dump to a file named latest.dump. If you need a fresh dump first, create one with the command heroku pg:backups:capture.

  3. Look up the name of your development database in the config/database.yml file in your Rails app; let's refer to this name with the variable $dev_db.

  4. Run the command-line script dropdb with the name of the development database as the only positional argument: dropdb $dev_db. When prompted for a password, use the admin password you set up when you installed PostgreSQL.

  5. Run the command-line script createdb as follows: createdb $dev_db. When prompted for a password, use the admin password you set up when you installed PostgreSQL.

  6. Run the command-line script pg_restore as: pg_restore --verbose --clean --no-acl --no-owner -h localhost -U pulsar -d pulsar_dev ~/latest.dump. Here pulsar is the database user and pulsar_dev is the value of $dev_db; substitute your own values, and adjust the path if latest.dump was downloaded somewhere other than your home directory. When prompted for a password, use the password specified in your app's config/database.yml file. See Heroku documentation for more details.

  7. Start up your local deployment and you should be good to go!
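Steps 2 through 6 can be strung together as in the sketch below. It is a dry run by default: the commands are only printed so that you can review them, and nothing is executed until you set DRY_RUN=0. The helper names run and restore_local_db are hypothetical, and the database name, user, and dump path are placeholders to adapt to your config/database.yml.

```shell
# Hypothetical helper: print the command when DRY_RUN is 1 (the default),
# otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper covering steps 2-6.
# Arguments: dev database name, database user, dump file (default latest.dump).
restore_local_db() {
  dev_db="$1"
  db_user="$2"
  dump="${3:-latest.dump}"
  run heroku pg:backups:download --output "$dump" &&
  run dropdb --if-exists "$dev_db" &&
  run createdb "$dev_db" &&
  run pg_restore --verbose --clean --no-acl --no-owner \
      -h localhost -U "$db_user" -d "$dev_db" "$dump"
}

# Usage:
#   restore_local_db pulsar_dev pulsar                        # print only
#   DRY_RUN=0; restore_local_db pulsar_dev pulsar             # execute
```

Because dropdb is destructive, printing the commands first is a deliberate choice here; stop your local Rails app (step 1) before running it for real.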

Working with the staging app

There are two deployments of Pulsar LIMS: one for production and another for staging. The staging deployment is used for testing new changes. To start working in staging, create a separate clone of the pulsar_lims git repository, so that you have one clone for production and another for staging. Clone the repository:

git clone https://github.com/nathankw/pulsar_lims.git

Then add the Heroku git remote for the staging app:

heroku git:remote -a staging-pulsar-encode

Finally, rename the remote from the default of "heroku" to something more meaningful, such as staging:

git remote rename heroku staging

There is more documentation on this on Heroku.

Creating a database dump from production to staging

This is necessary to synchronize the databases from time to time.

Follow these steps:

  1. cd into the staging clone of the repository that you set up as described above.
  2. Copy the production database over to the staging database, replacing the one in staging, by running the command heroku pg:copy pulsar-encode::DATABASE DATABASE -a staging-pulsar-encode
  3. Next, you need to recreate all the Elasticsearch indices in the staging app. To do so, first start a one-off Heroku dyno with heroku run bash. Once you get into the dyno, go into the app's lib directory. There you'll see a file called rails_console_utils.rb. Execute this by typing ruby rails_console_utils.rb.