Deploy instructions - Heroico/MetaXcanWebApp GitHub Wiki

Amazon AWS Setup

The following instructions set up an ordinary Ubuntu Linux AWS instance to serve the MetaXcan Web App. The underlying PostgreSQL database should be installed within the same instance.

Lines prefixed with $ denote an ordinary bash shell, (.venv)$ denotes a shell with the virtual env activated, and > denotes a running Python session.

General software tools

$ sudo apt-get install git
$ git config --global user.email "you@example.com"
$ git config --global user.name "username"

$ sudo apt-get install python-pip python-virtualenv libapache2-mod-wsgi pep8 supervisor

AWS's Ubuntu image installs the Node binary as nodejs. The following exposes it under the more traditional name node.

$ sudo ln -s /usr/bin/nodejs /usr/bin/node

Some development libraries are required for MetaXcan to run:

# Careful! This is not a full list; you might have to install additional libraries at later steps.
$ sudo apt-get install liblapack-dev libblas-dev libatlas-base-dev gfortran

RabbitMQ is used as the message queue for Celery, which the Web App depends on.

$ sudo apt-get install rabbitmq-server
$ sudo rabbitmqctl add_user my_user my_password
$ sudo rabbitmqctl add_vhost my_vhost
$ sudo rabbitmqctl set_permissions -p my_vhost my_user ".*" ".*" ".*"
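Celery then has to be told how to reach this broker. A minimal sketch, assuming the project reads a Celery 3.x-style BROKER_URL from the Django settings (newer Celery versions name this setting differently, so check the version pinned in requisites.txt). The credentials are the placeholder values from the commands above; the host, the port and the helper name broker_url are assumptions for a broker running on the same instance:

```python
# Placeholder credentials: they must match the values given to rabbitmqctl.
RABBITMQ_USER = "my_user"
RABBITMQ_PASSWORD = "my_password"
RABBITMQ_VHOST = "my_vhost"
RABBITMQ_HOST = "localhost"  # assumption: RabbitMQ runs on this instance
RABBITMQ_PORT = 5672         # RabbitMQ's default AMQP port


def broker_url(user, password, host, port, vhost):
    """Build an AMQP URL of the form amqp://user:password@host:port/vhost."""
    # Note: a password with special characters would need percent-encoding.
    return "amqp://{0}:{1}@{2}:{3}/{4}".format(user, password, host, port, vhost)


# Assumption: the settings module exposes the URL under this name.
BROKER_URL = broker_url(
    RABBITMQ_USER, RABBITMQ_PASSWORD, RABBITMQ_HOST, RABBITMQ_PORT, RABBITMQ_VHOST
)
print(BROKER_URL)
```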

Apache set up

$ sudo apt-get install apache2

#check that apache is running
$ sudo service apache2 status

PostgreSQL Set up

After installing PostgreSQL, create a user and a database for the Django application. Both then have to be configured in the Django app settings.
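The user and database can be created with two SQL statements. The sketch below only prints them; the database name, user name and password are hypothetical placeholders, and the identifiers are assumed to be simple lowercase names that need no quoting:

```python
# Hypothetical names: choose your own and reuse them in the Django settings.
DB_NAME = "metaxcan"
DB_USER = "metaxcan_user"
DB_PASSWORD = "change_me"


def bootstrap_sql(db_name, user, password):
    """Return the SQL statements that create a login role and a database it owns."""
    # Double any single quote so the password is a valid SQL string literal.
    quoted_password = password.replace("'", "''")
    return [
        "CREATE USER {0} WITH PASSWORD '{1}';".format(user, quoted_password),
        "CREATE DATABASE {0} OWNER {1};".format(db_name, user),
    ]


for statement in bootstrap_sql(DB_NAME, DB_USER, DB_PASSWORD):
    print(statement)
```

The printed statements can be pasted into a psql session opened as the postgres superuser, for example one started with sudo -u postgres psql.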

Checkout the MetaXcan analysis engine

$ git clone https://github.com/hakyimlab/MetaXcan.git
# Assuming this is run from /home/ubuntu, the MetaXcan software will be at /home/ubuntu/MetaXcan

Checkout the Web App code, and set it up

The following downloads the Web App code and creates a virtual env. At the time of writing, Python 2.7 is used in the deploy environment, although development was done with Python 3.4.3.

$ cd /var/www/
$ sudo git clone git@github.com:Heroico/MetaXcanWebApp.git
$ sudo chown -R ubuntu:ubuntu MetaXcanWebApp/
$ cd MetaXcanWebApp/
$ virtualenv .venv
$ . .venv/bin/activate
# The prompt should now look like this:
(.venv)$

Download required python packages into the virtual env:

(.venv)$ pip install -r requisites.txt
(.venv)$ pip install -e git+https://github.com/alanjds/drf-nested-routers@master#egg=drf-nested-routers-master

The previous steps might fail to build some packages because development libraries are missing from the system. In that case, install the missing packages on the Ubuntu VM and run the pip command again. For example:

(.venv)$ deactivate
$ sudo apt-get install libreadline-dev
$ source .venv/bin/activate
(.venv)$ pip install -r requisites.txt

Have npm download the JavaScript dependencies:

(.venv)$ cd metaxcan_client
(.venv)$ npm install
(.venv)$ cd ..

Django Settings and specific deploy set up

You need to create a folder where uploaded files will be stored, and this folder needs to be accessible to Apache.

$ mkdir uploaded_files
$ sudo chown -R www-data:www-data uploaded_files

The Django settings file needs some modifications to suit a deploy environment. I recommend versioning deploy changes in a local branch.

SECRET_KEY = 'new key'
[...]
DEBUG = False
[...]
COMPRESS_ENABLED = True
[...]
ALLOWED_HOSTS = ["12.34.567.89","localhost"]
[...]
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2', # Add 'postgresql_psycopg2', 'mysql', 'sqlite3' or 'oracle'.
        'NAME': 'metaxcan',                      # Or path to database file if using sqlite3.
        # The following settings are not used with sqlite3:
        'USER': 'postgres user', #or whatever you setup
        'PASSWORD': 'postgres password', # or whatever
        'HOST': 'localhost',                      # Empty for localhost through domain sockets or '127.0.0.1' for localhost through TCP.
        'PORT': '',                      # Set to empty string for default.
    }
}
[...]
STATICFILES_FINDERS = (
  'django.contrib.staticfiles.finders.FileSystemFinder',
  'django.contrib.staticfiles.finders.AppDirectoriesFinder',
  #
  'compressor.finders.CompressorFinder',
)
[...]
MEDIA_ROOT = "/var/www/MetaXcanWebApp/uploaded_files" #or whatever you want
[...]
METAXCAN_SOFTWARE = "/home/ubuntu/MetaXcan/working/software"
METAXCAN_PYTHON = "/var/www/MetaXcanWebApp/.venv/bin/python"
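
The 'new key' placeholder for SECRET_KEY should be replaced with a long random value that is kept private. A minimal sketch using only the standard library; the function name, alphabet and length are illustrative choices, not part of the Web App:

```python
import random
import string


def generate_secret_key(length=50):
    """Return a random key drawn from the operating system's entropy source."""
    alphabet = string.ascii_letters + string.digits + "!@#$%^&*(-_=+)"
    rng = random.SystemRandom()  # backed by os.urandom, suitable for secrets
    return "".join(rng.choice(alphabet) for _ in range(length))


# Print a fresh key to paste into the settings file.
print(generate_secret_key())
```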

Last Steps

(.venv)$ python manage.py collectstatic
(.venv)$ python manage.py migrate

You also need to set up Transcriptome Models and covariance files. This assumes you have uploaded the appropriate files to the VM, and that they are available to the Apache user:

# Example
$ sudo mkdir /var/www/metaxcan_data
$ sudo mkdir /var/www/metaxcan_data/covariances
$ sudo mkdir /var/www/metaxcan_data/transcriptomes

# Copy your files into the folders above

$ sudo chown -R www-data:www-data /var/www/metaxcan_data/
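
Before registering the files, it can help to confirm that they are actually readable. A small sketch, assuming hypothetical file names under the example folders; it checks access for whichever user runs it, so it is most informative when run as the Apache user:

```python
import os


def unreadable_paths(paths):
    """Return the paths that do not exist or cannot be read by the current user."""
    return [p for p in paths if not (os.path.exists(p) and os.access(p, os.R_OK))]


# Hypothetical file names: substitute the files you actually copied.
DATA_FILES = [
    "/var/www/metaxcan_data/covariances/COV_TGF_EUR",
    "/var/www/metaxcan_data/transcriptomes/TW_WholeBlood.db",
]

for path in unreadable_paths(DATA_FILES):
    print("missing or unreadable: " + path)
```

If the script is saved under a name of your choosing, running it with sudo -u www-data makes the check reflect what Apache will see.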

Then, register the files in the PostgreSQL database through Django:

(.venv)$ python manage.py shell

> from metaxcan_api.models import Covariance, TranscriptomeModelDB
> c = Covariance(name="Whole Blood, European Reference",path="/path/to/COV_TGF_EUR")
> c.save()
> t = TranscriptomeModelDB(name="Whole Blood, European Reference",path="/path/to/TW_WholeBlood.db")
> t.save()

Set up celery to run with supervisord

$ sudo mkdir /var/log/celery
$ sudo chown www-data:www-data /var/log/celery/
$ sudo chmod g+w /var/log/celery/
$ sudo touch /var/log/celery/worker.log
$ sudo chown www-data /var/log/celery/worker.log

Edit the Celery configuration file in the Web App:

$ vi celery.confd 

The Celery config file should look like this:

; ==================================
;  celery worker supervisor example
; ==================================

[program:celery]
; Set full path to celery program if using virtualenv
command=/var/www/MetaXcanWebApp/.venv/bin/celery worker -A mwebproject --loglevel=INFO

directory=/var/www/MetaXcanWebApp
user=www-data
numprocs=1
stdout_logfile=/var/log/celery/worker.log
stderr_logfile=/var/log/celery/worker.log
autostart=true
autorestart=true
startsecs=10

; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600

; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true

; Set Celery priority higher than default (999)
; so, if rabbitmq is supervised, it will start first.
priority=1000
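
Copy the file into supervisor's configuration directory and have supervisor load it:

$ sudo cp celery.confd /etc/supervisor/conf.d/celery.conf
$ sudo supervisorctl reread
$ sudo supervisorctl update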

The Celery process is stopped and started with:

$ sudo supervisorctl stop celery
$ sudo supervisorctl start celery

Almost Done!

Now configure Apache to run the Web App:

$ sudo vim /etc/apache2/sites-available/000-default.conf

The configuration must have the following settings beyond the defaults:

    WSGIPassAuthorization On
[...]
    WSGIDaemonProcess metaxcan python-path=/var/www/MetaXcanWebApp:/var/www/MetaXcanWebApp/.venv/lib/python2.7/site-packages/
    WSGIProcessGroup metaxcan
    WSGIScriptAlias / /var/www/MetaXcanWebApp/mwebproject/wsgi.py process-group=metaxcan
[...]
    DocumentRoot /var/www/MetaXcanWebApp/
    Alias /static/ /var/www/MetaXcanWebApp/static_files/

Have apache reload itself so that it picks up changes:

$ sudo service apache2 reload