System Configuration - genomizer/genomizer-documentation GitHub Wiki

A brief introduction to vagrant

Notice: this describes the old vagrant-based setup. We have now moved to docker.

Basic usage

Vagrant is configured by a Vagrantfile, which defines how the virtual machine is constructed. Do not modify it unless you understand what you are doing.

To start a vagrant instance, run vagrant up. Vagrant will then do one of the following:

  1. Create a new virtual machine with correct configuration
  2. Start an already built virtual machine

If the instance is stale or needs to be rebuilt, run

vagrant destroy
vagrant up && vagrant ssh -c 'bash startup.sh'

Note that vagrant up only starts the virtual machine. Running vagrant ssh -c 'bash startup.sh' afterwards starts the genomizer server in the virtual machine, inside a screen session labeled server.
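The rebuild sequence above can be wrapped in a small helper. This is a minimal sketch and not part of the repository: the function names run and rebuild_environment and the DRY_RUN variable are assumptions introduced for illustration. With DRY_RUN=1 the commands are only printed, which lets you check what would happen before a machine is destroyed.

```shell
# Hypothetical helper, not part of the vagrant repository.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Destroy the machine, build it again and start the genomizer server.
# Stops at the first step that fails.
rebuild_environment() {
    run vagrant destroy &&
    run vagrant up &&
    run vagrant ssh -c 'bash startup.sh'
}
```

Call rebuild_environment from inside an environment directory such as development-1. The helper only defines functions, so sourcing it has no side effects.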

Modifying the configuration

To make vagrant set up a specific branch, modify the corresponding script, such as scripts/provision/install_genomizer_server.sh, to check out yourBranch instead of the default branch.

If your feature requires changes to settings.cfg, edit the copy at scripts/provision/config/settings.cfg. It is installed along with a fresh instance of the virtual machine.

Entering the vagrant virtual machine

If you need to reach the environment using SSH, simply go to the environment you require with

cd development-1
vagrant ssh

and you will be logged in using ssh to the virtual machine.

Systems overview of production

The production server currently runs on 130.239.192.110. It is constructed from several virtual machines, configured by vagrant[1].

Illustration of system setup

The system is entirely scripted, that is, everything can be set up using the scripts located in the vagrant git[2]. The scripts located under scripts/provision deal with the virtual environments, while the scripts under scripts/environment deal with the server, such as setting up the firewall and external directories used by the virtual machines.

Using the toolchain

This is a short introduction to the tools used in the environments.

Using gnu screen

To ensure processes keep running even when nobody has a terminal open to the server, screen is used. screen is a terminal multiplexer, which can be viewed as an in-terminal window manager. To start a new screen, run the command

$ screen

which starts a screen and attaches your current terminal to it. screen is controlled by prefixing a command with a modifier. For example, to detach from a screen session, press <C-a> and release it, which tells screen that a command follows. Then press d, which is the hot-key for detach. You are now detached from the screen.

Several screen sessions can run at the same time. To list the currently running screen sessions, run the command

$ screen -list

If there is only one session running, you can attach to it with the command

$ screen -r

If there are several running screens, you must specify which session to attach to. The following is an example of this. It starts two sessions, first and second. The -dmS flags start each session in a detached state (-dm) and give it a name (-S). We then list the sessions and see that several are running, as expected.

[vagrant@localhost ~]$ screen -dmS "first"
[vagrant@localhost ~]$ screen -dmS "second"
[vagrant@localhost ~]$ screen -list
There are screens on:
    15073.second    (Detached)
    15053.first     (Detached)
2 Sockets in /var/run/screen/S-vagrant.

We can now attach to one of these sessions by running the following command, replacing <session-name> with the name of your session.

$ screen -r <session-name>

To terminate a screen, attach to it and run exit.
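When scripting around screen it is useful to check whether a named session exists before trying to attach. The following is a minimal sketch under stated assumptions: the function names has_session and attach_session are made up for this example, and the check relies on the screen -list output format shown above.

```shell
# Hypothetical helper: succeed if a session with the given name appears
# in `screen -list` output read from standard input.
# Assumes the name contains no regular-expression metacharacters.
has_session() {
    grep -Eq "^[[:space:]]+[0-9]+\.$1[[:space:]]"
}

# Attach to the named session if it exists, otherwise report the problem.
attach_session() {
    if screen -list | has_session "$1"; then
        screen -r "$1"
    else
        echo "no screen session named '$1'" >&2
        return 1
    fi
}
```

For example, attach_session server would attach to the session created by startup.sh, assuming it is running.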

Configured environments

The environments as currently configured are described below.

Production

This environment runs the production code. Hands off if you are unsure of your change or have not communicated it clearly to everyone else.

  • Virtual hardware
    • 45000 MB RAM
    • 8 CPU Cores
  • Ports
    • Genomizer-server: 7000
    • Http: 80
    • Https: 443
  • Local IP: 192.168.33.10
  • Directories
    • Temporary files
      • External directory: /Data/tmp
      • Internal directory: /tmp
    • Data files
      • External directory: /Data/production-data
      • Internal directory: /data

Development-1

This environment is assigned to the database group.

  • Virtual hardware
    • 4000 MB RAM
    • 1 CPU Core
  • Ports
    • Genomizer-server: 7001
    • Http: 8081
    • Https: 4431
  • Local IP: 192.168.33.11
  • Directories
    • Temporary files
      • External directory: /Data/development-1-tmp
      • Internal directory: /tmp
    • Data files
      • External directory: /Data/development-1-data
      • Internal directory: /data

Development-2

This environment is assigned to the business-logic group.

  • Virtual hardware
    • 4000 MB RAM
    • 1 CPU Core
  • Ports
    • Genomizer-server: 7002
    • Http: 8082
    • Https: 4432
  • Local IP: 192.168.33.12
  • Directories
    • Temporary files
      • External directory: /Data/development-2-tmp
      • Internal directory: /tmp
    • Data files
      • External directory: /Data/development-2-data
      • Internal directory: /data

Development-3

This environment is assigned to the processing group.

  • Virtual hardware
    • 4000 MB RAM
    • 1 CPU Core
  • Ports
    • Genomizer-server: 7003
    • Http: 8083
    • Https: 4433
  • Local IP: 192.168.33.13
  • Directories
    • Temporary files
      • External directory: /Data/development-3-tmp
      • Internal directory: /tmp
    • Data files
      • External directory: /Data/development-3-data
      • Internal directory: /data

Client-development

This environment is assigned to the desktop group.

  • Virtual hardware
    • 4000 MB RAM
    • 1 CPU Core
  • Ports
    • Genomizer-server: 7004
    • Http: 8084
    • Https: 4434
  • Local IP: 192.168.33.14
  • Directories
    • Temporary files
      • External directory: /Data/development-client-tmp
      • Internal directory: /tmp
    • Data files
      • External directory: /Data/development-client-data
      • Internal directory: /data

Web-development

This environment is assigned to the web and app groups.

  • Virtual hardware
    • 4000 MB RAM
    • 1 CPU Core
  • Ports
    • Genomizer-server: 7005
    • Http: 8085
    • Https: 4435
  • Local IP: 192.168.33.15
  • Directories
    • Temporary files
      • External directory: /Data/development-web-tmp
      • Internal directory: /tmp
    • Data files
      • External directory: /Data/development-web-data
      • Internal directory: /data
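The non-production environments listed above appear to follow a pattern: environment number n uses server port 7000+n, HTTP port 8080+n, HTTPS port 4430+n and local IP 192.168.33.(10+n). The sketch below encodes that observation. The function name env_settings, and the numbering of Client-development as 4 and Web-development as 5, are assumptions read off the lists above, not something the scripts define. Production does not follow the pattern for HTTP and HTTPS, where it uses 80 and 443.

```shell
# Hypothetical helper: print the ports and local IP for environment number n,
# following the pattern visible in the environment lists (n = 1..5).
env_settings() {
    n=$1
    case "$n" in
        1|2|3|4|5) ;;
        *) echo "environment number must be 1-5" >&2; return 1 ;;
    esac
    printf 'server=%d http=%d https=%d ip=192.168.33.%d\n' \
        $((7000 + n)) $((8080 + n)) $((4430 + n)) $((10 + n))
}
```

For example, env_settings 3 prints server=7003 http=8083 https=4433 ip=192.168.33.13, which matches the Development-3 entry.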

The important scripts

The machines are configured using a myriad of scripts with various responsibilities. The Vagrantfile defines how the virtual machine is built. This includes which provisioning files are to be executed, and how much memory and resources to give the machine. It also configures the port forwarding into the machine, and which shared data folders are available to the machine. Messing with these configurations is inadvisable.

The Vagrantfile runs, as previously said, configuration scripts. Their names and basic responsibilities are listed below. Unless otherwise specified, they are located under scripts/provision.

  1. install_apache.sh - Installs and configures the apache server. It installs the config files httpd.conf, ssl.conf, and proxy.conf located under scripts/provision/config, along with running the install_certificates script.
  2. install_postgresql.sh - While the name may seem confusing, this installs puppet and the puppet/postgresql module. This allows the Vagrantfile to run puppet provisioning on the machine to install postgresql with the settings provided in manifests/build.pp.
  3. install_certificates.sh - This is an expect script, which is run by the install_apache script. It exists because generating an SSL certificate requires answering a series of questions. It generates an SSL certificate and moves it to the correct location.
  4. install_utils.sh - There are some tools and utilities which were requested or needed by the scripts or persons working on the project. They are installed from this script.
  5. install_startup_scripts.sh - Installs various scripts that are used for server administration.
  6. install_genomizer_server.sh - Installs the genomizer-server.
  7. install_genomizer_webclient.sh - Installs the genomizer-web client to the apache server.

If you are confused by what a script does, reading it is a good start.

Creating a new environment

Considerations

Before setting up a new environment, you should consider the memory and CPU footprint inherent in running a virtual machine. The host machine as currently configured can run 6 environments with no noticeable slowdown: five machines at 1 CPU core and 4 GB of RAM each, and a production environment that is assigned 8 CPU cores and 45 GB of RAM. It is not recommended to exceed this configuration, as unexpected side-effects could occur.
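To make the current budget concrete, the totals can be worked out from the figures above. This is only an arithmetic sketch; the variable names are made up for the example, and the physical resources of the host are not stated here.

```shell
# Figures taken from the environment descriptions: five small machines
# plus the production machine.
dev_envs=5
dev_ram_mb=4000
dev_cores=1
prod_ram_mb=45000
prod_cores=8

total_ram_mb=$((dev_envs * dev_ram_mb + prod_ram_mb))
total_cores=$((dev_envs * dev_cores + prod_cores))

echo "total: ${total_ram_mb} MB RAM, ${total_cores} CPU cores"
```

This prints total: 65000 MB RAM, 13 CPU cores, so any additional environment has to fit on top of that allocation.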

Checking out the correct files

When building a new environment, you may wish to configure which versions of the server software and web client software the script installs. By default, the scripts are configured to fetch the master branch of the genomizer-server and the master branch of the genomizer-web application. These settings are modified by editing the scripts/provision/install_genomizer_server.sh and scripts/provision/install_genomizer_webclient.sh respectively.

In order to change the version of the genomizer-server, edit the scripts/provision/install_genomizer_server.sh script. Locate the line(s)

# We checkout the master version by default
git checkout master

and replace them with

# We checkout the <desired branch> version by default
git checkout <desired branch>

To change the version of the genomizer-web client, edit the scripts/provision/install_genomizer_webclient.sh script. Locate the line(s)

# We run the master branch by default
cd genomizer-web
git checkout master

and replace them with

# We run the <desired branch> branch by default
cd genomizer-web
git checkout <desired branch>
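The same edit can be scripted when several environments need to be switched. The following is a sketch, assuming the provisioning scripts contain the line git checkout master exactly as shown above. The function name set_checkout_branch is hypothetical, and the branch name must not contain the characters | or &, which are special to sed in this expression. The comment line above the checkout is left untouched.

```shell
# Hypothetical helper: rewrite "git checkout master" in a provisioning
# script so that it checks out another branch instead.
# Usage: set_checkout_branch <script> <branch>
set_checkout_branch() {
    script=$1
    branch=$2
    tmp=$(mktemp) || return 1
    # '|' is used as the sed delimiter because branch names may contain '/'.
    sed "s|^\([[:space:]]*\)git checkout master\$|\1git checkout $branch|" \
        "$script" > "$tmp" &&
        cat "$tmp" > "$script"   # cat keeps the script's permissions intact
    status=$?
    rm -f "$tmp"
    return $status
}
```

For example, set_checkout_branch scripts/provision/install_genomizer_server.sh develop would make a freshly built machine check out develop.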

Setting up the files

When setting up the new environment, you may (or may not) want to run it with the same settings as the production environment. In this section all relevant settings are explained and motivated.

Editing the Vagrantfile

In the Vagrantfile, the most interesting settings concern port forwarding, the local IP, synced folders, and memory. If you wish to customize the environment further, the provision settings might be of use as well. The Vagrantfile is written in a Ruby DSL.

The port forwarding settings look as follows.

config.vm.network "forwarded_port", guest: 80, host: 8080
config.vm.network "forwarded_port", guest: 7000, host: 7000
config.vm.network "forwarded_port", guest: 443, host: 4443

The host field defines which port the virtual machine binds to on the host machine, while the guest field defines the port it binds to internally.
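Because every environment forwards ports onto the same host, it helps to list which host ports a Vagrantfile claims before picking new ones. The following is a minimal sketch. The function name list_forwarded_ports is hypothetical, and the pattern assumes the forwarded_port lines are written in the form shown above, with guest before host.

```shell
# Hypothetical helper: print "guest host" port pairs, one per line,
# for each forwarded_port entry in a Vagrantfile. Commented-out lines
# are skipped.
list_forwarded_ports() {
    sed -n '
        /^[[:space:]]*#/d
        s/.*"forwarded_port",[[:space:]]*guest:[[:space:]]*\([0-9][0-9]*\),[[:space:]]*host:[[:space:]]*\([0-9][0-9]*\).*/\1 \2/p
    ' "$1"
}
```

Running it against each environment's Vagrantfile and comparing the second column shows whether two environments try to bind the same host port.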

To define which local IP the virtual machine runs on, you modify the line below to suit your needs. Note that the IP must be unique, and it is recommended to stay inside the 192.168.33.x subnet.

config.vm.network "private_network", ip: "192.168.33.10"

You may also need to mount a folder from the host machine as a drive inside the virtual machine. This is achieved by editing or adding to the lines shown below. The first argument is the host's path to the folder to share, and the second is where the virtual machine should mount it.

config.vm.synced_folder "/Data/production-data", "/data"
config.vm.synced_folder "/Data/production-tmp", "/tmp"

The actual hardware specifications given to the virtual machine are defined by these lines:

config.vm.provider "virtualbox" do |vb|
    vb.memory = "45000"
    vb.cpus = 8
end

These lines define that the machine should run with 45000 MB of RAM, and run on 8 CPU cores. If other settings are desired, you modify these lines. For example, the development environments might look like

config.vm.provider "virtualbox" do |vb|
    vb.memory = "4000"
    vb.cpus = 1
end

Editing the settings.cfg

The settings.cfg file contains the settings that are installed to the genomizer-server. These should need no modification, except perhaps to edit the tunneling settings.

If you do want to edit some settings, note that the database settings must match the ones configured in build.pp; otherwise the server will fail to connect to the database.

The settings available in the server settings are straightforward, and are commented such that no further explanation is required.

Editing the httpd.conf

The httpd.conf file is dead-standard, except for the last few lines. These lines ensure that SSL is forced, and that any traffic connecting to the non-SSL apache is told to reconnect with SSL.

<VirtualHost *:80>
    RedirectPermanent / https://130.239.192.110
</VirtualHost>

You may wish to edit where this reroutes to as needed.

Modifying the build.pp

The build.pp file details how the postgresql database is constructed, along with which sql files are automatically run on the server. It is not recommended to change how this file functions except to change the password and modify which sql files are run.

Creating the tmp and data folders

When constructing an environment, the machine requires a place to store temporary files, and a place to store large data files. These are by default placed in /Data/<environment>-tmp and /Data/<environment>-data. To create these folders, and give them the correct permissions, you should run the following commands. The <your-user> should be replaced with the user that runs the virtual machines, and <environment> with the name of your environment.

[user@your-environment]$ sudo mkdir /Data/<environment>-tmp
[user@your-environment]$ sudo mkdir /Data/<environment>-data
[user@your-environment]$ sudo chown -R <your-user> /Data/<environment>-tmp
[user@your-environment]$ sudo chmod -R 1777 /Data/<environment>-tmp
[user@your-environment]$ sudo chmod -R 777 /Data/<environment>-data
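The folder setup can also be collected in one function. This is a sketch rather than the project's own script: the function name create_env_dirs and its optional base-dir and owner arguments are assumptions added so the steps can be tried in a scratch directory first. Since the folders are newly created, the permissions are set without -R.

```shell
# Hypothetical helper: create the tmp and data folders for an environment.
# Usage: create_env_dirs <environment> [base-dir] [owner]
# base-dir defaults to /Data; owner is only applied to the tmp folder.
create_env_dirs() {
    env_name=$1
    base=${2:-/Data}
    owner=${3:-}

    mkdir -p "$base/$env_name-tmp" "$base/$env_name-data" || return 1
    if [ -n "$owner" ]; then
        chown -R "$owner" "$base/$env_name-tmp" || return 1
    fi
    # Sticky, world-writable tmp folder; world-writable data folder.
    chmod 1777 "$base/$env_name-tmp" &&
    chmod 777 "$base/$env_name-data"
}
```

On the real host this needs root, for example by running it from a root shell with the default base directory and the user that runs the virtual machines as owner.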

Running the new environment

Once you have created an environment, you need to allow vagrant to construct the associated virtual machine. This is done by running vagrant up in the directory associated with your environment.

Vagrant should now start building the virtual machine, along with provisioning it. It will write a lot of information to the terminal you run it in, and this is perfectly normal. The build takes on average 5-10 minutes to complete.

Once it has finished, the environment is up and running, except for one detail: genomizer-server is not running. You can start it by typing the following into the terminal.

[user@environment]$ vagrant ssh
Last login: Tue May 12 12:40:32 2015 from 10.0.2.2
[vagrant@localhost]$ bash startup.sh

If you wish to view the server process, this can be done by running

[vagrant@localhost]$ screen -r

which attaches to the screen created by startup.sh.

Modifying an existing environment

While it is very useful to build an environment from scratch, you do not always want to start over. Often you want to modify an existing environment instead. This is fairly straightforward: edit the configuration files as shown in Creating a new environment. Once you have modified the files according to your preferences, run

[user@your-environment]$ vagrant destroy && vagrant up

if you wish to completely rebuild the machine. If you simply want the machine to reload the Vagrantfile settings, you can run

[user@your-environment]$ vagrant reload

Checkout new branches

If you wish to run another branch in the environment without rebuilding it, you can ssh into the vagrant machine and check out the branch with git, then compile and run as you normally would.

[user@your-environment]$ vagrant ssh
[vagrant@localhost]$ cd genomizer-server
[vagrant@localhost]$ git stash
[vagrant@localhost]$ git checkout <branch>
[vagrant@localhost]$ git stash pop
[vagrant@localhost]$ ant clean build jar
[vagrant@localhost]$ java -jar server.jar

In this context, the stash ensures that our modified settings.cfg is preserved. If for some reason settings.cfg is overwritten, you can get a fresh copy from /vagrant/scripts/provision/config/settings.cfg inside the virtual machine.
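Restoring the settings file can be done with a plain copy, but keeping the overwritten file around makes it easier to see what changed. The following is a sketch only. The function name restore_settings is hypothetical, and the default destination assumes that settings.cfg lives at the top of the genomizer-server checkout in the home directory, which is not confirmed here; pass the real path as the second argument if it differs.

```shell
# Hypothetical helper: replace settings.cfg with the provisioned copy,
# keeping the previous file as settings.cfg.bak.
# Usage: restore_settings [source] [destination]
restore_settings() {
    src=${1:-/vagrant/scripts/provision/config/settings.cfg}
    dest=${2:-$HOME/genomizer-server/settings.cfg}   # assumed location

    if [ -f "$dest" ]; then
        cp "$dest" "$dest.bak" || return 1
    fi
    cp "$src" "$dest"
}
```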

Rebuilding an environment

Rebuilding an environment is dead simple. Run

[user@your-environment]$ vagrant destroy && vagrant up

Deleting an environment

NOTE: The changes explained in this chapter are irreversible!

To delete an environment it is not sufficient to simply remove the folder containing the environment. The virtual machine must also be deleted. The following steps remove an environment cleanly.

  1. Enter the environment folder.
  2. Destroy the virtual machine with vagrant destroy.
  3. Remove the .vagrant folder.
  4. Remove the environment folder.
  5. Remove the environment's tmp and data folders.

The tmp and data folders are by default located under /Data/<environment>-tmp and /Data/<environment>-data.
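The steps above can be sketched as a single function. This is an illustration under assumptions, not a script from the repository: the names run and delete_environment and the DRY_RUN variable are made up for the example. Because the deletion is irreversible, it is worth running with DRY_RUN=1 first, which only prints the commands.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper following the five steps listed above.
# Usage: delete_environment <environment-dir> <environment-name> [base-dir]
delete_environment() {
    env_dir=$1
    env_name=$2
    base=${3:-/Data}

    # Refuse to run with empty arguments, so rm never sees a bare path.
    if [ -z "$env_dir" ] || [ -z "$env_name" ]; then
        echo "usage: delete_environment <environment-dir> <environment-name> [base-dir]" >&2
        return 2
    fi

    # Steps 1-2: enter the environment folder and destroy the virtual machine.
    ( cd "$env_dir" && run vagrant destroy ) || return 1
    # Steps 3-4: removing the environment folder also removes .vagrant inside it.
    run rm -rf "$env_dir" || return 1
    # Step 5: remove the environment's tmp and data folders.
    run rm -rf "$base/$env_name-tmp" "$base/$env_name-data"
}
```

Removing the folders under /Data may require root, depending on who owns them.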

Configuring the host system

Please note, all of these things can be achieved by running the script scripts/environment/setup_host_environment.sh, provided a CentOS host system. Never run this script without reading it through. Seriously. Even if you want to configure manually, reading this script is useful to get examples of the commands that are relevant.

External folders and their permissions

The host machine is required to make available storage space for the virtual machines. By default this is assumed to be /Data. In this folder, you should place tmp and data folders -- one for each environment -- with the permissions 1777 for tmp. The data folder should have the permissions 777. The tmp folder must be owned by the vagrant user.

Configuring the firewall

The firewall needs to allow at least the ports 80, 443 and 7000. There is a script which configures these settings under scripts/environment/iptables_config.sh. The only advanced setting here is a workaround to patch reserved ports into the virtual machines without requiring root permissions.

The firewall is instructed to reroute traffic arriving on port 80 to port 8080, and traffic arriving on port 443 to port 4443. The virtual machine can bind to these unprivileged ports without root permissions.
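A rerouting of this kind is commonly expressed with the REDIRECT target in the nat table. The sketch below prints such rules instead of applying them, since applying them requires root. It is an assumption about how the rerouting could be written, not a copy of iptables_config.sh, and the function name print_redirect_rules is hypothetical. Rules in the PREROUTING chain apply to traffic arriving from other machines, not to connections made from the host itself.

```shell
# Hypothetical sketch: print iptables rules that redirect incoming TCP
# traffic on the reserved ports to the unprivileged ports bound by vagrant.
print_redirect_rules() {
    for pair in 80:8080 443:4443; do
        from=${pair%%:*}
        to=${pair##*:}
        echo "iptables -t nat -A PREROUTING -p tcp --dport $from -j REDIRECT --to-ports $to"
    done
}
```

Review the printed rules against iptables_config.sh before applying anything on the host.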

Illustration of rerouting

Specifics on how to run these commands are illustrated in iptables_config.sh.

Prerequisite software

The setup requires vagrant and virtualbox to be installed.
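Before building an environment it can be useful to confirm that both tools are on the PATH. The following is a minimal sketch; the function name missing_tools is hypothetical, while vagrant and VBoxManage are the command-line programs installed by Vagrant and VirtualBox respectively.

```shell
# Hypothetical helper: print each of the given commands that cannot be
# found on the PATH. Prints nothing when all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For example, missing_tools vagrant VBoxManage prints the name of whichever tool still needs to be installed.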

Virtualbox

To install virtualbox, you can run the following commands. Replace $VAGRANT_USER with the vagrant user.

# Get the EPEL repo
rpm -Uvh https://download.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm

# Install kernel headers and devel tools
yum -y install kernel-devel kernel-headers dkms
yum groupinstall "Development Tools"
yum update

# Install oracle public key
wget http://download.virtualbox.org/virtualbox/debian/oracle_vbox.asc
rpm --import oracle_vbox.asc
rm -rf oracle_vbox.asc

# Add the virtualbox repo
wget http://download.virtualbox.org/virtualbox/rpm/el/virtualbox.repo -O /etc/yum.repos.d/virtualbox.repo
yum update # Update repo info for safety

# Time to actually install virtualbox
yum -y install VirtualBox-4.3
service vboxdrv setup # Setup vbox driver
usermod -a -G vboxusers $VAGRANT_USER # Add VB user

Vagrant

To install vagrant, you can run the following commands.

wget https://dl.bintray.com/mitchellh/vagrant/vagrant_1.7.2_x86_64.rpm
rpm -ivh vagrant_1.7.2_x86_64.rpm
rm -rf vagrant_1.7.2_x86_64.rpm

[1] Vagrant is a virtual machine configuration manager. The manual is available at https://docs.vagrantup.com/v2/

[2] Insert link here.
