Arivale - Gibbons-Lab/wiki GitHub Wiki
Arivale ISB Analytics images on Dalek
Dalek comes configured with an analytics docker setup that provides the following two images:
- analytics2: The Jupyter environment, currently containing Python 3 and R environments.
- analytics2-rstudio: An alternative Rstudio environment which is leaner and probably more familiar to R users.
Contacts
- Hood Lab: Noa Rappaport
- Gibbons Lab: Christian Diener
How do I use it?
The Arivale research server
In order to work with data from Arivale you will need the following:
- Get a user account for the research server from Noa Rappaport
- Have a research docker container started for you on the server
Step 2 can be performed by almost any member of the Arivale research group, so let us know if you would like help :)
Connecting to the server
You need to connect via SSH. An SSH client is always available on Mac and Linux, but on Windows you will need to install one (for instance PuTTY).
To connect to the Arivale server you have to be in the internal ISB network or connected via VPN.
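If a login attempt hangs or fails, it helps to know whether the server is reachable at all before suspecting your account. The following is a minimal sketch that tests whether the SSH port answers. The helper name can_reach is hypothetical, and it assumes that the SSH host is dalek.systemsbiology.net (the host name used for the browser addresses on this page) and that SSH listens on the default port 22. Adjust both if you were given different details.

```python
import socket


def can_reach(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and failed name resolution.
        return False
```

If can_reach("dalek.systemsbiology.net") returns False while other websites load fine, the most likely cause is that you are outside the ISB network and not connected via VPN.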
Getting around the server
All analysis of Arivale data has to happen on the server. Data should never leave the server. To make this possible, you start a docker research container on the server, which gives you access to Jupyter via your browser. You can then run all analyses directly on the server. Get in contact if you need more than the resources currently provided.
On the server there are two locations that are potentially important:
~/notebooks
This is your personal folder of research notebooks. It includes some tutorial notebooks and this is where your notebooks get saved to in the research container (default location).
/home/cache/libs/docker-research/scripts/notebooks
You can change into this directory from the command line using cd followed by the path.
Starting up an Analytics environment
By default only a single environment is supported for any single user. See below how to request more resources.
First log in to the Arivale research server as described above.
You can start your research environment using the analytics2 script. Its usage is summarized below:
> analytics2 -h
usage: analytics2 [-h] [--ports PORT_FILE] [--version]
{start,stop,restart,logs,config} ...
Manage your analytics image.
optional arguments:
-h, --help show this help message and exit
--ports PORT_FILE, -p PORT_FILE
a file mapping users to ports
--version, -v show program's version number and exit
subcommands:
See `analytics2 CMD -h` for command specific help.
{start,stop,restart,logs,config}
start start a new analytics container
stop stop a running analytics container
restart restart a clean analytics container
logs show logs for a running analytics container
config show your configuration
Thus in order to run a new environment use:
> analytics2 start
In the same vein, use restart or stop to restart or stop your environment.
Both commands persist any files in the standard locations (/notebooks in the Jupyter environment and /home/rstudio, which is the default in Rstudio) but reset the installed software to the factory state.
Note that any files in non-standard locations will be lost this way, so please make sure to back up or download all relevant files before restarting or stopping an environment.
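One way to rescue files from a non-standard location is to pack them into an archive inside a persisted folder before you restart. The sketch below does this with the Python standard library from within the Jupyter environment. The function name backup_folder and the "-backup" suffix are hypothetical. The default target /notebooks is the persisted location named above.

```python
import shutil
from pathlib import Path


def backup_folder(source, target_dir="/notebooks"):
    """Pack the folder `source` into a .tar.gz archive inside `target_dir`."""
    source = Path(source).resolve()
    if not source.is_dir():
        raise NotADirectoryError(str(source))
    # make_archive appends the ".tar.gz" extension to the base name itself.
    base_name = Path(target_dir) / f"{source.name}-backup"
    archive = shutil.make_archive(
        str(base_name),
        "gztar",
        root_dir=str(source.parent),  # paths in the archive are relative to this
        base_dir=source.name,         # only this folder is archived
    )
    return Path(archive)
```

For example, backup_folder("/tmp/scratch") would write /notebooks/scratch-backup.tar.gz, where /tmp/scratch stands for whatever folder you want to keep. Do not point it at a folder that contains the target directory itself.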
Using the Jupyter environment (analytics2)
After starting your environment you can access it from within the ISB internal network with any browser using https://dalek.systemsbiology.net:PORT, where PORT is the port assigned to you (ask us or use analytics2 config if you do not know your port). You may have to create a security exception for the self-signed SSL certificate. The password is research1.
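Because every user is assigned their own port, the address differs from person to person. As a small illustration, the sketch below assembles the address from a port number. The helper name environment_url is hypothetical and not part of the analytics2 script. Only the host name and the https scheme are taken from the text above.

```python
def environment_url(port, host="dalek.systemsbiology.net"):
    """Build the browser address for the environment listening on `port`."""
    port = int(port)  # accept "8123" as well as 8123
    if not 0 < port < 65536:
        raise ValueError(f"not a valid TCP port: {port}")
    return f"https://{host}:{port}"


# 8123 is only a placeholder; use the port reported by `analytics2 config`.
print(environment_url(8123))
```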
You will be presented with a launcher that lets you run notebooks in any of the environments or start up a terminal for administrative tasks.
For notebooks, only the environments starting with arivale- are correctly configured and set up, so please use those.
Installing software
If you use a package/software regularly consider adding it to the environment permanently as described below in section "Requesting software to include".
For a non-permanent installation do the following:
- Open a terminal in the launcher
- Activate the environment you want to install to. For instance, for Python 3 use arivale-py3:
source activate arivale-py3
The prompt of your terminal will change to reflect the active environment.
- Install packages with conda, for instance
conda install tensorflow
Using the Rstudio environment
Start your environment using the --rstudio flag, for instance:
analytics2 --rstudio start
After starting your environment you can access it from within the ISB internal network with any browser using https://dalek.systemsbiology.net:PORT, where PORT is the port assigned to you (ask us or use analytics2 config if you do not know your port). You may have to create a security exception for the self-signed SSL certificate. The username is rstudio and the password is USERandarivale, where USER is your username on the server.
Any files in the default directory opened by Rstudio (/home/rstudio/) will be persisted after deleting the container.
Installing software
Use Tools > Install Packages... in the Rstudio menu.
Observing resource usage
All containers are managed by docker, which you can use to inspect them. For instance to see which containers are running use:
docker ps
on the server.
To see the resource usage of the running containers on the server:
docker stats
You can exit the resource view with Ctrl-C.
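If you would rather collect the numbers in a script than watch the live view, docker can print a single snapshot. The sketch below is one possible approach. It calls docker stats with the --no-stream and --format options and turns the output into a list of dictionaries. The function names and the choice of columns are illustrative assumptions.

```python
import subprocess

# Go-template placeholders understood by `docker stats --format`.
STATS_FORMAT = "{{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"


def parse_stats(text):
    """Turn tab-separated `docker stats` lines into a list of dicts."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, cpu, memory = line.split("\t")
        rows.append({"name": name, "cpu": cpu, "memory": memory})
    return rows


def container_stats():
    """Take one snapshot of all running containers (needs docker access)."""
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", STATS_FORMAT],
        capture_output=True,
        text=True,
        check=True,  # raise if the docker command fails
    )
    return parse_stats(result.stdout)
```

Calling container_stats() on the server returns one dictionary per running container. Because of --no-stream the command exits on its own, so no Ctrl-C is needed.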
How do I adapt it to my needs?
Requesting software to include
If you use a particular package/software regularly it may be a good idea to include it by default, as this aids reproducibility. To do so, first request read access to the GitHub repository. You can then request new software via a pull request to any of the following files:
- python3.yml: Packages for Python 3
- r.yml: Packages for R in Jupyter
- rstudio/packages.txt: Packages for Rstudio
If you want to request R packages please add them to both R package lists.
Custom startup scripts
The analytics2 script also serves as a Python module that gives you fine-grained control over the startup process. For instance, to access the configuration:
from analytics2 import build_config
config = build_config()
print(config["port"])
Or to start a container with a changed config and a different image:
from analytics2 import build_config, start_container
config = build_config()
config["port"] = "8123"
start_container(config, "my-analytics2")
For developers
Code can be found at https://github.com/gibbons-lab/arivale_docker.
The script and package are distributed along with the docker images. If you change the script make sure to run the tests before pushing:
python -m pytest
in the root directory. This will check if the script still runs. You can update releases by running
bumpversion {major,minor,patch}
on a clean git branch.
To build the images just run make, which will download updated base images, build both images, and tag them automatically with a date-based version (YY.MM).
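To make the YY.MM scheme concrete, here is a minimal sketch of how such a tag can be derived from a date. It is not taken from the Makefile. The function name version_tag is hypothetical, and the zero-padded month is an assumption about the exact format.

```python
from datetime import date


def version_tag(day=None):
    """Return a YY.MM tag for `day` (defaults to today)."""
    day = day or date.today()
    # %y is the two-digit year, %m the zero-padded month.
    return day.strftime("%y.%m")
```

Under these assumptions an image built on any day in March 2019 would be tagged 19.03.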