Installing psQTL - zkstewart/psQTL GitHub Wiki

Installing conda

If you are unfamiliar with running Python programs, a great starting point is to use conda. Conda provides easy access to software packages, and lets you create virtual environments so that the programs installed within each environment remain compatible with one another.

Conda is available on all operating systems, and can be obtained as the main Anaconda distribution or as Miniconda. The main distribution comes with most packages that psQTL relies upon; however, there may be licensing constraints since it makes use of packages that are part of the official Anaconda channel. Miniconda, on the other hand, avoids the use of packages on this channel and, if you stick to community channels like conda-forge or bioconda, you can use conda without licensing concerns. Refer to the Anaconda FAQ site if you think there's any chance this might apply to you.

For help with installing conda, see the Anaconda documentation.

Setting up a conda environment for psQTL

First, configure your conda channels so that packages are obtained from the public community channels, which avoids any licensing issues. This may not be a concern for you, but if it's just as easy, why not?

conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict

Now, run the below commands to set up and activate a new environment for running psQTL.

conda create -n psqtl python -y
conda activate psqtl

For now, we just want a blank environment containing Python. We will build on this shortly.
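
If you want to confirm that the new environment is the active one, a minimal sketch like the one below can help. It assumes that conda exposes the active environment's name through the CONDA_DEFAULT_ENV environment variable; the helper name active_conda_env is just for illustration.

```python
import os


def active_conda_env(environ=os.environ):
    """Return the name of the active conda environment, or None if none is active."""
    # Assumption: 'conda activate' sets CONDA_DEFAULT_ENV to the environment name.
    return environ.get("CONDA_DEFAULT_ENV")


if __name__ == "__main__":
    name = active_conda_env()
    if name == "psqtl":
        print("The psqtl environment is active.")
    else:
        print(f"Active environment is {name!r}; try running 'conda activate psqtl'.")
```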

Validating Python installation

At this point you should have a working Python installation. In your command terminal, if you enter:

python --version

You should see some message akin to:

Python 3.X.X

You should make sure your Python version is 3.11 or higher, as psQTL has not been validated to be compatible with older versions.
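
If you would rather have Python check this for you, the short sketch below compares the running interpreter against the 3.11 minimum mentioned above. The function name meets_minimum is hypothetical and not part of psQTL.

```python
import sys

MINIMUM_VERSION = (3, 11)  # minimum version stated in this guide


def meets_minimum(version_info, minimum=MINIMUM_VERSION):
    """Return True if the (major, minor, ...) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum(sys.version_info):
        print(f"Python {found} is recent enough for psQTL.")
    else:
        print(f"Python {found} is older than 3.11; consider recreating the environment.")
```

Save this to a file and run it with python from inside the psqtl environment.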

Installing Python prerequisites

The next step is to install the Python packages and command-line tools listed on the main README page. The below commands should do this all automatically.

conda install biopython numpy pandas matplotlib pycirclize qt6-wayland -y # conda-forge
conda install samtools bcftools vt -y # bioconda
pip install setuptools
pip install ncls
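
Once these commands finish, you may want to check that everything landed in the environment. The sketch below is one way to do that, not an official psQTL check: it looks for each Python module by its import name (biopython is imported as Bio) and searches your PATH for the three command-line tools. The helper names are hypothetical.

```python
import importlib.util
import shutil

# Import names for the packages installed above (biopython imports as "Bio").
PYTHON_MODULES = ["Bio", "numpy", "pandas", "matplotlib", "pycirclize", "ncls"]
# Executables installed from bioconda above.
COMMAND_LINE_TOOLS = ["samtools", "bcftools", "vt"]


def find_missing_modules(module_names):
    """Return the module names that cannot be located for import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


def find_missing_tools(tool_names):
    """Return the tool names that are not found on the PATH."""
    return [name for name in tool_names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules(PYTHON_MODULES) + find_missing_tools(COMMAND_LINE_TOOLS)
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All prerequisites were found.")
```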

Installing R prerequisites

If you intend to use the sPLS-DA functionality of psQTL, you will also need to have an R installation. R version 4.5 was used during development, and hence it is recommended that you have a version at least this recent. The below commands should set R up for you automatically.

conda install r-base -y
Rscript -e 'install.packages(c("argparser", "BiocManager"), repos="https://cloud.r-project.org")'
Rscript -e 'BiocManager::install("mixOmics")'
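
To confirm the R setup from Python, something like the sketch below may be useful. It assumes that Rscript --version prints a banner containing the text "version X.Y.Z" (on either stdout or stderr), and it uses R's requireNamespace to test whether mixOmics can be loaded. All function names here are hypothetical.

```python
import re
import shutil
import subprocess

RECOMMENDED_R = (4, 5)  # version used during psQTL development, per this guide


def parse_r_version(text):
    """Extract (major, minor, patch) from text containing 'version X.Y.Z'."""
    match = re.search(r"version (\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def installed_r_version():
    """Return the version reported by Rscript, or None if it cannot be read."""
    if shutil.which("Rscript") is None:
        return None
    result = subprocess.run(["Rscript", "--version"], capture_output=True, text=True)
    # Some R releases print the banner to stderr rather than stdout.
    return parse_r_version(result.stdout + result.stderr)


def r_package_installed(package):
    """Return True if R reports that `package` can be loaded."""
    if shutil.which("Rscript") is None:
        return False
    expr = f'quit(status = if (requireNamespace("{package}", quietly = TRUE)) 0 else 1)'
    return subprocess.run(["Rscript", "-e", expr], capture_output=True).returncode == 0


if __name__ == "__main__":
    version = installed_r_version()
    if version is None:
        print("Rscript was not found or its version could not be read.")
    else:
        verdict = "meets" if version[:2] >= RECOMMENDED_R else "is older than"
        print("R", ".".join(map(str, version)), verdict, "the recommended 4.5.")
        print("mixOmics installed:", r_package_installed("mixOmics"))
```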

Obtaining psQTL

Now that you have an environment ready to run psQTL, we need to obtain the software. You can do that by going to a location where you'd like to store psQTL's code, then cloning the repository:

git clone https://github.com/zkstewart/psQTL.git

As psQTL is a collection of Python scripts, you do not need to compile the software.

Validating psQTL installation

At this point, psQTL should be available to run. You can check this by requesting help information from each psQTL script. To begin, do the below:

cd psQTL
python psQTL_prep.py -h

You should see a message which begins with:

usage: psQTL_prep.py [-h] [-v] {initialise,init,depth,call,view} ...

Note: you may see the warning message "UserWarning: pkg_resources is deprecated as an API." Other than being a slight nuisance, it is not a problem with psQTL and will not affect its operations.

You should also test the other two psQTL scripts to check that they show help information:

python psQTL_proc.py -h
python psQTL_post.py -h
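
If you prefer to run all three checks in one go, the sketch below calls each script with -h and reports whether the output starts with a usage line. It assumes psQTL_proc.py and psQTL_post.py print a usage line in the same style as the psQTL_prep.py one shown above, which is not confirmed here; the helper names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

SCRIPTS = ["psQTL_prep.py", "psQTL_proc.py", "psQTL_post.py"]


def looks_like_help(output, script_name):
    """Return True if the output starts with the usage line for the script."""
    return output.lstrip().startswith(f"usage: {script_name}")


def check_script(script_path):
    """Run '<script> -h' with the current Python and check for help output."""
    script_path = Path(script_path)
    if not script_path.is_file():
        return False
    result = subprocess.run(
        [sys.executable, str(script_path), "-h"], capture_output=True, text=True
    )
    # Warnings such as the pkg_resources one go to stderr, so only stdout is checked.
    return result.returncode == 0 and looks_like_help(result.stdout, script_path.name)


if __name__ == "__main__":
    # Assumes this is run from inside the cloned psQTL directory.
    for name in SCRIPTS:
        status = "OK" if check_script(Path.cwd() / name) else "PROBLEM"
        print(f"{name}: {status}")
```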

Anything else?

If you see some other message when asking for program help information, open an issue in this repository and I will give you some guidance on what might be going wrong.

Otherwise, you are now ready to use psQTL! Feel free to start on that immediately, or go to the Using psQTL page if you need some guidance on how to do this.