Installing psQTL - zkstewart/psQTL GitHub Wiki
Installing conda
If you are unfamiliar with running Python programs, a great starting point is to use conda. Conda provides easy access to software packages and lets you create virtual environments, which keep the programs within each environment compatible with one another.
Conda is available on all major operating systems, and can be obtained as the main Anaconda distribution or as Miniconda. The main distribution comes with most packages that psQTL relies upon; however, there may be licensing constraints since it makes use of packages that are part of the official Anaconda channel. Miniconda, on the other hand, is a minimal installer that does not bundle these packages and, if you stick to community channels like conda-forge or bioconda, you can use conda without licensing concerns. Refer to the Anaconda FAQ site if you think there's any chance this might apply to you.
For help with installing conda, see the Anaconda documentation.
Option 1: Automatically install psQTL as a conda package
psQTL has its own package available through the bioconda channel. You should install it in its own environment to ensure compatible program versions. Try using a command like the following.
conda create -n psqtl bioconda::psqtl
This should automatically handle all prerequisites for using psQTL. After activating the environment with conda activate psqtl, you can test that psQTL is available by invoking the executables like:
psQTL_prep -h
psQTL_proc -h
psQTL_post -h
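If one of those commands is not recognised, the environment is usually either not activated or the package did not install properly. As a rough sketch (not part of psQTL itself), the short Python script below checks whether the three executables named above can be found on your PATH. The function name find_missing is just an illustrative choice.

```python
import shutil

# Executable names as used in the help commands above.
PSQTL_EXECUTABLES = ("psQTL_prep", "psQTL_proc", "psQTL_post")


def find_missing(names=PSQTL_EXECUTABLES):
    """Return the names that cannot be located on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
        print("Is the psqtl environment activated?")
    else:
        print("All psQTL executables were found on PATH.")
```

Run it with python from inside the activated psqtl environment; an empty result from find_missing() only tells you the executables exist, so the -h commands above remain the real test.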
This differs from how you run psQTL when obtaining it as demonstrated in option 2 below, where the psQTL scripts must be invoked through Python, for example as python psQTL_prep.py -h. If you installed the conda package and see that form of invocation anywhere in this documentation, adapt the command by dropping python and the .py extension, so that python psQTL_prep.py -h becomes psQTL_prep -h.
Option 2: Manually set up a conda environment for psQTL
Option 2.1: Create conda environment
If you wish to use psQTL by cloning this GitHub repository, rather than obtaining the software as a conda package as described in Option 1, you should first configure your conda channels to ensure that you are obtaining public channel packages. This will help to avoid any licensing issues. This may not be a concern for you, but if it's just as easy, why not?
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict
Next, we will set up a conda environment that includes all of the software and packages listed on the Prerequisites page. We will start by obtaining the Python packages and external bioinformatic software using the commands below:
conda create -n psqtl "python >=3.11,<=3.13" biopython numpy pandas matplotlib-base pycirclize qt6-wayland samtools bcftools vt "ncls >=0.0.68" -y
conda activate psqtl
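If you'd like to confirm that the environment picked up everything from the command above, a minimal sketch like the following may help. It assumes the usual import names for the conda packages (Bio for biopython and matplotlib for matplotlib-base, with the others matching their package names), and it does not attempt to check qt6-wayland. The function names are illustrative only.

```python
import importlib.util
import shutil

# Assumed import names for the conda packages installed above:
# biopython -> Bio, matplotlib-base -> matplotlib; others are unchanged.
PYTHON_MODULES = ("Bio", "numpy", "pandas", "matplotlib", "pycirclize", "ncls")

# External command-line tools installed by the same command.
EXTERNAL_TOOLS = ("samtools", "bcftools", "vt")


def missing_modules(modules=PYTHON_MODULES):
    """Return top-level module names that Python cannot locate."""
    return [m for m in modules if importlib.util.find_spec(m) is None]


def missing_tools(tools=EXTERNAL_TOOLS):
    """Return tool names that are not on the current PATH."""
    return [t for t in tools if shutil.which(t) is None]


if __name__ == "__main__":
    print("Missing Python modules:", missing_modules() or "none")
    print("Missing external tools:", missing_tools() or "none")
```

This only checks that the modules and tools can be found; it does not verify version constraints such as ncls >=0.0.68.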
At this point you should have a working Python installation. In your command terminal, if you enter python --version you should see a message akin to:
Python 3.X.X
Make sure your Python version is 3.11 or higher, as psQTL has not been validated against older versions. For now, versions higher than 3.13 are not explicitly supported, in part due to conda package incompatibilities and in part because the author needs to test each newer Python version as it becomes available to ensure correct functionality.
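If you'd rather not read the version number by eye, the range described above can be checked with a few lines of Python. This is only a sketch of the 3.11 to 3.13 range stated on this page; the function name is_supported is a made-up helper, not something psQTL provides.

```python
import sys

# Supported range as described on this page (minor versions, inclusive).
MIN_VERSION = (3, 11)
MAX_VERSION = (3, 13)


def is_supported(version_info=sys.version_info):
    """Return True if the major.minor version falls within 3.11 to 3.13."""
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported():
        print(f"Python {current} is within the supported range.")
    else:
        print(f"Python {current} is outside the supported range (3.11 to 3.13).")
```

Only the major and minor numbers are compared, so any patch release of 3.13 counts as supported here.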
Optionally, if you intend to use the sPLS-DA functionality of psQTL, you also need an R installation. With the psqtl environment still activated, the command below should set up R and all necessary packages for you automatically:
conda install "r-base >=4.3.0" r-argparser r-dplyr r-stringr r-biocmanager bioconductor-biocparallel "bioconductor-mixomics >=6.26.0" -y
Option 2.2: Obtain the psQTL source code
Now that you have an environment ready to run psQTL, we need to obtain the software. You can do that by going to a location where you'd like to store psQTL's code, then cloning the repository with:
git clone https://github.com/zkstewart/psQTL.git
Alternatively, you can download the latest release from the releases section of this repository and decompress the archive at a suitable location.
Because psQTL is a collection of Python scripts, you do not need to compile it, and it should be available to run immediately. You can check this by requesting help information from each psQTL script. To begin, do the below:
cd psQTL
python psQTL_prep.py -h
You should see a message which begins with:
usage: psQTL_prep.py [-h] [-v] {initialise,init,depth,call,view} ...
You should also test the other two psQTL scripts to check that they show help information:
python psQTL_proc.py -h
python psQTL_post.py -h
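If you want to run all three checks in one go, something like the sketch below could be used from inside the psQTL directory. It is not part of psQTL; it simply runs each script with -h using the current Python interpreter and treats a zero exit code plus a "usage:" line on standard output as success, which is an assumption based on the help message shown above for psQTL_prep.py.

```python
import subprocess
import sys
from pathlib import Path

# Script names as used in the commands above.
SCRIPTS = ("psQTL_prep.py", "psQTL_proc.py", "psQTL_post.py")


def check_help(script_dir=".", scripts=SCRIPTS):
    """Run each script with -h and return a {script name: passed} mapping."""
    results = {}
    for name in scripts:
        path = Path(script_dir) / name
        if not path.is_file():
            # Script not found at this location.
            results[name] = False
            continue
        completed = subprocess.run(
            [sys.executable, str(path), "-h"],
            capture_output=True,
            text=True,
        )
        # Warnings go to stderr, so only stdout is inspected here.
        results[name] = completed.returncode == 0 and "usage:" in completed.stdout
    return results


if __name__ == "__main__":
    for script, passed in check_help().items():
        print(f"{script}: {'OK' if passed else 'problem'}")
```

If any script reports a problem, running it by hand as shown above will let you see the actual error message.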
Note 1: you may see the warning message "UserWarning: pkg_resources is deprecated as an API." Other than being a slight nuisance, it is not a problem with psQTL and will not affect its operations.
Note 2: when obtaining the software using option 2, you must invoke the psQTL scripts through Python, e.g., python /location/of/psQTL/psQTL_prep.py, rather than as psQTL_prep as you would for option 1. Aside from the difference in program invocation, everything else behaves identically.
Anything else?
If you see some other message when asking for program help information, open an issue in this repository and I will try to help figure out what might be going wrong.
Otherwise, you are now ready to use psQTL! Feel free to start on that immediately, or go to the Using psQTL page if you need some guidance on how to do this.