Installation and dependencies - agmcfarland/GeneGrouper GitHub Wiki

Simple install

Note: The simple install assumes the required dependencies are already installed, or that you will install them yourself as needed. Check that every dependency meets the version listed under Requirements and dependencies.

pip install GeneGrouper
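Because the simple install only covers the Python package, it can help to confirm that the external command-line tools are on your PATH before running GeneGrouper. The following is a minimal sketch and not part of GeneGrouper: `missing_tools` is a hypothetical helper, and the executable names `blastp`, `mmseqs`, and `FastTree` are assumptions about how the blast, mmseqs2, and fasttree packages expose their programs. Adjust them to match your system.

```shell
# Report which of the given commands cannot be found on PATH.
# Prints one "missing: NAME" line per absent command and
# returns 1 if at least one command is absent, 0 otherwise.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return "$rc"
}

# Executable names below are assumptions; edit them if your install differs.
missing_tools mcl blastp mmseqs mafft FastTree \
    || echo "Install the missing tools before running GeneGrouper."
```

If nothing is printed, every listed executable was found.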

Creating a conda environment with all dependencies and GeneGrouper

This is the recommended way to install and run GeneGrouper.

These instructions will install all dependencies and GeneGrouper in a single conda environment.

Why use a conda environment?

Installing Python and bioinformatic dependencies for grouping

conda create -n GeneGrouper_env python=3.9

conda activate GeneGrouper_env # on conda versions older than 4.4, use: source activate GeneGrouper_env

conda config --add channels defaults

conda config --add channels bioconda

conda config --add channels conda-forge

pip install biopython scipy scikit-learn pandas matplotlib GeneGrouper

conda install -c bioconda mcl blast mmseqs2 fasttree mafft
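Both pip and conda install into whichever environment is currently active, so it is worth checking that GeneGrouper_env is the active one when running the install commands above. A small sketch of such a check follows. `in_conda_env` is a hypothetical helper name, and the check assumes the environment was activated by name, in which case conda sets the `CONDA_DEFAULT_ENV` variable to that name.

```shell
# Succeeds when the active conda environment has the given name.
# CONDA_DEFAULT_ENV is set by conda when an environment is activated.
in_conda_env() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

if in_conda_env GeneGrouper_env; then
    echo "GeneGrouper_env is active"
else
    echo "GeneGrouper_env is not active; run: conda activate GeneGrouper_env"
fi
```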

Installing R and required packages for visualizations

conda install -c conda-forge r-base=4.1.1 r-svglite r-reshape r-ggplot2 r-cowplot r-dplyr r-gggenes r-ape r-phytools r-BiocManager r-codetools

# enter R environment
R

# install additional packages from CRAN
install.packages('groupdata2',repos='https://cloud.r-project.org/', quiet=TRUE)

# install additional packages from Bioconductor
BiocManager::install("ggtree")

# quit
q(save="no")

Installing R packages separately

If you would rather use your own R installation, install the following packages and make sure your R version is 4.0 or greater.

packages <- c("reshape", "ggplot2", "cowplot", "dplyr", "gggenes", "groupdata2", "svglite", "BiocManager", "ape", "phytools")

install.packages(setdiff(packages, rownames(installed.packages())), repos='https://cloud.r-project.org/', quiet=TRUE)

BiocManager::install("ggtree")
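To confirm that these R packages are present without opening an interactive R session, the same `setdiff` idea can be run from the shell. This is a hedged sketch rather than an official GeneGrouper check: `r_missing_expr` is a hypothetical helper that only builds the R expression, and the example assumes `Rscript` is on your PATH.

```shell
# Build an R expression that prints every named package that is not installed.
r_missing_expr() {
    pkgs=""
    for p in "$@"; do
        # Append each name as a quoted, comma-separated R string.
        pkgs="${pkgs:+$pkgs, }\"$p\""
    done
    printf 'cat(setdiff(c(%s), rownames(installed.packages())), sep="\\n")' "$pkgs"
}

if command -v Rscript >/dev/null 2>&1; then
    # Prints the names of missing packages; prints nothing when all are installed.
    Rscript -e "$(r_missing_expr reshape ggplot2 cowplot dplyr gggenes groupdata2 svglite BiocManager ape phytools ggtree)" \
        || echo "Rscript reported an error"
else
    echo "Rscript not found on PATH"
fi
```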

After installation

Try out GeneGrouper by following along with an example workflow that uses the provided data!

Download genomes to get started

Check out basic usage workflows

Requirements and dependencies

Note: Make sure package versions meet or exceed the versions listed below:

Python>=3.6

biopython=1.79

scipy=1.7.1

scikit-learn=1.0

pandas=1.3.3

matplotlib=3.4.3

mmseqs2=13.45111

mcl=14.137

blast=2.10.1

fasttree=2.1.10

mafft=7.487

R>=4.0.0 (for visualizations)

gggenes=0.4.0

reshape=0.8.8

ggplot2=3.3.5

cowplot=1.0.0

dplyr=1.0.0

groupdata2

ggtree=2.2.1

ape=5.4

phytools=0.7.47

bioconductor>=3.11
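When comparing an installed version against the minimums listed above, a plain string comparison can mislead, because it ranks 2.9.0 above 2.10.1. The sketch below compares versions component by component instead. `version_at_least` is a hypothetical helper, and it assumes your `sort` supports the `-V` (version sort) option, as GNU coreutils does.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` to order the two versions numerically by component.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example with literal values: 2.9.0 is older than the listed blast 2.10.1.
if version_at_least 2.9.0 2.10.1; then
    echo "blast version is sufficient"
else
    echo "blast 2.9.0 is older than the required 2.10.1"
fi
```

In practice, replace the first argument with the version string reported by the tool you are checking.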