Build BSgenome - Linlab-slu/TSSr GitHub Wiki

Install BSgenome and check for available genome packages

BSgenome is a required dependency for TSSr analysis.

You can check the Bioconductor list of available BSgenome data packages to see whether the species you want to analyze already has a ready-made BSgenome.

If so, congratulations! You can simply install and load it like this:

# Install BSgenome via BiocManager
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("BSgenome")

# Install the genome package you need, using BSgenome.Scerevisiae.UCSC.sacCer3 as an example
if (!requireNamespace("BSgenome.Scerevisiae.UCSC.sacCer3", quietly = TRUE))
  BiocManager::install("BSgenome.Scerevisiae.UCSC.sacCer3")

# Load BSgenome and BSgenome.Scerevisiae.UCSC.sacCer3
library(BSgenome)
library(BSgenome.Scerevisiae.UCSC.sacCer3)

You can also list the available genome packages from within R using BSgenome:

library(BSgenome)
available.genomes()
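
The package names follow a dotted pattern, as in BSgenome.Scerevisiae.UCSC.sacCer3: a BSgenome prefix, then organism, provider, and assembly. If you save the list to a text file and want to search it outside R, a minimal Python sketch could look like the one below. The names `GenomePackage`, `parse_bsgenome_name`, and `filter_by_organism` are hypothetical helpers written for this page, not part of BSgenome or TSSr. The sketch assumes the first three fields after the prefix are organism, provider, and assembly, and keeps anything after them (for example `masked`) as an unparsed remainder.

```python
from typing import List, NamedTuple, Optional


class GenomePackage(NamedTuple):
    """Fields of a BSgenome data package name (hypothetical helper type)."""
    organism: str
    provider: str
    assembly: str
    extra: Optional[str]  # anything after the assembly field, left unparsed


def parse_bsgenome_name(name: str) -> GenomePackage:
    """Split a name such as 'BSgenome.Scerevisiae.UCSC.sacCer3' into fields."""
    parts = name.strip().split(".")
    if len(parts) < 4 or parts[0] != "BSgenome":
        raise ValueError(f"not a BSgenome data package name: {name!r}")
    organism, provider, assembly = parts[1:4]
    # Join any trailing fields back together; None when there are none.
    extra = ".".join(parts[4:]) or None
    return GenomePackage(organism, provider, assembly, extra)


def filter_by_organism(names: List[str], organism: str) -> List[str]:
    """Keep only the package names whose organism field matches exactly."""
    return [n for n in names if parse_bsgenome_name(n).organism == organism]


if __name__ == "__main__":
    print(parse_bsgenome_name("BSgenome.Scerevisiae.UCSC.sacCer3"))
```

Some assembly names contain dots themselves, so treat the `extra` field as a hint rather than a reliable parse.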

Build BSgenome yourself

Build BSgenome with autoBSgenome

However, most of the time, the species (or strain) you need does not have a ready-made BSgenome. In that case, here's how to build one yourself. Our lab has developed a simple script called autoBSgenome for building BSgenome. It's like filling out a survey - just fill in the necessary information and you're done.

Step 1: clone autoBSgenome

git clone https://github.com/JohnnyChen1113/autoBSgenome.git

Step 2: install the dependencies with pip or conda:

pip install prompt_toolkit
pip install rich

You also need BSgenome installed in R:

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("BSgenome")

Alternatively, you can install everything at once with conda:

conda install -c conda-forge -c bioconda prompt_toolkit rich r-base bioconductor-bsgenome
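
Before moving on, it can help to confirm that the dependencies are actually visible from the environment you will run the script in. The following is a minimal sketch, not part of autoBSgenome: `missing_dependencies` is a hypothetical helper, and checking for an `Rscript` executable on the PATH is an assumption about how R is reached on your system.

```python
import importlib.util
import shutil
from typing import List, Sequence


def missing_dependencies(
    modules: Sequence[str] = ("prompt_toolkit", "rich"),
    executables: Sequence[str] = ("Rscript",),  # assumption: R is on the PATH
) -> List[str]:
    """Return a description of each dependency that cannot be found."""
    missing = []
    for module in modules:
        # find_spec returns None when a top-level module is not importable.
        if importlib.util.find_spec(module) is None:
            missing.append(f"python module: {module}")
    for exe in executables:
        # shutil.which returns None when the executable is not on the PATH.
        if shutil.which(exe) is None:
            missing.append(f"executable: {exe}")
    return missing


if __name__ == "__main__":
    problems = missing_dependencies()
    if problems:
        print("Missing dependencies:")
        for item in problems:
            print("  -", item)
    else:
        print("All checked dependencies were found.")
```

This only checks that the Python modules and the R executable can be found. It does not verify that the BSgenome R package itself is installed, which is easiest to confirm with `library(BSgenome)` in R.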

Step 3: run the script.

Run autoBSgenome with Python:

cd autoBSgenome
python autoBSgenome.py

The script requires Python 3 and faToTwoBit, which it downloads automatically. The only input you need to provide is the reference FASTA file.
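
Since the reference FASTA file is the only input, a quick sanity check on it before starting can save a failed build. The sketch below is an illustration under stated assumptions, not something autoBSgenome does itself: `fasta_sequence_ids` is a hypothetical helper, the file is assumed to be uncompressed plain text, and the sequence ID is taken to be the first whitespace-separated word of each header line.

```python
from collections import Counter
from typing import List


def fasta_sequence_ids(path: str) -> List[str]:
    """Return the sequence IDs in a FASTA file, raising ValueError on
    obvious structural problems (no records, data before the first
    header, empty headers, or duplicate IDs)."""
    ids = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith(">"):
                fields = line[1:].split()
                if not fields:
                    raise ValueError("found a FASTA header with no sequence ID")
                ids.append(fields[0])
            elif not ids:
                raise ValueError("sequence data appears before the first header")
    if not ids:
        raise ValueError("no FASTA records found")
    duplicates = sorted(name for name, n in Counter(ids).items() if n > 1)
    if duplicates:
        raise ValueError("duplicate sequence IDs: " + ", ".join(duplicates))
    return ids


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 2:
        names = fasta_sequence_ids(sys.argv[1])
        print(f"{len(names)} sequences found, first ID: {names[0]}")
```

Saved under a name of your choice, it takes the path to the FASTA file as its single argument and reports how many sequences it found.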