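Introduction to juicer

juicer is a practical pipeline for Hi-C data: with a few parameters it can process very large Hi-C datasets. It covers the following:

Processing raw sequencing reads directly into Hi-C contact data at the requested resolutions
Identifying TADs, chromatin loops, and similar features with juicer-tools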

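1. Installation

1.1 Dependencies

GNU core utilities (cat and so on); essentially any CentOS system has them
BWA, for sequence alignment
Java 1.7 or Java 1.8
Juicer Tools jar, for downstream analysis; TAD identification and chromatin loop calling require GPU computing
CUDA, for parallel computing on the GPU
The package ships with libraries compiled for CUDA 7; JCuda can also be downloaded separately
For the best performance, a high-performance GPU cluster is recommended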
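
A quick way to confirm the command-line dependencies before going further is to look each one up on PATH. This is a minimal sketch; check_deps is a hypothetical helper, not part of juicer, and the list of tools passed to it is only the ones named above.

```shell
# Report which of the named commands are missing from PATH.
# Prints one line per missing command; returns non-zero if any is missing.
check_deps() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Tools named in the dependency list above.
check_deps cat sort bwa java || echo "install the missing tools before running juicer"
```

On a cluster that uses environment modules, load the relevant modules first, otherwise bwa and java will be reported as missing even though they are installed.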

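1.2 Supported clusters

juicer currently supports the cluster types below; use the matching scripts directory inside the juicer package when running an analysis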

OpenLava
LSF
SLURM
GridEngine (Univa, etc. any flavor)

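1.3 Directory structure

scripts/ holds the juicer scripts and the juicer-tools jar
references/ holds the reference genome and its BWA index files
restriction_sites/ holds restriction site files; if there is none, run with -s none
sample/fastq/ holds the sequencing reads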
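
The layout above can be created in one step. A minimal sketch, assuming the sample directory sits directly under the top-level directory as in the test run; make_juicer_layout is a hypothetical helper and HIC003 is just the sample name used in the test data. scripts/ is left out because it is normally a symlink to the cluster-specific scripts directory.

```shell
# Create the directory skeleton under a top-level directory.
# $1: top-level directory (the one later passed to juicer.sh with -D)
# $2: sample name (any name works)
make_juicer_layout() {
    top=$1
    sample=$2
    mkdir -p "$top/references" "$top/restriction_sites" "$top/$sample/fastq"
}

# Demo in a throwaway directory so nothing real is touched.
demo_top=$(mktemp -d)
make_juicer_layout "$demo_top" HIC003
ls "$demo_top"
rm -rf "$demo_top"
```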

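2. Testing

# Clone the repository (the symlink created below assumes it lives in ~/github/juicer)
git clone git@github.com:aidenlab/juicer.git --depth=1

2.1 Create the working directory and build the directory structure

From here on, juicer is run from ~/HiCSoftware/juicer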

cd ~
mkdir -p HiCSoftware/juicer
cd HiCSoftware/juicer
## Build the directory structure and download the data
mkdir references; cd references
wget https://s3.amazonaws.com/juicerawsmirror/opt/juicer/references/Homo_sapiens_assembly19.fasta
wget https://s3.amazonaws.com/juicerawsmirror/opt/juicer/references/Homo_sapiens_assembly19.fasta.amb
wget https://s3.amazonaws.com/juicerawsmirror/opt/juicer/references/Homo_sapiens_assembly19.fasta.ann
wget https://s3.amazonaws.com/juicerawsmirror/opt/juicer/references/Homo_sapiens_assembly19.fasta.bwt
wget https://s3.amazonaws.com/juicerawsmirror/opt/juicer/references/Homo_sapiens_assembly19.fasta.pac
wget https://s3.amazonaws.com/juicerawsmirror/opt/juicer/references/Homo_sapiens_assembly19.fasta.sa
## Download the restriction site file
mkdir ../restriction_sites; cd ../restriction_sites
wget https://s3.amazonaws.com/juicerawsmirror/opt/juicer/restriction_sites/hg19_MboI.txt

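## Symlink the scripts directory for your cluster type (LSF here)
cd ../
ln -s ~/github/juicer/LSF/scripts/ scripts
cd scripts
wget https://hicfiles.tc4ga.com/public/juicer/juicer_tools.1.9.9_jcuda.0.8.jar
## Replace /absolute/path/to with the absolute path of this scripts directory
ln -s /absolute/path/to/juicer_tools.1.9.9_jcuda.0.8.jar juicer_tools.jar
cd ..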

## Create the sample directory and the fastq directory
mkdir HIC003; cd HIC003
mkdir fastq; cd fastq
wget http://juicerawsmirror.s3.amazonaws.com/opt/juicer/work/HIC003/fastq/HIC003_S2_L001_R1_001.fastq.gz
wget http://juicerawsmirror.s3.amazonaws.com/opt/juicer/work/HIC003/fastq/HIC003_S2_L001_R2_001.fastq.gz
cd .. ## now in the sample directory
## Run the test data; absolute paths are required
~/HiCSoftware/juicer/scripts/juicer.sh -D ~/HiCSoftware/juicer
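
Before submitting the run it can save a failed job to confirm that the five BWA index files downloaded above really sit next to the FASTA file. A minimal sketch; check_bwa_index is a hypothetical helper, not part of juicer.

```shell
# Check that a FASTA file has the five BWA index files next to it.
# Prints each missing (or empty) file; returns non-zero if any is missing.
check_bwa_index() {
    fasta=$1
    status=0
    for ext in amb ann bwt pac sa; do
        if [ ! -s "$fasta.$ext" ]; then
            echo "missing index file: $fasta.$ext"
            status=1
        fi
    done
    return "$status"
}

# Path taken from the test setup above.
check_bwa_index "$HOME/HiCSoftware/juicer/references/Homo_sapiens_assembly19.fasta" \
    || echo "download the index files or build them with: bwa index <fasta>"
```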

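3. Parameters for the cluster version

-p chromosome sizes file, absolute path
-z genome FASTA file, absolute path; the BWA index must be in the same folder as the FASTA file
-s restriction enzyme: "HindIII", "MboI" (default), or "none"
-d sample directory; it must contain the fastq folder, and the aligned folder is created in it
-t number of threads
-C chunk size used when splitting the reads for parallel processing; default 90000000, must be a multiple of 4
-D working directory; it must contain the scripts/, references/, and restriction_sites/ folders
-q queue for alignment jobs, which hold the queue only briefly
-L queue for the long-running jobs that build the hic file
-S run only a given stage, one of:
- "merge"
- "dedup"
- "final"
- "postproc"
- "early"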
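
The multiple-of-4 rule for -C follows from the FASTQ format: each read takes four lines, so, assuming the chunk size counts lines, any other value would split a read across two chunks. A minimal sketch of a check to run before passing a custom value; valid_chunk_size is a hypothetical helper.

```shell
# Succeeds only if the value is a positive integer divisible by 4.
valid_chunk_size() {
    case $1 in
        ''|*[!0-9]*|0*) return 1 ;;   # reject empty, non-numeric, zero, or zero-padded input
    esac
    [ $(( $1 % 4 )) -eq 0 ]
}

for size in 90000000 1000001; do
    if valid_chunk_size "$size"; then
        echo "$size: ok ($(( size / 4 )) reads per chunk)"
    else
        echo "$size: not a multiple of 4"
    fi
done
```

With the default of 90000000 lines, each chunk holds 22500000 reads.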

./scripts/juicer.sh -d /public/home/zpliu/HiCSoftware/juicer/test  -z /public/home/zpliu/HiCSoftware/juicer/references/hg19.fa -p /public/home/zpliu/HiCSoftware/juicer/chromsome.bed  -s none   -D /public/home/zpliu/HiCSoftware/juicer/ -q  q2680v2  -L q2680v2
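
If no chromosome sizes file exists yet for -p, one can be derived from the genome FASTA. This is a sketch under the assumption that the file is the usual two-column table of sequence name and length, separated by a tab; fasta_chrom_sizes is a hypothetical helper and the demo FASTA is made up.

```shell
# Print "name<TAB>length" for every sequence in a FASTA file.
# The name is the header text up to the first whitespace.
fasta_chrom_sizes() {
    awk '
        /^>/ {
            if (name != "") print name "\t" len
            name = substr($1, 2); len = 0
            next
        }
        { len += length($0) }
        END { if (name != "") print name "\t" len }
    ' "$1"
}

# Tiny made-up FASTA for illustration.
demo_fa=$(mktemp)
printf '>chr1 test\nACGT\nAC\n>chr2\nGGG\n' > "$demo_fa"
fasta_chrom_sizes "$demo_fa"
rm -f "$demo_fa"
```

For a real genome, redirect the output to a file and pass its absolute path to -p.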

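4. Errors

ModuleCmd_Load.c(213):ERROR:105: Unable to locate a modulefile for 'seq/bwa/0.7.8'
The script asks for a module that the cluster does not provide. Edit the script so that it loads the BWA version available on your cluster.

## Edit line 74 of the script: change
load_bwa="module load seq/bwa/0.7.8"
## to
load_bwa="module load BWA/0.7.17"

Check the other module load lines in the script in the same way:

load_java="module load dev/java/jdk1.7"
load_cuda="module load dev/cuda/7.0.28"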
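
The same edit can be made without opening an editor. A minimal sketch using sed; set_bwa_module is a hypothetical helper, the module name must not contain | or &, and the demo works on a throwaway file rather than the real juicer.sh. Run module avail on the cluster first to see which module names actually exist.

```shell
# Replace the module name on the load_bwa line of a script, keeping a .bak backup.
# $1: script path, $2: module name available on this cluster (e.g. BWA/0.7.17)
set_bwa_module() {
    sed -i.bak "s|^\([[:space:]]*load_bwa=\"module load \)[^\"]*\"|\1$2\"|" "$1"
}

# Demo on a throwaway file containing the two lines quoted above.
demo_script=$(mktemp)
printf '%s\n' 'load_bwa="module load seq/bwa/0.7.8"' \
              'load_java="module load dev/java/jdk1.7"' > "$demo_script"
set_bwa_module "$demo_script" "BWA/0.7.17"
grep '^load_' "$demo_script"
rm -f "$demo_script" "$demo_script.bak"
```

Only the load_bwa line changes; load_java and load_cuda need the same treatment if their modules are named differently on your cluster.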

5. CPU version

Suitable for small datasets

Symlink the whole CPU directory as scripts, in the same way as for the cluster version

./scripts/juicer.sh -d /public/home/zpliu/HiCSoftware/juicer/test  -z /public/home/zpliu/HiCSoftware/juicer/references/hg19.fa -p /public/home/zpliu/HiCSoftware/juicer/chromsome.bed  -s none   -D /public/home/zpliu/HiCSoftware/juicer/ -t 5

6. Output

The .hic files are written to the sample/aligned directory

Intermediate files can be deleted with the cleanup.sh script

.
    ├── abnormal.sam
    ├── collisions_dups.txt
    ├── collisions_nodups.txt
    ├── collisions.txt
    ├── dups.txt
    ├── header
    ├── inter_30_contact_domains
    ├── inter_30.hic
    ├── inter_30_hists.m
    ├── inter_30.txt
    ├── inter.hic
    ├── inter_hists.m
    ├── inter.txt
    ├── merged_nodups.txt
    ├── merged_sort.txt
    ├── opt_dups.txt
    └── unmapped.sam
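
A quick sanity check on a finished run is the share of duplicate contacts. This is a sketch under the assumption that merged_nodups.txt and dups.txt each hold one contact per line; dup_summary is a hypothetical helper, and the demo uses made-up files rather than real output.

```shell
# Print unique and duplicate contact counts plus the duplicate percentage
# for an aligned/ directory, assuming one contact per line in each file.
dup_summary() {
    dir=$1
    uniq_n=$(wc -l < "$dir/merged_nodups.txt")
    dup_n=$(wc -l < "$dir/dups.txt")
    awk -v u="$uniq_n" -v d="$dup_n" 'BEGIN {
        total = u + d
        pct = (total > 0) ? 100 * d / total : 0
        printf "unique=%d duplicates=%d duplicate_pct=%.1f\n", u, d, pct
    }'
}

# Demo with made-up files: three unique contacts and one duplicate.
demo_dir=$(mktemp -d)
printf 'a\nb\nc\n' > "$demo_dir/merged_nodups.txt"
printf 'x\n' > "$demo_dir/dups.txt"
dup_summary "$demo_dir"
rm -rf "$demo_dir"
```

Run it before cleanup.sh if that script removes either file on your installation.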

install juicer - BiocottonHub/BioSoftware GitHub Wiki
