FAQ - loosolab/TOBIAS GitHub Wiki

FAQ (Frequently Asked Questions)

1. What is the recommended workflow for the TOBIAS tools?

The TOBIAS tools are intended to be run in the following order:

ATACorrect
ScoreBigwig
BINDetect

You can also use the pre-set Snakemake pipeline to automate the analysis across many conditions.
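As a rough sketch of that order for two conditions, the function below chains the three steps. The file names (WT.bam, treatment.bam, genome.fa, motifs.jaspar, the output directories) are placeholders, and the flags and the "_corrected.bw" output name are assumptions based on the examples in the TOBIAS README, so please confirm them with `TOBIAS <tool> --help` for your installed version.

```shell
#!/bin/sh
# Sketch of the ATACorrect -> ScoreBigwig -> BINDetect order for two conditions.
# Assumptions: WT.bam and treatment.bam exist in the working directory, and the
# flags / output names match the TOBIAS README examples for your version.

tobias_workflow() {
    genome="$1"   # genome fasta (placeholder name)
    peaks="$2"    # merged peaks across conditions
    motifs="$3"   # motif file, e.g. in JASPAR format

    for cond in WT treatment; do
        # 1) Correct the Tn5 insertion bias within the peak regions
        TOBIAS ATACorrect --bam "${cond}.bam" --genome "$genome" \
            --peaks "$peaks" --outdir "ATACorrect_${cond}" --cores 8 || return 1

        # 2) Calculate footprint scores from the corrected signal
        #    (the "_corrected.bw" file name is an assumption, see above)
        TOBIAS ScoreBigwig --signal "ATACorrect_${cond}/${cond}_corrected.bw" \
            --regions "$peaks" --output "${cond}_footprints.bw" --cores 8 || return 1
    done

    # 3) Compare TF binding between the two conditions
    TOBIAS BINDetect --motifs "$motifs" \
        --signals WT_footprints.bw treatment_footprints.bw \
        --genome "$genome" --peaks "$peaks" \
        --outdir BINDetect_output --cond_names WT treatment --cores 8
}

# Example call (placeholder file names):
#   tobias_workflow genome.fa merged_peaks.bed motifs.jaspar
```

ATACorrect and ScoreBigwig are run once per condition, whereas BINDetect is run once with the footprint tracks of all conditions.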

2. What .bam-file should I use as input?

You can use any .bam-file containing ATAC-seq reads. It is recommended that you remove PCR duplicates, as these can otherwise bias the footprinting. However, you do not need to shift the reads by +4/-5 bp, as TOBIAS ATACorrect does this internally.
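One possible way to remove PCR duplicates is samtools markdup. The choice of samtools, the paired-end assumption, and all file names below are assumptions for illustration; other tools (e.g. Picard MarkDuplicates) are equally valid, and nothing here is specific to TOBIAS.

```shell
#!/bin/sh
# Sketch: remove PCR duplicates from a paired-end .bam-file with samtools.
# samtools as the tool and the intermediate file names are assumptions.

remove_duplicates() {
    in_bam="$1"   # input alignments
    prefix="$2"   # prefix for intermediate and output files

    # Sort by read name so that mates are adjacent for fixmate
    samtools sort -n -o "${prefix}.namesort.bam" "$in_bam" || return 1
    # Add mate score tags (-m), which markdup uses to pick the read to keep
    samtools fixmate -m "${prefix}.namesort.bam" "${prefix}.fixmate.bam" || return 1
    # markdup expects coordinate-sorted input
    samtools sort -o "${prefix}.possort.bam" "${prefix}.fixmate.bam" || return 1
    # -r removes the duplicates instead of only flagging them
    samtools markdup -r "${prefix}.possort.bam" "${prefix}.dedup.bam" || return 1
    # Index the final file for downstream tools
    samtools index "${prefix}.dedup.bam"
}

# Example call (placeholder file names):
#   remove_duplicates WT.bam WT
```

The resulting 'WT.dedup.bam' can be used as input without any further read shifting.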

3. What peak-file should I use as input?

You can use any .bed-file containing open chromatin regions from peak-calling, e.g. from MACS2 or similar. If you are planning to compare several conditions with each other, e.g. 'WT.bam' with 'treatment.bam', you should call peaks for each condition separately to obtain 'WT_peaks.bed' and 'treatment_peaks.bed', and then merge these using e.g. bedtools:

cat WT_peaks.bed treatment_peaks.bed | bedtools sort | bedtools merge > merged_peaks.bed

You should then use 'merged_peaks.bed' throughout the TOBIAS tools.

4. How do I deal with replicates?

TOBIAS does not handle individual biological replicates, and it is therefore recommended to merge replicate .bam-files prior to correction and footprinting. Merging increases the effective sequencing depth and simplifies the downstream interpretation of the results.
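A minimal sketch of such a merge, assuming samtools is available and the replicate files are coordinate-sorted. The tool choice and the file names are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: merge replicate .bam-files into one file per condition.
# Assumes samtools is installed and all inputs are coordinate-sorted.

merge_replicates() {
    if [ "$#" -lt 3 ]; then
        echo "usage: merge_replicates <out.bam> <rep1.bam> <rep2.bam> [...]" >&2
        return 1
    fi
    out_bam="$1"
    shift   # the remaining arguments are the replicate files

    # samtools merge takes the output file first, then the inputs;
    # it refuses to overwrite an existing output unless -f is given
    samtools merge "$out_bam" "$@" || return 1
    # Index the merged file for downstream tools
    samtools index "$out_bam"
}

# Example call (placeholder file names):
#   merge_replicates WT.bam WT_rep1.bam WT_rep2.bam
```

The merged 'WT.bam' then takes the place of the individual replicates in all subsequent steps.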

5. I get an error stating "too many files open" - what do I do?

TOBIAS BINDetect and some of the other tools keep files open per motif/TF throughout the run, and this can exceed the system limit for open file handles. You can raise the limit for the current shell session using the command ulimit:

$ ulimit -n 3000

Replace 3000 with the amount needed; this scales with the number of motifs used.
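If you want to check the current limits before changing them, the sketch below may help. The function name is hypothetical, and 3000 is only the example value used above. Note that the new limit applies to the current shell and the processes started from it, so it has to run in the same shell session (or script) that launches TOBIAS.

```shell
#!/bin/sh
# Sketch: raise the soft open-file limit of the current shell if it is too low.
# The function name is hypothetical; 3000 is only an example value.

raise_open_file_limit() {
    wanted="$1"
    soft=$(ulimit -Sn)   # limit currently in effect
    hard=$(ulimit -Hn)   # ceiling a non-root user cannot exceed

    # Nothing to do if the soft limit is already high enough
    if [ "$soft" = "unlimited" ] || [ "$soft" -ge "$wanted" ]; then
        echo "open-file limit already sufficient: $soft"
        return 0
    fi

    # Fall back to the hard limit if the requested value is above it
    if [ "$hard" != "unlimited" ] && [ "$hard" -lt "$wanted" ]; then
        echo "hard limit ($hard) is below $wanted; using $hard instead" >&2
        wanted="$hard"
    fi

    ulimit -Sn "$wanted" && echo "open-file limit raised to $(ulimit -Sn)"
}

raise_open_file_limit 3000
```

If the hard limit itself is too low, it has to be raised by a system administrator; alternatively, the run can be restricted to a smaller set of motifs.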

6. I have another question not answered here...

Please search the issues page to see if someone has already asked a similar question. Otherwise, please feel free to open a new issue.