FAQ - loosolab/TOBIAS GitHub Wiki
FAQ (Frequently Asked Questions)
1. What is the recommended workflow for the TOBIAS tools?
The TOBIAS tools are intended to be run in the following order:
- ATACorrect
- ScoreBigwig
- BINDetect
You can also check out the pre-set Snakemake pipeline to automate the analysis for many conditions.
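The order above can be sketched as a small shell script. This is a minimal dry-run sketch, not the official pipeline: all file names (sample.bam, genome.fa, motifs.jaspar, tobias_out) are hypothetical placeholders, the assumption that ATACorrect names its output '<prefix>_corrected.bw' should be verified against your own output directory, and the option names are assumptions based on the examples in the project README, so check them against each tool's --help output before use.

```shell
# Dry run by default: each command is printed rather than executed.
# Set DRY_RUN=0 to actually run the tools (requires TOBIAS to be installed).
DRY_RUN=${DRY_RUN:-1}

# Hypothetical placeholder inputs; replace with your own files.
bam="sample.bam"
genome="genome.fa"
peaks="merged_peaks.bed"
motifs="motifs.jaspar"
outdir="tobias_out"
prefix="sample"   # assumption: output files are named after the .bam file
cores=4

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

tobias_workflow() {
  # 1. ATACorrect: correct the Tn5 insertion bias within the peak regions.
  run TOBIAS ATACorrect --bam "$bam" --genome "$genome" --peaks "$peaks" \
      --outdir "$outdir" --cores "$cores"

  # 2. ScoreBigwig: calculate footprint scores from the corrected signal.
  run TOBIAS ScoreBigwig --signal "$outdir/${prefix}_corrected.bw" \
      --regions "$peaks" --output "$outdir/${prefix}_footprints.bw" \
      --cores "$cores"

  # 3. BINDetect: estimate TF binding from footprint scores and motifs.
  run TOBIAS BINDetect --motifs "$motifs" \
      --signals "$outdir/${prefix}_footprints.bw" --genome "$genome" \
      --peaks "$peaks" --outdir "$outdir/BINDetect" --cores "$cores"
}

tobias_workflow
```

Running the script as-is only prints the three commands in order, which makes it easy to review the paths before committing to a full run.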
2. What .bam-file should I use as input?
You can use any .bam-file containing ATAC-seq reads. It is recommended that you remove PCR duplicates, as these can otherwise influence footprinting. However, you do not need to shift the reads by +4/-5 to account for the Tn5 insertion offset, as TOBIAS ATACorrect does this internally.
3. What peak-file should I use as input?
You can use any .bed-file containing open chromatin regions from peak-calling, e.g. from MACS2 or a similar peak caller. If you are planning to compare several conditions with each other, e.g. 'WT.bam' with 'treatment.bam', you should call peaks for each condition separately (giving 'WT_peaks.bed' and 'treatment_peaks.bed') and merge them, e.g. using bedtools:
cat WT_peaks.bed treatment_peaks.bed | bedtools sort | bedtools merge > merged_peaks.bed
You should then use 'merged_peaks.bed' throughout the TOBIAS tools.
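To make it concrete what the merged peak set looks like, here is a small illustration on toy data. It deliberately does not call bedtools: the merge_peaks function is a hypothetical stand-in written with sort and awk that mimics the sort-then-merge step (overlapping or directly adjacent intervals on the same chromosome are collapsed into one), and the coordinates are invented for the example. For real data, the bedtools command above remains the recommended route.

```shell
set -eu

# Toy peak files with invented coordinates (columns: chrom, start, end).
workdir=$(mktemp -d)
printf 'chr1\t100\t200\nchr1\t500\t600\n' > "$workdir/WT_peaks.bed"
printf 'chr1\t150\t250\nchr2\t10\t50\n'   > "$workdir/treatment_peaks.bed"

# Hypothetical helper: sort by chromosome and start, then collapse
# intervals that overlap or touch into a single interval.
merge_peaks() {
  cat "$@" | sort -k1,1 -k2,2n | awk 'BEGIN { OFS = "\t" }
    NR == 1 { cur_chrom = $1; cur_start = $2; cur_end = $3; next }
    $1 == cur_chrom && $2 <= cur_end {
      # Same chromosome and overlapping: extend the current interval.
      if ($3 > cur_end) cur_end = $3
      next
    }
    {
      # No overlap: emit the finished interval and start a new one.
      print cur_chrom, cur_start, cur_end
      cur_chrom = $1; cur_start = $2; cur_end = $3
    }
    END { if (NR > 0) print cur_chrom, cur_start, cur_end }'
}

merge_peaks "$workdir/WT_peaks.bed" "$workdir/treatment_peaks.bed" \
  > "$workdir/merged_peaks.bed"
cat "$workdir/merged_peaks.bed"
```

In this toy case the two overlapping chr1 peaks (100-200 and 150-250) become one region (100-250), while the peaks found in only one condition are kept unchanged, so every condition is later scored over the same set of regions.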
4. How do I deal with replicates?
TOBIAS does not deal with individual biological replicates, and it is therefore recommended to merge replicate .bam-files prior to correction and footprinting. This increases the effective sequencing depth and simplifies the downstream interpretation of the results.
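One common way to merge replicates is samtools, which is not part of TOBIAS; the sketch below assumes it is installed and that the replicate files are coordinate-sorted. The file names (WT_rep1.bam, WT_rep2.bam, WT_merged.bam) and the merge_replicates helper are hypothetical. The helper only prints the commands unless DRY_RUN=0 is set, so it can be inspected safely first.

```shell
# Dry run by default: commands are printed rather than executed.
# Set DRY_RUN=0 to run them (requires samtools and existing input files).
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: merge_replicates <output.bam> <rep1.bam> <rep2.bam> ...
merge_replicates() {
  out=$1
  shift
  # Combine all replicate .bam-files into a single file.
  run samtools merge "$out" "$@"
  # Index the merged file so that tools can access it by region.
  run samtools index "$out"
}

merge_replicates WT_merged.bam WT_rep1.bam WT_rep2.bam
```

The merged file (here 'WT_merged.bam') is then used as the single .bam input for that condition.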
5. I get an error stating "too many files open" - what do I do?
TOBIAS BINDetect and other tools keep files open per motif/TF throughout the run, and this can exceed the system limit for open file handles. You can raise the limit for the current shell session using the ulimit command:
$ ulimit -n 3000
(or whatever limit is needed; this scales with the number of motifs used)
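Before raising the limit, it can help to check what the system currently allows. The following is a minimal sketch, assuming a POSIX-like shell where the ulimit builtin supports the -S (soft) and -H (hard) options; the value 3000 is simply the example figure from above, not a computed requirement. It raises only the soft limit and never requests more than the hard limit permits.

```shell
wanted=3000            # example value from above; adjust to your motif count
soft=$(ulimit -Sn)     # current soft limit on open files
hard=$(ulimit -Hn)     # upper bound an unprivileged user can raise it to

echo "open-file limits: soft=$soft hard=$hard"

# Never ask for more than the hard limit allows.
if [ "$hard" != "unlimited" ] && [ "$wanted" -gt "$hard" ]; then
  echo "requested $wanted exceeds the hard limit; using $hard instead" >&2
  wanted=$hard
fi

# Raise only the soft limit, and only if it is currently too low.
if [ "$soft" != "unlimited" ] && [ "$soft" -lt "$wanted" ]; then
  ulimit -Sn "$wanted"
fi

echo "soft limit is now: $(ulimit -Sn)"
```

The new limit applies only to the current shell and the processes started from it, so the TOBIAS command has to be run in the same session afterwards. If the hard limit itself is too low, it has to be raised by a system administrator.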
6. I have another question not answered here...
Please search the issues page to see if someone has already asked a similar question. Otherwise, please feel free to open a new issue.