Home - mbassalbioinformatics/SLICER GitHub Wiki
Welcome to the SLICER Wiki!
SLICER (Sequencing Long-read Identifier of Complex Element Regions) is a bioinformatics pipeline for analyzing complex engineered DNA constructs from PacBio long-read sequencing data. This Wiki provides detailed documentation to help you install, configure, and run SLICER, as well as understand its outputs.
Navigation
- Installation: Step-by-step installation guide.
- Input Files and Configuration: Understanding input requirements and the configuration file.
- SLICER Workflow: A detailed look at how SLICER processes data.
- Running SLICER: How to execute SLICER.
- De Novo Reference Prediction: Understanding the "Slope" and "Distance" methods.
- Output Files: Explanation of the results generated by SLICER.
- Troubleshooting and FAQ: Common issues and questions.
- Tutorials: Example use cases.
Quick Overview
SLICER is designed to:
- Process PacBio/ONT long-read sequencing data supplied as unaligned BAM (uBAM) or gzipped FASTQ (fq.gz).
- Dynamically identify and extract key elements (barcodes, core sequences) from engineered DNA constructs using user-defined anchor sequences.
- Perform robust demultiplexing of pooled libraries.
- Generate reference sequences de novo when prior references are unavailable.
- Provide comprehensive quantification and quality reports.
It is particularly useful for analyzing constructs from Golden Gate assemblies, plasmid libraries, and CRISPR-related sequencing experiments.
Assumed Read Structure
SLICER assumes your sequenced constructs generally follow one of these two structures:
[Left Backbone] --- LFS --- Barcode --- RFS --- Core Sequence --- [Right Backbone]
OR
[Left Backbone] --- LFS --- Core Sequence --- RFS --- Barcode --- [Right Backbone]
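Where (positions are described for the first layout; in the second, the barcode and the core sequence swap places):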
- LFS: Left Flanking Sequence (immediately upstream of the barcode).
- RFS: Right Flanking Sequence (between the barcode and the core sequence).
- Core Sequence: The main insert or region of interest downstream of the barcode and RFS.
- Left/Right Backbone: Plasmid backbone sequences flanking the entire insert.
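To make the two layouts concrete, here is a minimal Python sketch that assembles a toy read in each ordering. The function name `build_read` and all sequences are hypothetical placeholders chosen for illustration; none of them come from SLICER itself.

```python
def build_read(left_bb, lfs, barcode, rfs, core, right_bb, barcode_first=True):
    """Concatenate construct elements in one of the two assumed orders."""
    if barcode_first:
        parts = [left_bb, lfs, barcode, rfs, core, right_bb]
    else:
        parts = [left_bb, lfs, core, rfs, barcode, right_bb]
    return "".join(parts)


# Toy sequences (hypothetical, far shorter than real constructs).
elements = dict(
    left_bb="GGGG",
    lfs="ACGTAC",
    barcode="TTAGGC",
    rfs="CATCAT",
    core="AAAACCCC",
    right_bb="TGTG",
)

read_a = build_read(**elements, barcode_first=True)   # barcode before core
read_b = build_read(**elements, barcode_first=False)  # core before barcode
print(read_a)
print(read_b)
```

Both reads have the same length and the same backbone and flanking sequences; only the positions of the barcode and the core sequence differ.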
SLICER uses short, user-defined "anchor" motifs to identify these regions:
- LFS_end: The last slen bases of the LFS.
- RFS_start: The first slen bases of the RFS.
- RFS_end: The last slen bases of the RFS.
- RBS_start: The first slen bases of the Right Backbone Sequence (immediately downstream of the core sequence).
Accurate definition of these anchor sequences and the slen parameter is crucial for SLICER's performance. See Input Files and Configuration for details.
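The relationship between the flanking sequences, slen, and the four anchors can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not SLICER's actual implementation: the helper names `derive_anchors` and `extract_elements` are hypothetical, the sequences are toy data, only the barcode-first layout is handled, and matching is exact, whereas a real pipeline would need to tolerate sequencing errors.

```python
def derive_anchors(lfs, rfs, right_backbone, slen):
    """Take slen-base anchors from the ends of each flanking region."""
    return {
        "LFS_end": lfs[-slen:],
        "RFS_start": rfs[:slen],
        "RFS_end": rfs[-slen:],
        "RBS_start": right_backbone[:slen],
    }


def extract_elements(read, anchors):
    """Return (barcode, core) for the barcode-first layout, or None.

    Uses exact string matching only (an assumption made for this sketch).
    """
    i = read.find(anchors["LFS_end"])
    if i == -1:
        return None
    bc_start = i + len(anchors["LFS_end"])
    bc_end = read.find(anchors["RFS_start"], bc_start)
    if bc_end == -1:
        return None
    j = read.find(anchors["RFS_end"], bc_end)
    if j == -1:
        return None
    core_start = j + len(anchors["RFS_end"])
    core_end = read.find(anchors["RBS_start"], core_start)
    if core_end == -1:
        return None
    return read[bc_start:bc_end], read[core_start:core_end]


# Toy construct (hypothetical sequences, not from a real library).
left_backbone = "TCTCTCTC"
lfs = "ACGTACGGTA"
barcode = "TTAGGCAT"
rfs = "CCTGAGATCG"
core = "AAAACCCCAAAA"
right_backbone = "GTGTGAGAGA"

anchors = derive_anchors(lfs, rfs, right_backbone, slen=4)
read = left_backbone + lfs + barcode + rfs + core + right_backbone
result = extract_elements(read, anchors)
print(anchors)
print(result)
```

The sketch also shows why the choice of slen matters: an anchor that is too short may occur by chance inside the barcode or core sequence and shift the extracted boundaries, while an exact-match anchor that is too long is more likely to be disrupted by a sequencing error.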