Import and Manage SMRT Cell Data to SMRT Portal
###Primary Analysis Overview###
Once sequencing is initiated, the system's computational blade center performs real-time signal processing, base calling, and quality assessment. Primary analysis data, including read length distribution, polymerase speed, and quality measurements, are streamed directly to the secondary analysis software. These data, as well as trace and pulse data, are also available through the RS Touch and RS Remote interfaces for quick assessment of a sequenced SMRT Cell.
###What files are transferred to secondary storage from primary analysis on the RSII blade server?###
Below is a typical directory hierarchy of the files transferred from the primary analysis blade server to the secondary storage server:
/path/to/secondary/storage/2420294/0011
├── Analysis_Results
│   ├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.1.bax.h5
│   ├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.1.log
│   ├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.1.subreads.fasta
│   ├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.1.subreads.fastq
│   ├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.2.bax.h5
│   ├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.2.log
│   ├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.2.subreads.fasta
│   ├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.2.subreads.fastq
│   ├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.3.bax.h5
│   ├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.3.log
│   ├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.3.subreads.fasta
│   ├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.3.subreads.fastq
│   ├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.bas.h5
│   ├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.sts.csv
│   └── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.sts.xml
├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.1.xfer.xml
├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.2.xfer.xml
├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.3.xfer.xml
├── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.mcd.h5
└── m140415_143853_42175_c100635972550000001823121909121417_s1_p0.metadata.xml
1 directory, 20 files
###What files are required for importing SMRT Cells into SMRT Portal?###
To import SMRT Cells into SMRT Portal, the directory structure shown above must be preserved. At minimum, SMRT Portal needs the *.metadata.xml file and all *.bax.h5 and *.bas.h5 files to recognize a SMRT Cell. The bax.h5 files contain base call information from the sequencing run, and the bas.h5 file is essentially a pointer to the three bax.h5 files. The *.metadata.xml file contains top-level information about the data, including the sequencing enzyme and chemistry used, the sample name, and other metadata. The *.mcd.h5 file is not strictly required.
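As a rough pre-import check, the minimum file set described above can be verified with a short script. This is only a sketch: the function names `find_import_files` and `missing_import_files` are hypothetical, and it assumes the layout from the example listing, with the *.metadata.xml file at the top of the SMRT Cell directory and the *.bax.h5 and *.bas.h5 files under Analysis_Results. It does not reproduce whatever validation SMRT Portal itself performs.

```python
from pathlib import Path


def find_import_files(cell_dir):
    """Collect the files named in the text as the minimum for import.

    Assumes the layout in the example listing: metadata.xml at the top
    level of the SMRT Cell directory, bas.h5/bax.h5 in Analysis_Results.
    """
    cell_dir = Path(cell_dir)
    results = cell_dir / "Analysis_Results"
    has_results = results.is_dir()
    return {
        "metadata.xml": sorted(cell_dir.glob("*.metadata.xml")),
        "bas.h5": sorted(results.glob("*.bas.h5")) if has_results else [],
        "bax.h5": sorted(results.glob("*.bax.h5")) if has_results else [],
    }


def missing_import_files(cell_dir):
    """Return the required file types that have no matching file."""
    found = find_import_files(cell_dir)
    return [kind for kind, paths in found.items() if not paths]


if __name__ == "__main__":
    # Placeholder path taken from the example listing above.
    missing = missing_import_files("/path/to/secondary/storage/2420294/0011")
    print("missing:", missing if missing else "nothing")
```

An empty result only means that at least one file of each required type is present. It does not confirm that the files are intact or that all three bax.h5 parts exist.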
###SMRT Pipe Job Directory Hierarchy###
SMRT Pipe job output directories all share the same basic top-level layout:
$SMRT_ROOT/userdata/jobs/<JOB_PREFIX>/<JOB_ID>/
├── data/
├── log/
├── movie_metadata/
├── reference/
├── results/
├── workflow/
├── index.html
├── input.fofn
├── input.xml
├── job.sh
├── metadata.rdf
├── settings.xml
└── vis.jnlp
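The layout above can be compared against an actual job directory with a few lines of Python. This is a minimal sketch: the function name `missing_job_entries` and the two constant tuples are hypothetical, and the expected entries are copied from the listing. Whether every protocol produces every entry is not stated here, so a reported gap is a prompt to investigate rather than proof of a failed job.

```python
from pathlib import Path

# Entries copied from the top-level listing above.
EXPECTED_DIRS = ("data", "log", "movie_metadata", "reference", "results", "workflow")
EXPECTED_FILES = (
    "index.html",
    "input.fofn",
    "input.xml",
    "job.sh",
    "metadata.rdf",
    "settings.xml",
    "vis.jnlp",
)


def missing_job_entries(job_dir):
    """Return the expected top-level entries that are absent from job_dir.

    Directories are reported with a trailing slash, as in the listing.
    """
    job_dir = Path(job_dir)
    missing = [name + "/" for name in EXPECTED_DIRS if not (job_dir / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (job_dir / name).is_file()]
    return missing
```

For example, `missing_job_entries(job_path)` returns an empty list when all thirteen entries are present, where `job_path` is the full path to one job under `$SMRT_ROOT/userdata/jobs`.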
For more detail on specific protocol outputs, see Navigating the SMRT Pipe Job Directory.
For more information on File Format Specifications, visit [PacBio DevNet](http://www.pacbiodevnet.com).