Neurospora crassa (Fungus) Genome, Epigenome, and Transcriptome

Instrument:  PacBio RS II
Chemistry:  C3
Enzyme: P4

Summary

This dataset contains 11X raw and processed (mapped) reads for the reference OR74A strain of Neurospora crassa. DNA was obtained from strain #2489 at the [Fungal Genetics Stock Center](http://www.fgsc.net/).

This dataset also contains raw and processed (assembled, mapped, and annotated) reads for the T1 strain of Neurospora crassa. The T1 data support the findings in [this poster](http://figshare.com/articles/ENCODE_like_study_using_PacBio_sequencing/928630) by Yeadeon and Kim et al., which was presented at the [Advances in Genome Biology and Technology (AGBT) Meeting](http://www.agbt.org/) in 2014.

Download Dataset

OR74A

Raw Data
  https://s3.amazonaws.com/datasets.pacb.com/2014/Neurospora/OR74A/raw/OR74A_rawdata.tgz

Processed Data
  https://s3.amazonaws.com/datasets.pacb.com/2014/Neurospora/OR74A/reads/OR74A_filtered_subreads.fasta
  https://s3.amazonaws.com/datasets.pacb.com/2014/Neurospora/OR74A/reads/OR74A_filtered_subreads.fastq
  https://s3.amazonaws.com/datasets.pacb.com/2014/Neurospora/OR74A/reads/OR74A_aligned_reads.sam
    
  • OR74A_filtered_subreads.fasta contains 981,884,113 bases from 175,926 subreads, where half of the bases are contained in subreads longer than 7,617 nt.
  • OR74A_filtered_subreads.fastq contains the same sequences as the FASTA file, together with quality values for each base.
  • OR74A_aligned_reads.sam contains 481,063,652 mapped bases from 113,770 mapped reads, where half of the mapped bases are contained in subreads longer than 5,414 nt.
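
The "half of the bases are contained in subreads longer than N nt" figures above are N50-style statistics. As a rough illustration of how such a number can be recomputed from a FASTA file, here is a minimal Python sketch. The function names `fasta_lengths` and `n50` and the toy records are hypothetical, and the sketch assumes the common convention of counting reads of length greater than or equal to the threshold; the tool that produced the figures on this page may use a slightly different definition.

```python
from typing import Iterable, Iterator, List


def fasta_lengths(lines: Iterable[str]) -> Iterator[int]:
    """Yield the length of each sequence found in FASTA-formatted lines."""
    length = None  # stays None until the first '>' header is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length  # emit the previous record
            length = 0
        elif line and length is not None:
            length += len(line)  # sequences may wrap over several lines
    if length is not None:
        yield length  # emit the last record


def n50(lengths: Iterable[int]) -> int:
    """Return L such that sequences of length >= L hold at least half of all bases."""
    ordered: List[int] = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


# Hypothetical toy records, not taken from the dataset.
toy = [">a", "ACGTACGTAC", ">b", "ACGT", "AC", ">c", "ACG"]
toy_lengths = list(fasta_lengths(toy))
toy_n50 = n50(toy_lengths)
```

To apply the same functions to a downloaded file, pass an open file handle, for example `with open("OR74A_filtered_subreads.fasta") as fh: print(n50(fasta_lengths(fh)))`. Only the lengths are kept in memory, not the sequences themselves.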

T1

Raw Data

Processed Data
  https://s3.amazonaws.com/datasets.pacb.com/2014/Neurospora/T1/reads/T1_filtered_subreads.fa  - Coming soon
  https://s3.amazonaws.com/datasets.pacb.com/2014/Neurospora/T1/reads/T1_preassembled_reads.fa - Coming soon
  https://s3.amazonaws.com/datasets.pacb.com/2014/Neurospora/T1/reads/T1_assembled_genome.fa - Coming soon
  https://s3.amazonaws.com/datasets.pacb.com/2014/Neurospora/T1/reads/T1_modifications.csv.gz
  https://s3.amazonaws.com/datasets.pacb.com/2014/Neurospora/T1/reads/T1_annotated_transcripts.fa
  https://s3.amazonaws.com/datasets.pacb.com/2014/Neurospora/T1/reads/T1_annotated_transcripts.gff
  • T1_filtered_subreads.fa contains 9,201,000,912 bases from 1,228,610 subreads, where half of the bases are contained in subreads longer than 10,404 nt.
  • T1_preassembled_reads.fa contains 738,786,691 high-quality pre-assembled bases from 78,961 pre-assembled reads each at least 15,945 nt long.
  • T1_assembled_genome.fa contains the assembled genome of the T1 strain.
  • T1_modifications.csv.gz contains candidate interpulse duration values for all bases in forward and reverse orientation, relative to the T1 assembled genome.
  • T1_annotated_transcripts.fa contains sequences of annotated transcripts.
  • T1_annotated_transcripts.gff contains mapped locations of annotated transcripts relative to the T1 assembled genome.
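
For readers who want to work with the transcript coordinates, the following Python sketch shows one way to read the nine tab-separated columns of a GFF file. It is an illustration under stated assumptions, not a description of how this dataset was produced: the `GffFeature` type, the `parse_gff` function, and the toy lines are hypothetical, the attributes column is left unparsed because its layout differs between GFF versions, and the feature types actually present in T1_annotated_transcripts.gff are not documented on this page.

```python
from typing import Iterable, Iterator, NamedTuple


class GffFeature(NamedTuple):
    seqid: str       # sequence (contig) the feature is located on
    source: str      # program or database that produced the feature
    type: str        # feature type, e.g. "mRNA" or "exon"
    start: int       # 1-based, inclusive
    end: int         # 1-based, inclusive
    strand: str      # "+", "-", or "."
    attributes: str  # raw attribute column, left unparsed

    @property
    def length(self) -> int:
        # GFF coordinates are 1-based and inclusive on both ends.
        return self.end - self.start + 1


def parse_gff(lines: Iterable[str]) -> Iterator[GffFeature]:
    """Yield one GffFeature per well-formed data line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # skip lines that do not have the nine GFF columns
        yield GffFeature(
            fields[0], fields[1], fields[2],
            int(fields[3]), int(fields[4]),
            fields[6], fields[8],
        )


# Hypothetical toy lines, not taken from the dataset.
toy_gff = [
    "# a comment line",
    "contig_1\texample\tmRNA\t101\t350\t.\t+\t.\tID=tx1",
    "contig_1\texample\tmRNA\t900\t400\t.\t-\t.",  # only 8 columns: skipped
]
features = list(parse_gff(toy_gff))
```

With a real file, the same generator can be fed an open handle, for example `with open("T1_annotated_transcripts.gff") as fh: features = list(parse_gff(fh))`.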