February 2022 - Bozhie/transcription-modeling GitHub Wiki

Running Questions for Elphege et al

  • are these technical or biological replicates? --> biological (found in paper)
  • assigning TSS to genes when using gene aggregation in DESeq2 to detect DEGs

2.23.2022

Finished summary of heatmaps:

summary of different workflows

2.23.2022

Finished

choosing "best" TSS for aggregated genes / DESeq2 TSS mapping

  • Ensembl suggests using transcript quality tags: http://mart.ensembl.org/info/genome/genebuild/transcript_quality_tags.html
    • these tags are not downloadable using biomaRt
    • actually didn't notice any genes in the mouse gene set with the "canonical" tag, so we would have to rely on the other tags
  • from the page describing how the canonical transcript is assigned: "For everything, if required, the final disambiguation step is the lowest stable ID number (i.e. the oldest)."
    • spot-checking the tags for mouse genes with duplicated transcripts, this method seems okay, unless we want to 1. figure out the Perl API (which I'm not sure would work) or 2. consider
    • for genes with a lot of isoforms, the tags usually mark multiple (2-3) possible dominant transcripts, so they don't specify a single "best" one. At this resolution, choosing one of these 2-3 seems okay.
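
A minimal sketch of the "lowest stable ID number" tie-break quoted above, assuming the transcripts are already in a table of (gene ID, transcript ID, TSS) rows. The function names, gene labels, transcript IDs, and coordinates are all made up for illustration. This is only the final disambiguation step applied on its own, not Ensembl's full canonical-transcript procedure.

```python
import re


def stable_id_number(stable_id):
    """Return the numeric part of a stable ID, ignoring any ".version" suffix."""
    match = re.search(r"(\d+)(?:\.\d+)?$", stable_id)
    if match is None:
        raise ValueError(f"no numeric part found in {stable_id!r}")
    return int(match.group(1))


def pick_transcript_per_gene(records):
    """Keep one (transcript_id, tss) per gene: the lowest stable ID number.

    `records` is an iterable of (gene_id, transcript_id, tss) tuples.
    """
    best = {}
    for gene_id, transcript_id, tss in records:
        current = best.get(gene_id)
        if current is None or stable_id_number(transcript_id) < stable_id_number(current[0]):
            best[gene_id] = (transcript_id, tss)
    return best


# Hypothetical rows, not real annotations.
records = [
    ("GENE_A", "ENSMUST00000000010.2", 1000),
    ("GENE_A", "ENSMUST00000000003.1", 1500),
    ("GENE_B", "ENSMUST00000000007", 42),
]
chosen = pick_transcript_per_gene(records)
for gene_id, (transcript_id, tss) in sorted(chosen.items()):
    print(gene_id, transcript_id, tss)
```

If the quality tags do become available, the same function could be run on only the 2-3 tagged candidates per gene, so the stable ID is used purely as the last tie-break.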

=====

Notes to fill in

  • add link to google doc
  • how do I run kallisto/sleuth

resources to add:

looking forward

  • add notes from command-line scripts / using snakemake
    • how do I get the standard deviation of fragment length?