Entry 6: Rmd & Finding a Dataset (Week 3, 4 lecture notes) - bcb420-2025/Izumi_Ando GitHub Wiki
R Markdown Good Practices
1 - Table of Contents
- if you set `toc` to `true` in your yaml header, the table of contents is generated for you by picking up on the # headers
# example yaml header from lecture3_Rmarkdown_tips.pdf
---
title: "Something Fun"
output:
  html_document:
    toc: true
    toc_depth: 2
bibliography: my_bibliography.bib
csl: biomed-central.csl
---
2 - Code Chunk Specs
- make sure you control your code output. you do not want certain messages printing for the viewer, since some things are just for you (e.g. debugging print statements)
- you can see the options and their defaults by running `str(knitr::opts_chunk$get())` (taken from the cheat sheet linked above); the cheat sheet also has a table of them
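A minimal sketch of controlling output for the whole document, assuming it is run in a setup chunk at the top of the Rmd. The specific option values are illustrative choices, not taken from the lecture.

```r
# Set document-wide chunk defaults once, typically in a setup chunk.
# The values below are illustrative assumptions, not from the lecture.
knitr::opts_chunk$set(
  echo = TRUE,      # show the code itself
  message = FALSE,  # hide messages (e.g. package startup notes)
  warning = FALSE   # hide warnings in the knitted output
)

# Check what is now in effect for a single option
knitr::opts_chunk$get("message")
```

Individual chunks can still override these defaults in their own header, for example `{r, include=FALSE}` to run a chunk while hiding both its code and its output.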
3 - Bibtex
- you don't have to use Bibtex for citations, but it is a useful tool
How to use Bibtex
- Step 1: create a bib file.
- Step 2: add the bib file name and citation style (predefined styles here) into the yaml header. not sure yet whether the file has to be added to the workspace or can just be referred to. will figure out.
- Step 3: add the Bibtex citation to your `bib` file.
- Step 4: cite it by adding `[@tag_for_publication]` to your text, and the in-text citation and references list will be generated.
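One way to try the steps above without typing entries by hand is to let knitr write Bibtex entries for R packages. This is only a sketch: the file name `packages.bib` is hypothetical, it is assumed to sit in the same folder as the Rmd, and it would need to be listed under `bibliography:` in the yaml header. Note that `write_bib()` overwrites the target file, so it should not be pointed at a hand-edited bib file.

```r
# Hypothetical file name; assumed to live next to the Rmd file.
bib_file <- "packages.bib"

# write_bib() generates Bibtex entries for installed R packages and
# overwrites bib_file. By default each key is prefixed with "R-",
# so the knitr entry gets the tag R-knitr.
knitr::write_bib(c("knitr", "rmarkdown"), file = bib_file)

# Peek at the start of the generated file to see the entry tags
head(readLines(bib_file))
```

With that file referenced in the yaml header, writing `[@R-knitr]` in the text would cite the knitr package in the same way as `[@tag_for_publication]`.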
Types of Expression Data
- main ones: microarray, bulk RNAseq, single cell RNAseq
- microarray uses chips with oligonucleotide probes
- single cell is a specialized version of bulk RNAseq
Main focus of this course is bulk RNA seq
- bulk RNAseq types: short read, long read, and direct read, but the majority is short-read Illumina
- considerations: # of samples, sample prep method, read depth, single or paired reads
- processing: alignment & assembly > quantification > normalization & filtering
- if you are curious, the tools that can be used for each step are listed in the slides
Slide from lecture 3
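To make the "normalization & filtering" step in the processing outline above more concrete, here is a rough base R sketch using counts per million (CPM). The count matrix is made-up toy data, and the `threshold` and `min_samples` values are illustrative assumptions; the course may well use dedicated packages and different cutoffs for this.

```r
# Toy count matrix: rows are genes, columns are samples (made-up numbers)
counts <- matrix(
  c(100, 200, 150,
      0,   1,   0,
    900, 800, 850),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2", "s3"))
)

# Normalization: divide each sample (column) by its library size,
# then scale to counts per million
lib_sizes <- colSums(counts)
cpm <- sweep(counts, 2, lib_sizes, "/") * 1e6

# Filtering: keep genes whose CPM exceeds the threshold in at least
# min_samples samples (both values are illustrative assumptions)
threshold <- 1000
min_samples <- 2
keep <- rowSums(cpm > threshold) >= min_samples
filtered <- cpm[keep, , drop = FALSE]

filtered
```

In this toy example the barely expressed geneB is filtered out, while geneA and geneC are kept. The threshold is set high only because the toy library sizes are tiny.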