Fall - uic-ric/workshop-data GitHub Wiki
General Information
Workshop days
- Introduction to R: Learn the RStudio interface and the R programming language, including data types, basic programming, reading and writing data, installing packages, basic statistics, and data visualization.
- Advanced Topics in R: Topics covered include data manipulation and programming; plotting with ggplot; and additional statistical tests, including summary statistics, parametric/non-parametric tests, and multiple regression.
- Advanced Statistics in edgeR: Learn how to run pairwise, multi-group, and multi-factor statistics (e.g. one-way or two-way ANOVA, repeated measures), and batch effect corrections, in edgeR. Practice visualizing results in PCA plots and heatmaps. Applicable to RNA-seq, ChIP-seq, ATAC-seq, bisulfite-seq, and metagenomics.
- Single Cell RNA-seq: Single cell transcriptome quantification, quality control, normalization, and clustering. Statistical tests for differential expression and composition. Evaluating the robustness of results, and putative cell type identification. Integration of scRNA-seq with VDJ or feature profiling data, as well as trajectory analysis, will also be covered.
- Pathway Analysis: Review and discussion of databases, statistical methods, and tools available for pathway analysis, including Gene Ontology, KEGG, MSigDB/GSEA, and Ingenuity Pathway Analysis (IPA). Network analysis and visualizations discussed as well.
Prerequisites
The "Advanced Topics in R", "Advanced Statistics in edgeR", "Single Cell RNA-seq", and "Pathway Analysis" workshop days assume knowledge and skills that will be covered in the Introduction to R workshop. Registration for the prerequisite day is strongly recommended unless the attendee is confident in their abilities or has attended an RIC Introduction to R workshop in the past.
If an attendee opts out of prerequisite workshop day(s), they assume full responsibility for the level of skill required for the later workshop days. No refunds, discounts, or supplementary lessons will be given.
Venue(s)
- IN PERSON. Molecular Biology Research Building (MBRB) at 900 S. Ashland Ave., Rm 1152. NOTE: Light refreshments and lunch will be provided to in-person participants. Note for non-UIC registrants: The closest public parking lot is the Paulina Street garage, 915 S Paulina. The Pink Line Polk stop is also nearby.
- ONLINE. Details, i.e. URL and access codes, for the online workshop, will be provided approximately 1 week before each workshop day.
Food: Light refreshments and lunch will be provided to in-person participants. We recommend arriving at least 15 minutes early. Refreshments will be served starting at 8:30 AM. No food or drinks are allowed in the instruction room, so please come early.
Time: Sessions begin promptly at 9 a.m. and end at 4:45 p.m.
Zoom Registration: For ONLINE participants only. You will have to register for each workshop separately. Zoom registration details will be sent to all the attendees prior to the workshop. Also, be sure to read our Tips for Online workshops.
If you have any questions please contact us at [email protected].
Desktop/Laptop Requirements
A laptop or desktop computer with at least 4 GB of RAM, running Windows, macOS, or Linux, and a high-speed Internet connection. We recommend the following minimum versions for each operating system.
- Windows 10 or higher
- macOS 11 or higher is recommended. However, 10.13 or higher is REQUIRED for the workshops.
- Linux
  - Ubuntu 20 or higher
  - Red Hat 7 or higher
  - Fedora 36 or higher
  - OpenSUSE 15 or higher
External display (OPTIONAL and ONLINE Users ONLY)
If you have access to an external display, we recommend that you set up your laptop with the external display and extend your desktop so that you have separate screens for the online workshop and for your work, i.e. RStudio or an SSH client. Please note: if you do NOT have an external display, you will still be able to fully participate in the workshop. Having an external display is NOT a requirement. The following URLs have instructions for setting up a dual display for different operating systems.
- Windows 10: https://support.microsoft.com/en-us/help/4340331/windows-10-set-up-dual-monitors
- macOS: https://support.apple.com/guide/mac-help/use-multiple-displays-mchl7c7ebe08/mac
Things to install prior to the workshop
Please complete the following items before the workshop days you will be attending.
Introduction to R
Install R on your laptop
The following are links to installation instructions for R. We will be using v4 for the workshop. Please use the link that corresponds to the operating system on your laptop.
- Windows: https://cran.rstudio.com/bin/windows/base/
- macOS: https://cran.rstudio.com/bin/macosx/
  - Note for Mac M1 users: we still recommend installing the Intel build of R, which will run fine through Rosetta. You will likely have more success installing packages with this build.
  - We recommend macOS 11 or higher. However, if you are unable to upgrade to macOS 11 and are running macOS 10.13 or higher, you should install R-4.2.3.
- Linux: It is recommended to install via the package management system for your OS. If you have any questions, send an email to [email protected].
  - Debian/Ubuntu/Mint/derivatives:
    sudo apt-get install r-base
  - RedHat/CentOS/derivatives:
    sudo yum install R
If you have an older version of R installed on your laptop please update to v4. Instructions on updating R can be found at https://uic-ric.github.io/workshop/updateR.html
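To confirm which version of R is installed, a quick check along the following lines can be run in the R console. This is only a sketch: the "4.0.0" threshold is our reading of "v4" above, not an official cutoff.

```r
# Report the running R version and compare it against 4.0.0.
# The threshold is an assumption based on "v4" in the instructions above.
r_version <- getRversion()
message("Running R ", as.character(r_version))

if (r_version < "4.0.0") {
  message("This R installation is older than v4; please update before the workshop.")
} else {
  message("R version is v4 or newer.")
}
```

`getRversion()` is part of base R, so no extra packages are needed for this check.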
Install RStudio
Once you have R installed, please install RStudio. The following link has installation instructions and links to the installers for different operating systems.
For macOS users, we recommend macOS 11 or higher. However, if you are unable to upgrade your macOS system and are running macOS 10.13 or higher you can install RStudio using the installer available at https://dailies.rstudio.com/rstudio/spotted-wakerobin/desktop/macos/2022-07-2-576/
Install Rtools
This is a set of tools that allows R to compile certain packages. Please use the link that corresponds to the operating system on your laptop.
- Windows: https://cran.rstudio.com/bin/windows/Rtools/
- macOS (install Xcode and GNU Fortran compiler): https://mac.r-project.org/tools/
- Linux: these tools were likely installed by your package manager when you installed R.
XQuartz (Mac users only)
XQuartz provides the X11 window system for macOS. It is a necessary external dependency for Cairo, which in turn is used by the ComplexHeatmap package to draw heatmaps.
Advanced Topics in R
Install R, RStudio, and Rtools. Instructions can be found under Introduction to R.
In addition to the above requirements, we also recommend that you install the R packages "ggplot2", "doBy", "tidyr", "readxl", and "openxlsx" prior to the workshop day. If you have already installed these packages, there is no need to install them again.
To install these packages, open an RStudio session, and in the Console window, in the bottom left of RStudio, type the following commands. If the installation process asks you to Update all/some/none [a/s/n]: respond with n.
install.packages('ggplot2')
install.packages('doBy')
install.packages('tidyr')
install.packages('readxl')
install.packages('openxlsx')
After installation, try to load each package to confirm that the installation was successful.
library(ggplot2)
library(doBy)
library(tidyr)
library(readxl)
library(openxlsx)
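As an alternative to loading the packages one by one, the check can be done in a single pass. The following is a minimal sketch; the variable names are illustrative, and it only reports which packages are missing without installing anything.

```r
# Packages recommended for the Advanced Topics in R day.
pkgs <- c("ggplot2", "doBy", "tidyr", "readxl", "openxlsx")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) == 0) {
  message("All packages are installed.")
} else {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
}
```

Any package reported as missing can then be installed with the corresponding install.packages() command above.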
Advanced Statistics in edgeR
Install R, RStudio, and Rtools. Instructions can be found under Introduction to R.
In addition to the above requirements, we also recommend that you install the R packages "edgeR", "ComplexHeatmap", "circlize", "biomaRt", and "ggplot2" prior to the workshop day. If you have already installed these packages, there is no need to install them again.
To install these packages, open an RStudio session, and in the Console window, in the bottom left of RStudio, type the following commands. If the installation process asks you to Update all/some/none [a/s/n]: respond with n.
- CRAN packages
install.packages('ggplot2')
- Install Bioconductor. If you have not already done so, install BiocManager for Bioconductor tools
if ( ! requireNamespace("BiocManager", quietly = TRUE) )
install.packages("BiocManager")
- Bioconductor packages. Install edgeR, ComplexHeatmap, circlize, and biomaRt using BiocManager.
BiocManager::install("edgeR", update=F)
BiocManager::install("ComplexHeatmap", update=F)
BiocManager::install("circlize", update=F)
BiocManager::install("biomaRt", update=F)
- Check installations. Try to load each package to confirm that the installation was successful:
library(edgeR)
library(ComplexHeatmap)
library(circlize)
library(biomaRt)
library(ggplot2)
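Since packages that are already installed do not need to be installed again, the installation can be limited to whatever is missing. The helper below is a hypothetical sketch, not part of the workshop materials: install_if_missing and its installer argument are names introduced here for illustration, and it assumes BiocManager has already been installed as described above.

```r
# Hypothetical helper: install only the packages that cannot be loaded yet.
# `installer` defaults to BiocManager::install, which handles both
# Bioconductor and CRAN packages; it is only called when something is missing.
install_if_missing <- function(pkgs, installer = BiocManager::install) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  to_install <- pkgs[!available]
  if (length(to_install) > 0) {
    installer(to_install, update = FALSE)
  }
  # Return (invisibly) the names of the packages that were passed to the installer.
  invisible(to_install)
}

# Example call (commented out so that nothing is installed by accident):
# install_if_missing(c("edgeR", "ComplexHeatmap", "circlize", "biomaRt", "ggplot2"))
```

After running the helper, loading each package with library() as shown above remains the final confirmation that the installation worked.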
Single Cell RNA-seq
Install R, RStudio, and Rtools. Instructions can be found under Introduction to R.
In addition to the above requirements, we also recommend that you install the following R packages: "Seurat", "Matrix", "dplyr", "fossil", "ComplexHeatmap", "ggplot2", and "cowplot". Additional packages "monocle" and "CytoTRACE" are optional, but we have included take-home exercises that use these packages. If you have already installed these packages, there is no need to install them again.
- If you have not already done so, install BiocManager for Bioconductor tools
if ( ! requireNamespace("BiocManager", quietly = TRUE) )
install.packages("BiocManager")
- Seurat. Seurat itself is installed from CRAN, but it requires the multtest package from Bioconductor, so install multtest first:
BiocManager::install('multtest', update=F)
install.packages('Seurat')
- Install Matrix, dplyr, fossil, ggplot2, and cowplot from CRAN
install.packages('Matrix')
install.packages('dplyr')
install.packages('fossil')
install.packages('ggplot2')
install.packages('cowplot')
- Install ComplexHeatmap and monocle from Bioconductor. Note that monocle is optional for the workshop, but used in a take-home exercise.
BiocManager::install("ComplexHeatmap", update=F)
BiocManager::install("monocle", update=F)
- Install CytoTRACE, which is from a custom repository. This is optional, but we include a take-home exercise using CytoTRACE if you would like to try it:
- Download the CytoTRACE installation source from https://cytotrace.stanford.edu/. For the latest version go to the CytoTRACE website and then click on Install R Package in the sidebar menu.
- Execute the following commands in R. Modify PATH/TO/DOWNLOAD to reflect the directory where you downloaded the CytoTRACE installation package. You may also need to install the devtools R package.
if ( ! requireNamespace("devtools"))
install.packages("devtools")
devtools::install_local("PATH/TO/DOWNLOAD/CytoTRACE_0.3.3.tar.gz")
NOTE: If CytoTRACE could not be installed because of the error "ERROR: dependency 'sva' is not available for package 'CytoTRACE'", install "sva" using BiocManager.
BiocManager::install("sva", update=F)
Note that when you load the CytoTRACE library there may be additional warnings that the Python packages scanoramaCT and numpy are required for analyzing across multiple batches. For our purposes we do not need this functionality, and these warnings can be ignored. Refer to the CytoTRACE documentation for more details (https://cytotrace.stanford.edu/).
- Optionally, install presto, which allows Seurat to run Wilcoxon tests much more rapidly. You can skip this step, in which case running differential expression statistics will just take longer.
devtools::install_github('immunogenomics/presto')
NOTE: If you see "Bad credentials" errors relating to a GitHub PAT credential, you can remove the credential by running gitcreds::gitcreds_delete() in R (confirming that you want to delete the credentials).
- Try to load each package to confirm that the installation was successful.
library(Seurat)
library(Matrix)
library(dplyr)
library(fossil)
library(ComplexHeatmap)
library(monocle)
library(ggplot2)
library(cowplot)
library(CytoTRACE)
# if you installed presto:
library(presto)
Pathway Analysis
Install R, RStudio, Rtools and important R packages
Install R, RStudio, and Rtools. Instructions can be found under Introduction to R.
In addition to the above requirements, we also recommend that you install the following R packages: "ggplot2", "circlize", "tidyr", "reshape2", and "ComplexHeatmap". If you have already installed these packages, there is no need to install them again.
- If you have not already done so, install BiocManager for Bioconductor tools
if ( ! requireNamespace("BiocManager", quietly = TRUE) )
install.packages("BiocManager")
- Install ggplot2, circlize, tidyr, and reshape2 from CRAN.
install.packages('ggplot2')
install.packages('circlize')
install.packages('tidyr')
install.packages('reshape2')
- Install ComplexHeatmap from Bioconductor:
BiocManager::install("ComplexHeatmap", update=F)
- Try to load each package to confirm that the installation was successful:
library(ggplot2)
library(circlize)
library(tidyr)
library(ComplexHeatmap)
library(reshape2)
GenePattern registration
- Open a web browser and navigate to https://cloud.genepattern.org/gp/pages/registerUser.jsf
- Fill out the registration form
- Enter a username that you will use to login
- Create a password for your account
- Fill in your email address
- You can unselect Add me to the GenePattern users mailing list to opt-out of the email list.
- Click “I’m not a robot”
- Click Create My Account
GenePattern will then create your account and open the GenePattern web application. There are no additional steps you need to take, and you can close your web browser.
MSigDB registration
- Open a web browser and navigate to https://www.gsea-msigdb.org/gsea/register.jsp
- Fill out the registration form
- Fill in your email address
- Enter your organization, e.g. “University of Illinois at Chicago” or “University of Chicago”
- Click Register
Complimentary IPA Analysis - Workshop
The Research Informatics Core at UIC will be hosting a FREE Introduction to Ingenuity Pathway Analysis workshop, presented by a Qiagen IPA representative, on Wednesday, November 6, from 9 a.m. to 1 p.m. This will provide a deeper look into the functionality of IPA to complement the above pathway analysis workshop. This event is free and will be held both in person and virtually. Registration is required. To register, please visit go.uic.edu/ipaworkshop.
Details:
- Session 1, 9-11 a.m.: Perform Pathway Analysis on 'Omics Data
- Session 2, 11 a.m.-12 p.m.: Mine IPA's rich database for novel discoveries
- Lunch, 12-1 p.m.: QIAGEN CLC Genomics Workbench Pizza Lunch & Learn. Powerful and scalable NGS data analysis for any species, any platform, any workflow.
More information about each IPA Workshop Session can be found on the registration page go.uic.edu/ipaworkshop