3. Biostats research related SCC knowledge - bu-rcs/SA-Biostatistics GitHub Wiki

SCC Introductory Tutorial

RCS holds an introductory tutorial for new biostats students every fall.

Slides for tutorial on 09/13/2021

Please email [email protected] to request the tutorial recording.

Resource for research work

The cbs project is reserved for the research work of biostats students. Its buy-in nodes are listed below:

Queue      Compute nodes     Hardware
bstats-16  scc-mf1, scc-mf2  16 core, 256 GB RAM, Intel Ivybridge CPU
bstats-28  scc-tn4           28 core, 256 GB RAM, Intel Broadwell CPU
bstats-32  scc-zp4           32 core, 192 GB RAM, Intel Cascade Lake CPU

To use these resources, include the following lines in your qsub file:

# Specify project name
#$ -P cbs

# Specify which queue to use
#$ -q bstats-16
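
A minimal sketch of a complete job script built around these directives is shown below. The job name, the `-j y` option, and the file name `cbs_job.sh` are illustrative assumptions, not requirements of the cbs project. `NSLOTS` is set by the scheduler inside a running job, so the script falls back to 1 when it is run anywhere else.

```shell
#!/bin/bash -l

# Specify project name
#$ -P cbs

# Specify which queue to use
#$ -q bstats-16

# Illustrative job name (assumption: any name works here)
#$ -N cbs_example

# Merge the error stream into the output log
#$ -j y

# NSLOTS is provided by the scheduler; default to 1 outside a job
slots="${NSLOTS:-1}"
# HOSTNAME is set by bash; fall back to a placeholder in other shells
node="${HOSTNAME:-unknown}"

echo "Job running on ${node} with ${slots} slot(s)"
```

Saved as `cbs_job.sh`, the script would be submitted with `qsub cbs_job.sh`. The same options can also be given on the command line instead of in the file, for example `qsub -P cbs -q bstats-16 cbs_job.sh`.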

Running GWAS using the FHS pipeline

https://sites.bu.edu/fhspl/

The Framingham Analytical Pipeline is designed to let investigators efficiently perform a genome-wide association study (GWAS). The pipeline allows some flexibility but maintains a standard process so that the results are easier to work with.

Access to the pipeline requires additional approval. Please consult your supervisor about the relevant documents.

Please load the fhspl module before using the FHS pipeline:

module load fhspl

It has various features, not restricted to running GWAS:

  • go550k: general command to run GWAS
  • level1 and level2: summarize GWAS results
  • create_subset: quickly extract genotype of certain SNPs
  • seq_meta_wrapper: facilitate the use of the R package seqMeta with genotype sets available in fhspl
  • extract_by_field: search a file by the information in a specific column
  • create_posterior: generate a file containing the MACH genotype posterior probabilities for a subset of imputed SNPs
  • ...
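
After loading the module, a quick check can confirm which of the commands listed above are available in your session. This is only a sketch: it assumes the pipeline tools are exposed as executables on your PATH under exactly these names once fhspl is loaded, which the list above does not guarantee. Outside SCC, or without approved access, it simply reports the tools as missing.

```shell
# Load the module only where the `module` command exists (e.g. on SCC)
if type module >/dev/null 2>&1; then
    module load fhspl || echo "could not load fhspl" >&2
fi

found=0
missing=0
# Tool names taken from the feature list; PATH visibility is an assumption
for tool in go550k level1 level2 create_subset seq_meta_wrapper \
            extract_by_field create_posterior; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found:   $tool"
        found=$((found + 1))
    else
        echo "missing: $tool"
        missing=$((missing + 1))
    fi
done

echo "${found} found, ${missing} missing"
```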

Commonly used statistical genetics packages available on SCC

METAL: a tool for the meta-analysis of genome-wide association studies

module avail metal

EPACTS: a versatile software pipeline to perform various statistical tests for identifying genome-wide association from sequence data through a user-friendly interface

module avail epacts

locuszoom: create regional plots for loci of interest

module avail locuszoom
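
`module avail` only lists the installed versions of a package; to use one, load it with `module load`. The loop below is a small sketch that lists the versions of all three packages in one pass. It assumes the module names match the lowercase names used above, and it skips the listing when the `module` command is not defined (for example, off the cluster).

```shell
checked=0
for pkg in metal epacts locuszoom; do
    if type module >/dev/null 2>&1; then
        # List the versions installed for this package
        module avail "$pkg" 2>&1
    else
        echo "module command not available; skipping $pkg"
    fi
    checked=$((checked + 1))
done

# Then load the one you need, for example:
# module load metal
```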





Note: If you have problems installing software on SCC, check with [email protected] to see if they can install it for you.
