What does each branch & file contain? - gy315-K/REAL_FORKED_abT-Tact-cells-Team2 GitHub Wiki

Main branch

  • ATAC-seq data.csv : raw ATAC-seq data
  • mmc2.csv : raw RNA-seq data
  • filtered_abT_Tact_Stem.csv : selected abT and Tact cells for RNA-seq
  • refFlat : gene annotation dataset with strand information, used to determine TSS
  • mmc1-QC.xlsx : original QC dataset
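How strand information determines the TSS can be sketched as follows. This is a minimal illustration, not code from the repository: it assumes the refFlat file follows the standard UCSC refFlat column order (tab-separated, no header), and the two transcripts, the `read_tss` helper and the `REFFLAT_COLUMNS` name are made up for the example. Using txEnd directly as the minus-strand TSS is a simplification that ignores the one-base offset of UCSC's half-open coordinates.

```python
import csv
import io

# Column order of a UCSC refFlat file (tab-separated, no header row).
REFFLAT_COLUMNS = [
    "geneName", "name", "chrom", "strand", "txStart", "txEnd",
    "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds",
]

def read_tss(handle):
    """Yield (gene, chrom, strand, tss) for every transcript in a refFlat file."""
    reader = csv.DictReader(handle, fieldnames=REFFLAT_COLUMNS, delimiter="\t")
    for row in reader:
        # Plus strand: transcription starts at txStart.
        # Minus strand: transcription starts at txEnd.
        if row["strand"] == "+":
            tss = int(row["txStart"])
        else:
            tss = int(row["txEnd"])
        yield row["geneName"], row["chrom"], row["strand"], tss

# Two invented transcripts, for illustration only (not real annotations).
example = io.StringIO(
    "GeneA\tNM_A\tchr1\t+\t100\t500\t150\t450\t1\t100,\t500,\n"
    "GeneB\tNM_B\tchr1\t-\t800\t1200\t850\t1150\t1\t800,\t1200,\n"
)

tss_records = list(read_tss(example))
for record in tss_records:
    print(record)
```

To run this on the real file, the `io.StringIO` object would be replaced by `open("refFlat", newline="")`, assuming the file is stored uncompressed.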

ATAC-seq wrangling branch

  • mmc1-QC.xlsx : original QC dataset
  • folder : ATAC-seq
    • ATAC-seq data.csv : raw ATAC-seq data again (duplicate copy kept to avoid the risk of deleting it)
    • filtered_ATAC_abT_Tact_Stem.csv : raw ATAC-seq data only for the abT and Tact cells
    • refined_ATAC.csv : reshaped table of the selected abT and Tact cells from the ATAC-seq data, linking peak IDs to cell types in two columns
  • folder : Descriptive_Stat_ATAC
    • Stat_ATAC.ipynb : Basic descriptive statistics - examining global chromatin signal in different cell types
  • folder : QC_signal
    • SortedPopulations_abT-Tact.csv : first of the two spreadsheets in the QC dataset (mmc1), filtered for abT and T.act cells.
    • ReadStatistics_abT-Tact.csv : second of the two spreadsheets in the QC dataset (mmc1), filtered for abT and T.act cells.
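
The reshaping behind refined_ATAC.csv and the per-cell-type statistics in Stat_ATAC.ipynb might look roughly like the sketch below. It is only an illustration under stated assumptions: it assumes a wide table with one row per peak and one signal column per cell type, and the column names (`peak_id`, `cell_type_A`, `cell_type_B`), the values and the helper functions are all hypothetical, not taken from the repository.

```python
import csv
import io
import statistics
from collections import defaultdict

# Invented wide table: one row per peak, one signal column per cell type.
# Column names and values are illustrative, not taken from the repository.
wide_csv = io.StringIO(
    "peak_id,cell_type_A,cell_type_B\n"
    "peak_1,1.0,4.0\n"
    "peak_2,3.0,6.0\n"
    "peak_3,5.0,2.0\n"
)

def wide_to_long(handle, id_column="peak_id"):
    """Turn every (peak, cell type) pair into its own row."""
    rows = []
    for record in csv.DictReader(handle):
        peak = record.pop(id_column)
        for cell_type, value in record.items():
            rows.append(
                {"peak_id": peak, "cell_type": cell_type, "signal": float(value)}
            )
    return rows

def summarize(long_rows):
    """Mean and median signal per cell type."""
    by_cell = defaultdict(list)
    for row in long_rows:
        by_cell[row["cell_type"]].append(row["signal"])
    return {
        cell: {"mean": statistics.mean(vals), "median": statistics.median(vals)}
        for cell, vals in by_cell.items()
    }

long_rows = wide_to_long(wide_csv)
summary = summarize(long_rows)
for cell, stats in summary.items():
    print(cell, stats)
```

The long format keeps one observation per row, which makes grouping by cell type straightforward.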

tss_distance_analysis branch

Goal: Compute the distance from each ATAC-seq peak center to the closest TSS (from refFlat) and assess how the signal behaves across that distance. Also evaluate the correlation and trends between signal strength and proximity to the TSS.

Files Used:

  • refFlat : gene annotations with strand information to determine TSS
  • ATAC-seq/refined_ATAC.csv : ATAC-seq signal data (with peakID, signal, and summit)

Notebooks:

  • tss_distance.ipynb
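
The goal of this branch can be illustrated with a short sketch. It is not the content of tss_distance.ipynb. It assumes that the summit is used as the peak center, that TSS positions are already grouped and sorted per chromosome, and that a rank-based (Spearman) correlation is an acceptable measure of the trend. All data values and helper names (`nearest_tss_distance`, `ranks`, `pearson`, `spearman`) are invented for the example.

```python
import bisect

def nearest_tss_distance(peak_center, sorted_tss):
    """Absolute distance from a peak center to the closest TSS.

    sorted_tss must be a non-empty, ascending list of TSS positions
    on the same chromosome as the peak.
    """
    i = bisect.bisect_left(sorted_tss, peak_center)
    # Only the TSS just left and just right of the peak can be the closest.
    candidates = sorted_tss[max(i - 1, 0): i + 1]
    return min(abs(peak_center - t) for t in candidates)

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in order[start:end + 1]:
            result[k] = mean_rank
        start = end + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Invented inputs: TSS positions per chromosome and (peakID, chrom, summit, signal).
tss_by_chrom = {"chr1": [100, 1200]}
peaks = [
    ("peak_1", "chr1", 120, 9.0),
    ("peak_2", "chr1", 400, 5.0),
    ("peak_3", "chr1", 1900, 1.0),
    ("peak_4", "chr1", 1150, 7.0),
]

distances = [nearest_tss_distance(summit, tss_by_chrom[chrom])
             for _, chrom, summit, _ in peaks]
signals = [signal for *_, signal in peaks]
rho = spearman(distances, signals)
print(distances, rho)
```

A binary search over sorted TSS positions keeps the lookup at logarithmic cost per peak, which matters once the full peak table is used instead of four toy rows. A negative correlation would mean that signal tends to drop as peaks lie farther from a TSS.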