Factor_Level_Summary.R Tutorial - sciencesharon/SequencingAndMore GitHub Wiki

Factor_Level_Summary.R Tutorial

Demographic Summary Tables

Ever wanted to make a really nice demographic summary table for publication?

One that perhaps looks like this:

This is the script for you!

  • It takes metadata as input, with samples in rows and categorical (character) variables in columns.

  • It returns the count (number of samples) of each factor level in each input column, per group.

  • It can be used to compare the factor level characteristics of control and experimental groups.

  • It tests for significant differences between the input groups with Fisher's Exact Test and returns the p-values.

  • It writes the output directly to a .csv file.
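To make the idea concrete, here is a minimal base R sketch of the kind of calculation described above: counting factor levels per group and running Fisher's Exact Test on the resulting table. It is not the script's own code. The `toy` data frame and its `Group` and `Sex` columns are hypothetical, and the script may build its tables and tests differently.

```r
# Toy metadata: hypothetical columns, not the tutorial's example file.
toy <- data.frame(
  Group = c("Control", "Control", "Control", "Control",
            "Treated", "Treated", "Treated", "Treated"),
  Sex   = c("F", "F", "F", "M", "F", "M", "M", "M"),
  stringsAsFactors = TRUE
)

# Counts of each factor level (rows) in each group (columns).
counts <- table(toy$Sex, toy$Group)
print(counts)

# Fisher's exact test on the level-by-group contingency table.
p_value <- fisher.test(counts)$p.value
print(p_value)
```

With more than two groups or levels, `fisher.test()` still accepts the larger table, although it can be slow or need a larger `workspace` for big counts.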

Example Input Files


How to use the script

Step 1: Read the metadata into R

metadata <- read.csv("/path/to/your/file/Example_Metadata.csv")

Step 2: Convert your character columns to factors

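library(dplyr)  # provides %>%, mutate(), across() and all_of()

character_cols <- colnames(metadata)[1:5]  # adjust 1:5 to the positions of your character columns

metadata <- metadata %>% mutate(across(all_of(character_cols), as.factor))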
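If you would rather not depend on column positions, the same conversion can be sketched in base R by selecting columns by type. The small `metadata` data frame below is a hypothetical stand-in for your own file, and its column names are made up for illustration.

```r
# Hypothetical stand-in for the metadata read in Step 1.
metadata <- data.frame(
  Fruit  = c("Apple", "Banana", "Cherry"),
  Colour = c("Red", "Yellow", "Red"),
  Weight = c(150, 120, 8),
  stringsAsFactors = FALSE
)

# Pick out every character column by type rather than by position.
character_cols <- names(metadata)[sapply(metadata, is.character)]

# Convert those columns to factors using base R only.
metadata[character_cols] <- lapply(metadata[character_cols], factor)

# Check the result: character columns are now factors, numeric ones are untouched.
str(metadata)
```

This leaves numeric columns alone, so it is only appropriate when every character column in the file is meant to be treated as a factor.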

Step 3: Separate the groups you would like to compare

Cherry <- metadata[metadata$Fruit %in% c("Cherry"),]

Banana <- metadata[metadata$Fruit %in% c("Banana"),]

Apple <- metadata[metadata$Fruit %in% c("Apple"),]

Date <- metadata[metadata$Fruit %in% c("Date"),]

Elderberry <- metadata[metadata$Fruit %in% c("Elderberry"),]

Step 4: List your groups and their names

datasets <- list(Cherry, Banana, Apple, Date, Elderberry)

dataset_names <- c("Cherry", "Banana", "Apple", "Date", "Elderberry")
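When there are many groups, Steps 3 and 4 could be collapsed with base R's `split()`. This is a sketch under assumptions: the `metadata` data frame here is a hypothetical stand-in, and it assumes the function accepts a plain unnamed list of data frames, as in the `list(Cherry, Banana, ...)` call above.

```r
# Hypothetical stand-in for the metadata from Steps 1 and 2.
metadata <- data.frame(
  Fruit = factor(c("Cherry", "Banana", "Apple", "Banana", "Cherry")),
  Size  = factor(c("Small", "Large", "Large", "Large", "Small"))
)

# One data frame per level of Fruit, returned as a named list.
groups <- split(metadata, metadata$Fruit)

# Names come from the factor levels, so they are in alphabetical order here.
dataset_names <- names(groups)

# Drop the list names so the list has the same shape as list(Cherry, Banana, ...).
datasets <- unname(groups)
```

Note that the groups come out in factor level order rather than the order you typed them, so reorder `datasets` and `dataset_names` together if the order of the groups matters for your table.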

Step 5: Set the columns to compare

columns <- colnames(metadata)[1:5]

Step 6: Set the file destination and file name

file_path <- "/path/to/summary.csv"

Step 7: Run the function

summary <- factor_levels_summaries(datasets = datasets, dataset_names = dataset_names, columns = columns, file_path = file_path)

Example Output


You can easily format the output table in Excel or LibreOffice to look like the publication-style table shown at the top of this page.