Importing Many Data Files at Once - meyermicrobiolab/Meyer_Lab_Resources GitHub Wiki

For some projects, you may have hundreds of samples, each with its own data file. Importing each file into R individually is tedious and time-consuming. Instead, you can bulk import all of the data files at once and compile them into a single data frame within R. Note that this only works when the files all share the same format (all .csv, all .tsv, etc.) and the same columns.

First, make sure all of the data files are in one folder. Each file name should include the sample name, for example sample1_genomad_output.csv, sample2_genomad_output.csv, ...

Then, make sure the tidyverse package is installed and loaded in RStudio. Installation is only needed once; the package must be loaded in each new R session.

install.packages("tidyverse")

library(tidyverse)

Build a list of the data file paths using the following code. This step only collects the file names; the data itself is read in a later step.

list_of_files <- list.files(path = "~/filepath", recursive = TRUE, pattern = "\\.csv$", full.names = TRUE)

The filepath is the path to the folder on your computer that contains the data files. To copy it, navigate to the folder in the RStudio Files pane on the lower right, click the 'More' button, select 'Copy Folder Path to Clipboard', and paste the folder path into your code.

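The 'pattern' argument is a regular expression that designates the type of file to import: "\\.csv$" matches file names ending in .csv, and "\\.tsv$" would match .tsv files instead. Because recursive = TRUE, every matching file in the folder and its subfolders will be read in, so make sure the folder only contains the files you're interested in importing.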
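Before importing, it can help to confirm which files were actually matched. The sketch below uses a throwaway folder with made-up file names (demo_dir and notes.txt are hypothetical, for illustration only); with real data you would point list.files at your own folder and run just the last two lines.

```r
# Hypothetical demo folder; substitute the path to your own data folder.
demo_dir <- file.path(tempdir(), "bulk_import_demo")
unlink(demo_dir, recursive = TRUE)  # start from an empty folder
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir, c("sample1_genomad_output.csv",
                                            "sample2_genomad_output.csv",
                                            "notes.txt"))))

list_of_files <- list.files(path = demo_dir, recursive = TRUE,
                            pattern = "\\.csv$", full.names = TRUE)

# How many files matched, and what are they called?
length(list_of_files)
basename(list_of_files)
```

Here notes.txt is left out because it does not end in .csv. If the count is higher than your number of samples, an unrelated file is probably sitting in the folder or one of its subfolders.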

Create an R data frame with the data from every file. Use the read_csv or read_tsv function depending on the type of files you're importing. The id argument adds a column (here named file_name) that records which file each row came from.

df <- readr::read_csv(list_of_files, id = "file_name")

That's it! You've successfully imported all of your data files into a single data frame.
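The file_name column holds the full path of each source file, which is usually longer than you want for plotting or grouping. As an optional follow-up, the sketch below shows one way to reduce it to a sample name. It assumes the file naming scheme from the example above (sampleN_genomad_output.csv) and readr 2.0 or later; the demo folder and the gene and score columns are made up for illustration.

```r
library(tidyverse)

# Hypothetical demo data: two tiny files standing in for real sample outputs.
demo_dir <- file.path(tempdir(), "sample_name_demo")
unlink(demo_dir, recursive = TRUE)  # start from an empty folder
dir.create(demo_dir)
write_csv(tibble(gene = c("a", "b"), score = c(0.9, 0.4)),
          file.path(demo_dir, "sample1_genomad_output.csv"))
write_csv(tibble(gene = "c", score = 0.7),
          file.path(demo_dir, "sample2_genomad_output.csv"))

list_of_files <- list.files(path = demo_dir, pattern = "\\.csv$", full.names = TRUE)
df <- read_csv(list_of_files, id = "file_name", show_col_types = FALSE)

# Strip the folder path and the shared suffix to leave just the sample name.
df <- df %>%
  mutate(sample = str_remove(basename(file_name), "_genomad_output\\.csv$"))

# Number of rows contributed by each sample.
count(df, sample)
```

If your files use a different suffix, adjust the pattern passed to str_remove to match it.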