working with large data sets - veeninglab/BactMAP GitHub Wiki

Speed limitations

Microscopy data can get very large. While working through the BactMAP tutorials, you probably noticed that larger datasets slow things down. Functions that deal with (converted) TIFF stacks, and import functions that convert Matlab data into R dataframes, are especially slow.

We are currently working on adding optional parallelization to our functions to speed up the computation. When your datasets become very large, we recommend moving your analysis to a server.

Memory limits

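The memory limits depend on your computer and the settings in R. On Windows, in R versions before 4.2.0, you can see how much working memory R is allowed to use with the following command (memory.limit() is Windows-specific; on Linux and macOS, and in newer R versions on Windows, it is not supported and R uses whatever memory the operating system provides):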

memory.limit()

You can use the same command to change the memory limit (in Mb):

memory.limit(size=2500)
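
To judge how close a session is to its limit, it can help to see which objects take up the most memory. The sketch below uses only base R (ls(), object.size()); the helper name object_sizes() and the example objects are made up for illustration and are not part of BactMAP.

```r
# Example objects standing in for imported microscopy data (made up for illustration)
small_table <- data.frame(cell = 1:10, length = runif(10))
large_matrix <- matrix(0, nrow = 1000, ncol = 1000)

# Hypothetical helper: size in bytes of every object in an environment, largest first
object_sizes <- function(env = parent.frame()) {
  object_names <- ls(env)
  sizes <- vapply(
    object_names,
    function(n) as.numeric(object.size(get(n, envir = env))),
    numeric(1)
  )
  sort(sizes, decreasing = TRUE)
}

# Report the sizes in MB
sizes <- object_sizes()
print(round(sizes / 1024^2, 2))
```

Objects near the top of this list are the first candidates to save to disk and remove from the session.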

Clogged memory

R clears up the things it no longer uses (garbage collection) while it is working. However, some things slip through. I noticed that functions doing many computations on large datasets (like bactKymo() on timelapse data, for instance) can run into memory problems this way, especially on Windows (as opposed to Linux). I am working through these functions to see whether their garbage collection can be made more efficient.

When you get a memory error (cannot allocate vector of size ... Mb), save the data you are working with and restart your R session (in RStudio, for example, via Session > Restart R) to make sure all memory is cleared. If you get the error even in a completely fresh R session, check which other processes on your computer are using the memory you need.
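
A minimal sketch of the save-and-restart routine, using base R only (saveRDS(), rm(), gc(), readRDS()). The object cell_data and the file name are placeholders, not BactMAP output.

```r
# Stand-in for a large intermediate result (made up for illustration)
cell_data <- data.frame(cell = 1:1000, intensity = rnorm(1000))

# 1. Save the object to disk so it survives a restart.
#    tempdir() is used only so the example runs anywhere; it is deleted when
#    R exits, so use a folder in your own project for real data.
checkpoint_file <- file.path(tempdir(), "cell_data.rds")
saveRDS(cell_data, checkpoint_file)

# 2. Remove the object and ask R to release unused memory.
#    A full restart of the R session clears memory more thoroughly.
rm(cell_data)
invisible(gc())

# 3. In the fresh session, load the saved object again
cell_data <- readRDS(checkpoint_file)
```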

Step-by-step functions

To make sure you don't lose time or data while working, the functions dealing with TIFF data are split up into multiple steps. After importing a TIFF stack using extr_OriginalStack(), you can save your data before using extr_OriginalCells() to connect your cell outline data to the TIFF image stack. Afterwards, you can split up the bactKymo() step by first running prepForKymo() (see the documentation of bactKymo() for more information). This way, if you want to change your analysis, you never have to start from scratch.
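
One way to organise such a stepwise analysis is to save each intermediate result and only recompute it when no saved copy exists. The sketch below illustrates the idea with a hypothetical helper, run_or_load(), which is not part of BactMAP. The computations inside it are placeholders; in a real analysis they would be your calls to extr_OriginalStack(), extr_OriginalCells() and prepForKymo(), with the arguments described in their documentation.

```r
# Hypothetical helper: run `compute` only when no saved result exists at `path`
run_or_load <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))   # reuse the result saved by an earlier run
  }
  result <- compute()
  saveRDS(result, path)     # checkpoint the result for next time
  result
}

# tempdir() keeps the example self-contained; use a project folder for real data
out_dir <- tempdir()

# Step 1: placeholder for importing the TIFF stack
stack <- run_or_load(file.path(out_dir, "step1_stack.rds"), function() {
  matrix(runif(100), nrow = 10)
})

# Step 2: placeholder for connecting cell outlines to the stack
cells <- run_or_load(file.path(out_dir, "step2_cells.rds"), function() {
  stack * 2
})
```

To redo a step after changing its settings, delete the corresponding .rds file; the earlier steps are then loaded from disk instead of being recomputed.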
