Getting Raw Data Out of Matlab - UVA-CAMA/NICUHDF5Viewer GitHub Wiki

This page is in the advanced users section because I HIGHLY HIGHLY recommend for MOST cases that you do whatever processing you want to do within the BAP. I know using someone's git repository can be scary and seem intimidating, but I promise it is built with the perks in mind of:

  1. Sharing algorithms between institutions - you can package up your new algorithm in the BAP and send it to someone else to run who doesn't even have Matlab.
  2. People can run your code using a GUI - no programming required.
  3. You don't have to know the directory structure of your end-user because the BAP GUI has the end-user select their files of interest for you.
  4. You don't have to deal with large file data processing on your own.
  5. You can share your code with other sites via a git pull request, making your algorithm available to everyone who uses the BAP in the future.
  6. You can view your raw hdf5 file data alongside the results with the viewer.
  7. The results files are small and easy to share.
  8. You can have all your data from multiple institutions in a single result file format.

This code repository is a lot and I just want to do something simple

Nothing is simple.

Prepare yourself :)

Some of you will say, yeah, but I don't want to figure all that out and it would just be easier for me to look at the raw data so I don't have to learn all that stuff! I GET IT. IT IS A LOT. You are right, it would be faster to get "something" running on a raw data file than to learn how to incorporate it here. If you want a brady algorithm you could just say data<threshold and be done! Wow! Fast! However, it all gets more complicated as you build complexity into your algorithms: dealing with signals with different timestamps, trying to merge events together, dealing with different sampling frequencies and files from different sites, merging data from different files, wanting to browse your data and results alongside each other, adding different filtering algorithms, and managing different file structures.
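To make that concrete, here is a minimal sketch of the quick data<threshold approach. The heart rate values and the threshold of 100 are made up for illustration and are not what the BAP uses. Even this toy version already needs extra lines to turn a per-sample mask into events with a start and an end, and it does nothing yet about time stamps, gaps, or merging nearby events.

```matlab
% Made-up heart rate samples (beats per minute), for illustration only.
hr = [150 148 95 92 149 151 90 152];

% Hypothetical threshold; pick whatever your own definition requires.
threshold = 100;

% The "fast" version: one logical flag per sample.
isBrady = hr < threshold;

% Turn the per-sample flags into events by finding where runs of
% true values begin and end. Padding with false catches runs that
% touch the first or last sample.
edges    = diff([false isBrady false]);
startIdx = find(edges == 1);       % first sample of each event
endIdx   = find(edges == -1) - 1;  % last sample of each event
```

For the values above, this flags two events: samples 3 through 4, and sample 7 on its own.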

For this reason, I would highly encourage you to try it out and get to know a bit about what has been built here before you start rebuilding it on your own. My suspicion is that a lot of what you want to do might already have been done here. I don't want you to waste your time reinventing this whole process. I don't think what I have done here is perfect by any stretch of the imagination. I understand it has its limitations and weaknesses. However, 97% of the requests I receive along the lines of "I want to do this thing so I need the raw data" have already been handled here, which is why I write this paragraph of encouragement. I hope it doesn't come across as aggressive or demeaning in any way. I know this has weaknesses.

Some reasons you might need to get the raw data out of Matlab

Some of you have a genuine reason why you might actually need to get the data out. I recommend you do this with caution. You will lose out on all the functionality I listed above. This could be for you if:

  1. You have a super cool function that is implemented in another language that MATLAB is incompatible with (Note: MATLAB can call functions from a lot of other languages: https://www.mathworks.com/help/matlab/matlab_external/integrate-matlab-with-external-programming-languages-and-systems.html)
  2. You want to rewrite the BAP to work in some containerized fashion to be more efficient
  3. You want to build a better GUI beyond Matlab's limitations
  4. You are a computer scientist who wants to re-write stuff to be faster/better/stronger (or you are just smarter than me and know how to do this - please go ahead and make this better!)
  5. You want to iterate on the development of an algorithm and don't want to do every iteration through the BAP GUI interface. You can use this to get some raw data into the workspace, test and develop your algorithm, then add that algorithm into the BAP pipeline later.

An example to get the data and time stamps for heart rate

This will get the data out in a way that has all of Doug's really smart filtering/timestamp correction already done.

You will need to enter filename in the following format 'X:\Folder\YourFile'

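% filename must already hold the full path to your file, e.g. 'X:\Folder\YourFile'
info = getfileinfo(filename);
[data,~,info] = getfiledata(info,'HR');   % request the heart rate signal
[data,~,~] = formatdata(data,info,3,1);   % tformat = 3, dformat = 1
hrdata = data.x;   % heart rate values
vt = data.t;       % time stamps that go with hrdata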

There are a lot of options within formatdata.m on how you want your data to be returned. The options I use for the algorithms within the BAP are tformat = 3 and dformat = 1 as shown above. For more information about the other input options here, please read the documentation at the top of the formatdata.m function.

Note that formatdata.m also has a rawfile option if you want the truly raw data.

From here, if you want to export the raw data to another file format, you can do that however you wish. Matlab has many options. Note that though CSVs can be tempting, they can be huge for waveform signals. Like really huge. Proceed with caution.
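As a rough sketch of two of those options, the snippet below writes the same pair of vectors to a MAT-file and to a CSV using only built-in Matlab functions. The values are stand-ins so that it runs on its own; in practice hrdata and vt would come from the example above. The output locations are temporary files chosen for illustration, so swap in your own paths.

```matlab
% Stand-in vectors; replace with the hrdata and vt you pulled out above.
hrdata = [150; 148; 95; 92];
vt     = [0; 2; 4; 6];   % made-up time stamps, units depend on tformat

% Option 1: a MAT-file. Binary, so it stays compact and reloads quickly.
% For very large variables, save(..., '-v7.3') may be needed.
matPath = [tempname '.mat'];
save(matPath, 'hrdata', 'vt');

% Option 2: a CSV with one row per sample. Easy to open anywhere,
% but text output grows very quickly for waveform signals.
% (:) forces column vectors so the table has one row per sample.
csvPath = [tempname '.csv'];
exportTable = table(vt(:), hrdata(:), 'VariableNames', {'vt', 'hrdata'});
writetable(exportTable, csvPath);
```

If the data only needs to travel between Matlab sessions, the MAT-file is usually the safer choice of the two.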

What if I want something other than heart rate?

See this wiki page for the other options: https://github.com/UVA-CAMA/NICUHDF5Viewer/wiki/Variable-Names