datasource rethink - geoscience-community-codes/GISMO GitHub Wiki
Background
Datasource has been one of the most important parts of the waveform suite, and also the most convoluted. It's time to clean it up.
Proposal
Recreate datasource functionality in a smarter way.
datasource - an abstract (virtual) class that defines the interface for the following operations without implementing them itself:
- load - instructions on how to retrieve data
- save - unimplemented. Whether it will ever be implemented is undecided.
- parse - turn the loaded data into some standard MATLAB structure (?)
- presearchcriteria - how to subset the data before it is transferred. For example, with a .mat file one might want to specify ahead of time which variable names or types are of interest.
- postsearchcriteria - not yet defined; presumably how to subset the data after it has been loaded (?)
- safetycheck - run a test to make sure we can proceed without an error or segfault. This might involve checking whether the data exists, among other things.
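The interface above could be sketched roughly as follows. This is only an illustration, written in Python as a language-neutral stand-in for the eventual MATLAB class. The class name `DataSource` and all argument lists are assumptions, and the role given to `postsearchcriteria` (subsetting after loading) is a guess, since the proposal leaves it undefined.

```python
from abc import ABC, abstractmethod


class DataSource(ABC):
    """Hypothetical abstract interface; method names follow the list above."""

    @abstractmethod
    def safetycheck(self):
        """Return True if loading can proceed (e.g. the data exists)."""

    @abstractmethod
    def presearchcriteria(self, criteria):
        """Record what to request before any data is transferred."""

    @abstractmethod
    def load(self):
        """Retrieve the raw data from wherever it lives."""

    @abstractmethod
    def postsearchcriteria(self, raw, criteria):
        """Assumed meaning: subset data that has already been loaded."""

    @abstractmethod
    def parse(self, raw):
        """Turn the loaded data into some standard structure."""

    def save(self, data):
        # The proposal leaves save unimplemented, so this sketch does too.
        raise NotImplementedError("save is not implemented")
```

Because every retrieval method is abstract, `DataSource` itself cannot be instantiated; only the derived classes that fill in the methods can.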
Then, from datasource, we derive:
- FileIsData - Where the data file itself is of interest, e.g. SAC files.
- matFile - Where we're interested in MATLAB variables stored inside the file.
- online - Something like retrieving data from IRIS' web services.
- database - Interface with some database, such as Antelope.
- json - ?? Might go under FileIsData, might relate to matFile, or may be another category. Not sure yet.
- ??? - unknown.
Example matFile workflow
- Figure out the filename
- Look in the file for variables of interest
- Load specific variables
- Subset variables based on some criteria
- Using a class-specific routine, parse the variables
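The five steps above can be sketched as a small pipeline. This is an illustration only, in Python as a stand-in for the eventual MATLAB code: an in-memory dictionary (`FAKE_FILES`) plays the role of .mat files on disk, and every function name, the filename pattern, and the criteria are hypothetical. In MATLAB itself, steps 2 and 3 would likely map onto `whos('-file', filename)` and `load(filename, varnames{:})`.

```python
# Stand-in for .mat files on disk: filename -> {variable name: value}.
FAKE_FILES = {
    "station_A.mat": {"w1": [1.0, 2.0, 3.0], "w2": [4.0, 5.0], "notes": "test"},
}


def list_variables(filename):
    """Report variable names and types without handing back the values."""
    return {name: type(value).__name__
            for name, value in FAKE_FILES[filename].items()}


def load_variables(filename, names):
    """Load only the requested variables."""
    return {name: FAKE_FILES[filename][name] for name in names}


def mat_workflow(station, wanted_type, keep, parser):
    filename = f"station_{station}.mat"                       # 1. figure out the filename
    found = list_variables(filename)                          # 2. look for variables of interest
    names = [n for n, t in found.items() if t == wanted_type]
    loaded = load_variables(filename, names)                  # 3. load specific variables
    subset = {n: v for n, v in loaded.items() if keep(v)}     # 4. subset on some criteria
    return [parser(n, v) for n, v in subset.items()]          # 5. class-specific parsing


# The caller (e.g. a waveform-like class) supplies the criteria and the parser.
result = mat_workflow(
    "A",
    wanted_type="list",
    keep=lambda samples: len(samples) >= 3,
    parser=lambda name, samples: {"name": name, "npts": len(samples)},
)
```

Note that the workflow function never inspects what the variables mean; it only filters by name and type and hands the rest to the caller's routines.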
Example database workflow
- Query for data
- Check to see if data exists / is loadable
- Load class-specific data
- Using a class-specific routine, parse the variables
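The database steps above might look something like the following sketch. It is an illustration only: an in-memory SQLite database stands in for a real database such as Antelope, and the `traces` table, its columns, and all function names are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database to query against (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE traces (id INTEGER PRIMARY KEY, sta TEXT, chan TEXT, nsamp INTEGER)"
    )
    conn.executemany(
        "INSERT INTO traces (sta, chan, nsamp) VALUES (?, ?, ?)",
        [("REF", "EHZ", 6000), ("SPU", "EHZ", 3000)],
    )
    return conn


def database_workflow(conn, station, parser):
    # 1. Query for data: find which records match the request.
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM traces WHERE sta = ? ORDER BY id", (station,))]

    # 2. Check that data exists / is loadable before going further.
    if not ids:
        raise LookupError(f"no data for station {station!r}")

    # 3. Load the fields the requesting class needs.
    marks = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT sta, chan, nsamp FROM traces WHERE id IN ({marks}) ORDER BY id",
        ids,
    ).fetchall()
    records = [dict(zip(("sta", "chan", "nsamp"), row)) for row in rows]

    # 4. Parse with the class-specific routine supplied by the caller.
    return [parser(record) for record in records]


conn = make_demo_db()
labels = database_workflow(conn, "REF", parser=lambda r: f"{r['sta']}.{r['chan']}")
```

The existence check in step 2 plays the role of `safetycheck`: a request for an unknown station fails early with a clear error instead of passing empty data to the parser.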
Philosophy
The new datasource only knows where to find data and how to retrieve it. It doesn't know or care about the type of data itself, i.e. it has to stay decoupled from specific classes, such as catalog, waveform, jumpinggif, etc.
A class could register its parsing function with the datasource (as well as its subsetting routines?).
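One way the registration idea could work is sketched below, in Python as a language-neutral illustration. The `register` and `get` methods, the `loader` argument, and the string keys are all assumptions made for the example and are not part of any existing GISMO API.

```python
class DataSource:
    """Knows how to fetch raw data; knows nothing about what the data means."""

    def __init__(self, loader):
        self._loader = loader      # callable that returns raw data
        self._parsers = {}         # kind -> parsing function
        self._subsetters = {}      # kind -> optional subsetting function

    def register(self, kind, parser, subsetter=None):
        """Let a class hand over its own parsing (and subsetting) routine."""
        self._parsers[kind] = parser
        if subsetter is not None:
            self._subsetters[kind] = subsetter

    def get(self, kind):
        if kind not in self._parsers:
            raise KeyError(f"no parser registered for {kind!r}")
        raw = self._loader()
        subsetter = self._subsetters.get(kind)
        if subsetter is not None:
            raw = subsetter(raw)
        return self._parsers[kind](raw)


source = DataSource(loader=lambda: [3, 1, 2])

# Each class registers its own routines; the datasource never imports them.
source.register("waveform", parser=lambda raw: {"samples": raw})
source.register(
    "catalog",
    parser=lambda raw: {"count": len(raw)},
    subsetter=lambda raw: [x for x in raw if x > 1],
)

waveform = source.get("waveform")
catalog = source.get("catalog")
```

With this arrangement, adding a new data class means registering one more pair of functions rather than editing the datasource.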
Looking for feedback.
Please see the "datasource rework" issue and comment.