Data simulation - kreutz-lab/OmicsData GitHub Wiki
Two data simulation processes are available within the OmicsData environment. The first is based on O’Brien et al. (2018):
[full,data] = SimuDataOBrien(MV,MNAR,nfeat,nsamp);
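Here, MV is the percentage of missing values (MV), MNAR is the percentage of those missing values that are missing not at random (MNAR), and nfeat and nsamp are the numbers of features and samples. The second data simulation takes the same arguments and is based on Lazar et al. (2016):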
[full,data] = SimuDataLazar(MV,MNAR,nfeat,nsamp);
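To make the meaning of the MV and MNAR arguments concrete, the following is a minimal, self-contained sketch of how a complete matrix can be turned into one with a given share of missing values. It is not the OmicsData implementation: it assumes that MV and MNAR are given as fractions rather than percentages, that full is the complete matrix and data the matrix with missing values, and it uses plain left-censoring as a stand-in for the intensity-dependent mechanisms described in the cited papers. All numeric settings are hypothetical.

```matlab
% Illustrative sketch only; NOT the code behind SimuDataOBrien or SimuDataLazar.
% Assumption: MV is the overall fraction of missing values and MNAR is the
% share of those missing values that depend on intensity.
rng(1);                               % reproducible random numbers
nfeat = 200; nsamp = 10;              % hypothetical dimensions
MV = 0.30; MNAR = 0.50;               % hypothetical settings (fractions)

full = 20 + 2*randn(nfeat, nsamp);    % complete, log-intensity-like matrix
data = full;                          % copy that will receive the NaNs

nMV   = round(MV * numel(full));      % total number of missing values
nMNAR = round(MNAR * nMV);            % missing not at random
nMCAR = nMV - nMNAR;                  % missing completely at random

% MNAR part: remove the lowest intensities (simple left-censoring)
[~, order] = sort(full(:), 'ascend');
data(order(1:nMNAR)) = NaN;

% MCAR part: remove uniformly random entries among those still observed
observed = find(~isnan(data));
pick = observed(randperm(numel(observed), nMCAR));
data(pick) = NaN;

fprintf('Missing fraction: %.3f\n', mean(isnan(data(:))));
```

With these settings, 600 of the 2000 entries are set to NaN, half of them among the lowest intensities, so the printed missing fraction is 0.300. Comparing full and data afterwards is what allows imputation methods to be evaluated against known true values.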
O’Brien, J. J., Gunawardena, H. P., Paulo, J. A., Chen, X., Ibrahim, J. G., Gygi, S. P., and Qaqish, B. F. (2018). The effects of nonignorable missing data on label-free mass spectrometry proteomics experiments. Ann Appl Stat, 12(4), 2075–2095.
Lazar, C., Gatto, L., Ferro, M., Bruley, C., and Burger, T. (2016). Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies. J Proteome Res, 15, 1116–1125.