Filtering Microarray Correlations by Statistical Literature Analysis Yields Potential Hypotheses for Lactation Research

Citations: Ling, MHT, Lefevre, C, Nicholas, KR. 2008. Filtering Microarray Correlations by Statistical Literature Analysis Yields Potential Hypotheses for Lactation Research. The Python Papers 3(3): 4.

Link to [Abstract] and [PDF]

Here is a permanent link to this [PDF] in my own archive.

Besides natural language processing (NLP), statistical linguistics, which depends on the appearance of words or names in text, has been used to extract potential protein-protein interactions, as in PubGene and CoPub Mapper. For PubGene, it was found that the presence of 2 protein names in 1 abstract out of 10 million (1-PubGene) suggests a 60% likelihood of interaction, which increases to 72% when the names appear 5 times or more (5-PubGene). This manuscript analyzed the PubGene methods using the Poisson distribution and found that 1-PubGene is generally more stringent than 99% confidence on the Poisson distribution, which explains 1-PubGene's good performance. This study demonstrated that NLP-extracted interactions were almost a proper subset of statistically extracted interactions, suggesting that NLP can be used to annotate statistical extractions. This study also found that a majority of co-expressed genes from microarray analysis, including 7 pairs of perfectly co-expressed genes, were not mentioned together in text, suggesting that these potential interactions had not been studied experimentally. Hence, we suggest that text mining may be used to construct a "state of current knowledge" suitable for identifying potential hypotheses for further experimental research.
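
To make the comparison between co-occurrence counting and a Poisson confidence threshold more concrete, here is a minimal sketch in Python. It is not the code from the paper. It assumes a simple null model in which two names appear in abstracts independently, so the expected number of shared abstracts is `n_a * n_b / n_total`. All function names and the example counts are hypothetical and chosen only for illustration.

```python
import math


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    # P(X >= k) = 1 - P(X <= k - 1)
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def expected_cooccurrence(n_a, n_b, n_total):
    """Expected number of shared abstracts if the two names were independent.

    Assumption: this independence null model is illustrative and may differ
    from the formulation used in the paper.
    """
    return n_a * n_b / n_total


def is_significant(n_ab, n_a, n_b, n_total, alpha=0.01):
    """True if n_ab co-occurrences exceed chance at 1 - alpha confidence."""
    lam = expected_cooccurrence(n_a, n_b, n_total)
    return n_ab >= 1 and poisson_tail(n_ab, lam) < alpha


def unmentioned_pairs(coexpressed, cooccurrence_counts):
    """Keep co-expressed gene pairs that never co-occur in the literature.

    `cooccurrence_counts` maps frozenset({gene1, gene2}) to a count
    (a hypothetical data layout).
    """
    return [
        pair
        for pair in coexpressed
        if cooccurrence_counts.get(frozenset(pair), 0) == 0
    ]


if __name__ == "__main__":
    total = 10_000_000  # abstracts in the corpus, as in the 1-PubGene setting
    # Rarely mentioned names: a single shared abstract already passes 99%.
    print(is_significant(1, 50, 100, total))
    # Frequently mentioned names: a single shared abstract does not.
    print(is_significant(1, 1000, 2000, total))
```

Under this null model, a single shared abstract is already significant at 99% confidence unless both names are mentioned often enough that the expected count approaches 0.01, which is one way to read the finding that 1-PubGene is generally the more stringent criterion. The `unmentioned_pairs` helper sketches the final filtering step: co-expressed pairs with no literature co-occurrence are the candidates for new hypotheses.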