Horizontal gene transfer (HGT)
Horizontal gene transfer (HGT), also known as lateral gene transfer, appears to be one of the driving forces of prokaryotic evolution. However, it remains a challenge to distinguish HGT from gene loss, especially when analyzing a newly sequenced genome that contains species-specific regions.
Classical methods
- Methods based on phylogenetic incongruence
  - Pros: The approach is intuitive, because it follows directly from the definition of HGT.
  - Cons: A relatively large number of orthologs of the gene of interest is needed to build a gene tree that can be compared with the species tree.
- Window methods, e.g., the tetranucleotide criterion
- Gene-based methods, e.g., codon usage metrics
- Combined approaches
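
As an illustration of the window idea, here is a minimal Python sketch that slides a window along a genome, computes tetranucleotide frequencies in each window, and scores each window by its Kullback-Leibler divergence from the genome-wide frequencies. The function names (`kmer_freqs`, `kl_divergence`, `scan_windows`), the window size, the step, and the pseudocount are all assumptions made for this example and are not taken from any particular published tool.

```python
import math
from collections import Counter
from itertools import product

K = 4  # tetranucleotides
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]

def kmer_freqs(seq, pseudocount=1.0):
    """Relative k-mer frequencies; the pseudocount (an illustrative
    choice) keeps every frequency strictly positive."""
    seq = seq.upper()
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    # Only k-mers made of A, C, G, T are counted; others are ignored.
    total = sum(counts[k] for k in KMERS) + pseudocount * len(KMERS)
    return {k: (counts[k] + pseudocount) / total for k in KMERS}

def kl_divergence(p, q):
    """K-L divergence D(p || q) over the 256 tetranucleotides."""
    return sum(p[k] * math.log(p[k] / q[k]) for k in KMERS)

def scan_windows(genome, window=1000, step=500):
    """Return (start, score) pairs; a high score marks a window whose
    composition deviates from the genome-wide background."""
    background = kmer_freqs(genome)
    scores = []
    for start in range(0, len(genome) - window + 1, step):
        win = kmer_freqs(genome[start:start + window])
        scores.append((start, kl_divergence(win, background)))
    return scores
```

Windows with unusually high scores are candidates for atypical regions. How to choose the cutoff is left open here, since it governs the trade-off between sensitivity and specificity.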
Evaluation of the methods
- Kullback-Leibler divergence metric
- Trade-off between sensitivity and specificity
  - Window methods are very sensitive but less specific, and they are poor at detecting isolated single genes
  - Gene-based methods are often specific but lack sensitivity
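
To make the sensitivity/specificity trade-off concrete, the sketch below computes both values for a set of predicted HGT genes against a known reference set. The function name `sensitivity_specificity` and the gene identifiers are hypothetical, and the two prediction sets are made-up toy data, not results from any real method.

```python
def sensitivity_specificity(predicted, truth, all_genes):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    predicted, truth, all_genes = set(predicted), set(truth), set(all_genes)
    tp = len(predicted & truth)              # transferred genes that were found
    fn = len(truth - predicted)              # transferred genes that were missed
    fp = len(predicted - truth)              # native genes wrongly flagged
    tn = len(all_genes - predicted - truth)  # native genes correctly left alone
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec

# Toy data (hypothetical): 10 genes, 4 of which are truly transferred.
all_genes = [f"g{i}" for i in range(1, 11)]
truth = {"g1", "g2", "g3", "g4"}
permissive = {"g1", "g2", "g3", "g5", "g6"}  # flags more genes
strict = {"g1", "g2"}                        # flags fewer genes

print(sensitivity_specificity(permissive, truth, all_genes))
print(sensitivity_specificity(strict, truth, all_genes))
```

In this toy case the permissive prediction recovers more true transfers at the cost of false positives, while the strict one makes no false calls but misses half of the transferred genes.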
Features used for discrimination
- GC%
- GC% at the third codon position (GC3)
- (Normalized) dinucleotides: Pearson/Spearman's correlation, Mahalanobis, Covariance, Fisher's $\chi^2$, K-L divergence
- (Normalized) tetranucleotides: Pearson/Spearman's correlation, Mahalanobis, Covariance, Fisher's $\chi^2$, K-L divergence
- (Normalized) codon usage: Karlin, Fisher's $\chi^2$, K-L divergence, Mahalanobis
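
A few of these features can be sketched in Python as follows. The helper names are hypothetical. The dinucleotide normalization shown is the observed/expected odds ratio $f_{xy} / (f_x f_y)$, which is one common choice; this page does not say which normalization is intended, so treat it as an assumption.

```python
from collections import Counter

def gc_content(seq):
    """Fraction of G and C among the A/C/G/T bases of a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else float("nan")

def gc3_content(cds):
    """GC fraction at the third position of each complete codon,
    assuming the coding sequence starts in frame."""
    return gc_content(cds[2::3])

def dinucleotide_odds_ratios(seq):
    """Observed/expected ratio f_xy / (f_x * f_y) for all 16 dinucleotides
    (assumed normalization; single strand only)."""
    seq = seq.upper()
    mono = Counter(b for b in seq if b in "ACGT")
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in "ACGT" and seq[i + 1] in "ACGT"
    )
    n_mono, n_di = sum(mono.values()), sum(di.values())
    if n_mono == 0 or n_di == 0:
        return {}
    ratios = {}
    for x in "ACGT":
        for y in "ACGT":
            expected = (mono[x] / n_mono) * (mono[y] / n_mono)
            observed = di[x + y] / n_di
            ratios[x + y] = observed / expected if expected > 0 else float("nan")
    return ratios
```

Each gene (or window) can then be described by such a feature vector and compared with the genome-wide values using one of the distances listed above, such as a correlation or the K-L divergence.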