INFERENCE OF GENE COEVOLUTION BASED ON PHYLOGENETIC PROFILES
Date
2022-11-17
Authors
Liu, Chaoyue
Abstract
Phylogenetic profiles, which summarize the presence and absence patterns of genes in a set of genomes, can be used to identify genes that have correlated evolutionary histories. However, comparative analysis of phylogenetic profiles should account for the phylogenetic relationships among the genomes being compared. In this study, we developed phylogenetic comparative methods to infer gene coevolution.
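As a minimal illustration of what a phylogenetic profile looks like, the sketch below stores each gene as a binary presence/absence vector across genomes and scores a pair of genes with the phi coefficient. The gene names, the data, and the choice of the phi coefficient are assumptions made for illustration only. This score treats genomes as independent observations and ignores the tree, which is precisely the limitation that phylogenetic comparative methods are meant to address.

```python
from math import sqrt

# Hypothetical toy data: one profile per gene, one entry per genome
# (1 = gene present in that genome, 0 = absent).
profiles = {
    "geneA": [1, 1, 0, 0, 1, 0],
    "geneB": [1, 1, 0, 0, 1, 0],
    "geneC": [0, 0, 1, 1, 0, 1],
    "geneD": [1, 0, 1, 0, 1, 0],
}


def phi_coefficient(x, y):
    """Tree-unaware correlation between two binary profiles (phi coefficient)."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # A gene present (or absent) everywhere carries no co-occurrence signal.
        return 0.0
    return (n11 * n00 - n10 * n01) / denom


same = phi_coefficient(profiles["geneA"], profiles["geneB"])      # identical profiles
opposite = phi_coefficient(profiles["geneA"], profiles["geneC"])  # complementary profiles
partial = phi_coefficient(profiles["geneA"], profiles["geneD"])   # partly overlapping
```

Identical profiles score 1 and complementary profiles score -1, but closely related genomes can inflate such scores because shared ancestry alone produces similar presence/absence patterns.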
We first proposed an approach that uses Pagel's correlation model to infer the evolutionary similarities between genes and a hierarchical-clustering approach to define sets of genes with correlated distributions across the organisms. The results support our working assumption that genes with correlated evolutionary histories tend to be functionally linked.
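The clustering step can be sketched as follows, assuming pairwise distances between genes are already available. The distances here are made-up numbers rather than output of Pagel's model, and the average-linkage rule and the cutoff of 0.5 are illustrative assumptions; the abstract does not state which linkage or threshold was used.

```python
def average_linkage_clusters(names, dist, cutoff):
    """Greedily merge the closest clusters until the best merge exceeds cutoff."""
    clusters = [[name] for name in names]

    def cluster_distance(c1, c2):
        # Average linkage: mean pairwise distance between members of two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > cutoff:
            break  # remaining clusters are too dissimilar to merge
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical pairwise distances (for example, 1 minus a similarity score).
genes = ["g1", "g2", "g3", "g4"]
distances = {
    frozenset(("g1", "g2")): 0.1,
    frozenset(("g3", "g4")): 0.2,
    frozenset(("g1", "g3")): 0.9,
    frozenset(("g1", "g4")): 0.9,
    frozenset(("g2", "g3")): 0.9,
    frozenset(("g2", "g4")): 0.9,
}

gene_sets = average_linkage_clusters(genes, distances, cutoff=0.5)
```

With these toy distances the procedure groups g1 with g2 and g3 with g4, then stops because the two groups are farther apart than the cutoff.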
However, Pagel's method is computationally expensive and tends to overestimate the signal of coevolution. We developed a new coevolutionary model, the Community Coevolution Model (CCM), which has the additional advantage of being able to examine multiple genes as a community to reveal a more complete picture of their dependency relationships. We also developed a simulation procedure to generate phylogenetic profiles of gene sets with correlated evolutionary trajectories and adjustable strength of interactions. The results show that CCM is more accurate than Pagel's method and other heuristic tree-aware methods, and that it provides additional biological insight, such as evolutionary rates, significance levels, and the direction (positive or negative) of interactions.
We also developed a matrix decomposition-based method (Chisq-PLR), designed especially for large-scale analysis. Its computational speed is competitive with other heuristic methods, and it also supports better biological interpretation. This fast method can be used to preprocess large data sets, reducing the number of computations that need to be carried out by CCM or other mechanistic model-based methods.
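The preprocessing idea, scoring all gene pairs with a cheap statistic and passing only the promising ones to an expensive model, can be sketched as below. This is not Chisq-PLR, whose details are not given in the abstract: the score is a plain 2x2 chi-square statistic without any phylogenetic correction, and the profiles, function names, and the threshold of 3.84 are assumptions chosen for illustration.

```python
from itertools import combinations


def chi_square_2x2(x, y):
    """Chi-square statistic of the 2x2 co-occurrence table of two binary profiles."""
    n = len(x)
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        return 0.0  # a constant profile gives no evidence of association
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def screen_pairs(profiles, threshold):
    """Keep only gene pairs whose cheap score reaches the threshold."""
    kept = []
    for g1, g2 in combinations(sorted(profiles), 2):
        score = chi_square_2x2(profiles[g1], profiles[g2])
        if score >= threshold:
            kept.append((g1, g2, score))
    return kept


# Hypothetical presence/absence profiles over eight genomes.
toy_profiles = {
    "geneA": [1, 1, 0, 0, 1, 0, 1, 0],
    "geneB": [1, 1, 0, 0, 1, 0, 1, 0],
    "geneC": [1, 0, 1, 0, 1, 0, 1, 0],
}

# Only the surviving pairs would be handed to a slower, model-based analysis.
candidates = screen_pairs(toy_profiles, threshold=3.84)
```

Here only the geneA/geneB pair survives the screen, so a downstream model would need to evaluate one pair instead of three.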
Keywords
Phylogenetic Profiles, Evolutionary Model, Statistics, Comparative Methods