Phylogenetic analysis of multiple genes based on spectral methods
Date
2011-11-14
Authors
Abeysundera, Melanie
Abstract
Multiple-gene phylogenetic analysis is of interest because single-gene analysis often
results in poorly resolved trees. Here the use of spectral techniques for analyzing
multi-gene data sets is explored. The protein sequences are treated as categorical
time series and a measure of similarity between a pair of sequences, the spectral
covariance, is used to build trees. Unlike other methods, the spectral covariance
method focuses on the relationship between the sites of genetic sequences.
We consider two methods with which to combine the dissimilarity or distance
matrices of multiple genes. The first method involves properly scaling the dissimilarity
measures derived from different genes between a pair of species and using the
mean of these scaled dissimilarity measures as a summary statistic to measure the
taxonomic distances across multiple genes. We introduce two criteria for computing
the scale coefficients used to combine information across genes, namely
the minimum variance (MinVar) criterion and the minimum coefficient of variation
squared (MinCV) criterion. The scale coefficients obtained under either criterion
are then used to derive a combined-gene tree from the weighted
average of the distance or dissimilarity matrices of multiple genes.
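
The combination step of the first method can be sketched as follows. This is a minimal illustration, not the thesis implementation: the abstract does not define the MinVar or MinCV criteria, so the scale coefficients here come from a plainly different stand-in rule (the inverse of each gene's mean pairwise distance). The function names combine_scaled and inverse_mean_scale are hypothetical. Coefficients obtained from MinVar or MinCV would be passed to combine_scaled in the same way.

```python
import numpy as np

def combine_scaled(distance_matrices, scale):
    """Mean of gene-specific distance matrices after scaling each by its coefficient."""
    D = np.asarray(distance_matrices, dtype=float)  # shape (k, n, n): k genes, n taxa
    c = np.asarray(scale, dtype=float)              # shape (k,): one coefficient per gene
    return np.mean(c[:, None, None] * D, axis=0)

def inverse_mean_scale(distance_matrices):
    """Stand-in scaling rule (an assumption, NOT MinVar or MinCV):
    scale each gene so that its mean pairwise distance equals 1."""
    D = np.asarray(distance_matrices, dtype=float)
    n = D.shape[1]
    iu = np.triu_indices(n, k=1)                    # indices of the distinct taxon pairs
    return 1.0 / D[:, iu[0], iu[1]].mean(axis=1)

# Toy data: two genes whose distances differ only by a rate factor of 4.
base = np.array([[0.0, 1.0, 2.0],
                 [1.0, 0.0, 3.0],
                 [2.0, 3.0, 0.0]])
genes = [base, 4.0 * base]

scale = inverse_mean_scale(genes)
combined = combine_scaled(genes, scale)
```

After scaling, both toy genes yield the same matrix, so the combined matrix equals that common matrix. The combined matrix could then be passed to any distance-based tree-building method.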
The second method is based on the singular value decomposition of a matrix made
up of the p-vectors of pairwise distances for k genes. By decomposing such a
matrix, we extract the common signal present in multiple genes to obtain a single tree
representation of the relationship between a given set of taxa. Influence functions for
the components of the singular value decomposition are derived to determine which
genes are most influential in determining the combined-gene tree.
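
The decomposition in the second method can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis procedure: p is taken to be the number of distinct taxon pairs, the matrix is arranged as p rows by k columns, no centering is applied, and only the leading singular component is kept. The function name common_signal is hypothetical, and the influence functions mentioned above are not implemented here. The gene loadings on the leading component are returned only as a rough indication of each gene's contribution.

```python
import numpy as np

def common_signal(distance_matrices):
    """Rank-1 summary of k gene distance matrices via the SVD (illustrative sketch)."""
    D = np.asarray(distance_matrices, dtype=float)  # shape (k, n, n)
    k, n, _ = D.shape
    iu = np.triu_indices(n, k=1)
    X = D[:, iu[0], iu[1]].T                        # shape (p, k): one p-vector per gene

    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    u, v = U[:, 0], Vt[0]
    if u.sum() < 0:                                 # resolve the SVD sign ambiguity
        u, v = -u, -v

    # Average the columns of the rank-1 approximation s[0] * u * v^T over genes.
    consensus = s[0] * u * v.mean()

    combined = np.zeros((n, n))
    combined[iu] = consensus                        # fill the upper triangle
    return combined + combined.T, v                 # symmetric matrix, gene loadings

# Toy data: two genes sharing one signal, the second evolving 4 times faster.
base = np.array([[0.0, 1.0, 2.0],
                 [1.0, 0.0, 3.0],
                 [2.0, 3.0, 0.0]])
combined, loadings = common_signal([base, 4.0 * base])
```

In this toy case the data matrix has rank one, so the leading component recovers the shared distance pattern exactly and the loadings reflect the rate ratio between the two genes.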
Keywords
Phylogenetics, spectral analysis, spectral covariance, singular value decomposition