Construction of amino acid rate matrices and extensions of the Barry and Hartigan model for phylogenetic inference
This thesis considers two distinct topics in phylogenetic analysis. The first is the construction of empirical rate matrices for amino acid models. The second, which constitutes the majority of the thesis, involves analysis of and extensions to the BH model of Barry and Hartigan (1987). A number of rate matrices are used for phylogenetic analysis, including PAM (Dayhoff et al. 1979), JTT (Jones et al. 1992) and WAG (Whelan and Goldman 2001), and the construction of each has its difficulties. To avoid adjusting for multiple substitutions, the PAM and JTT matrices were constructed using only a subset of the data consisting of closely related species. The WAG model used an incomplete maximum likelihood estimation to reduce computational cost. We develop a modification of the pairwise methods first described in Arvestad and Bruno that better adjusts for some of the sparseness difficulties that arise with amino acid data. The BH model is very flexible, allowing separate discrete-time Markov processes to occur along different edges. We show, however, that an identifiability problem arises for the BH model, making it difficult to estimate character state frequencies at internal nodes. To obtain such frequencies and edge lengths for BH model fits, we define a nonstationary GTR (NSGTR) model along an edge and find the NSGTR model that best approximates the fitted BH model. The NSGTR model is slightly more restrictive but allows for estimation of internal node frequencies and interpretable edge lengths. While adjusting for rates-across-sites variation is now common practice in phylogenetic analyses, it is widely recognized that evolutionary processes can in reality change over both sites and lineages. To adjust for this, we introduce a BH mixture model that not only allows completely different models along the edges of a topology but also allows for different site classes whose evolutionary dynamics can take any form.
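As an illustration of why internal node frequencies can be hard to pin down under a model with a separate discrete-time Markov process on each edge, the following minimal sketch shows one well-known source of non-identifiability: relabeling the hidden states at an internal node. It is not taken from the thesis and may not be the exact argument developed there. The three-leaf star tree, the four-state alphabet, the random parameters, and all function names are assumptions made for the example.

```python
import itertools
import random


def random_distribution(n, rng):
    """Return a random probability vector of length n (illustrative only)."""
    weights = [rng.random() + 0.1 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


def random_transition_matrix(n, rng):
    """Return a random row-stochastic n x n matrix: one free matrix per edge."""
    return [random_distribution(n, rng) for _ in range(n)]


def leaf_joint(pi, mats):
    """Joint distribution of the leaf states on a star tree.

    pi   : state frequencies at the internal (root) node
    mats : one transition matrix per edge, mats[e][r][x] = P(leaf e = x | root = r)
    """
    n = len(pi)
    joint = {}
    for leaves in itertools.product(range(n), repeat=len(mats)):
        total = 0.0
        for r in range(n):  # sum over the unobserved internal state
            term = pi[r]
            for P, x in zip(mats, leaves):
                term *= P[r][x]
            total += term
        joint[leaves] = total
    return joint


def relabel_root(pi, mats, perm):
    """Permute the labels of the internal node's states.

    The root frequencies and the rows of every edge matrix are permuted together.
    """
    new_pi = [pi[p] for p in perm]
    new_mats = [[P[p] for p in perm] for P in mats]
    return new_pi, new_mats


rng = random.Random(0)
n_states = 4  # assumption: a nucleotide-sized alphabet keeps the example small
pi = random_distribution(n_states, rng)
mats = [random_transition_matrix(n_states, rng) for _ in range(3)]

perm = [1, 0, 3, 2]  # hypothetical relabeling of the hidden states
pi2, mats2 = relabel_root(pi, mats, perm)

joint1 = leaf_joint(pi, mats)
joint2 = leaf_joint(pi2, mats2)
max_diff = max(abs(joint1[k] - joint2[k]) for k in joint1)

print("root frequencies differ:", pi != pi2)
print("largest difference in leaf pattern probabilities:", max_diff)
```

The two parameter sets assign different frequencies to the internal node yet give the same probability to every observable site pattern, up to floating-point rounding, so the data alone cannot distinguish them. Constraining the process along each edge, as the NSGTR approximation does, is one way to make internal node frequencies interpretable.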