The Exploration of Effect of Model Misspecification and Development of an Adequacy-Test for Substitution Model in Phylogenetics
The maximum likelihood method can give inconsistent results when DNA sequences are generated under a tree topology in the Felsenstein zone and analyzed with a misspecified model. It is therefore important to select a good substitution model. This thesis first explores the effects of different degrees and types of model misspecification on the maximum likelihood estimates. Results are presented for tree selection and branch length estimates based on simulated data sets. Next, two Pearson's goodness-of-fit tests are developed based on binning of site patterns. These two tests are used to test the adequacy of substitution models, and their performance is studied on both simulated and empirical data sets.
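As a rough illustration of what a Pearson goodness-of-fit statistic on binned site patterns might look like, the following minimal Python sketch counts alignment columns per bin and compares the counts with expected bin probabilities. The abstract does not state the binning rules, so the rule used here (number of distinct nucleotides in a column), the toy alignment, and the expected probabilities are purely hypothetical assumptions. In practice the expected probabilities would come from the fitted substitution model and tree, and the statistic would be compared with a chi-square reference distribution whose degrees of freedom depend on the number of bins and estimated parameters.

```python
from collections import Counter


def bin_site_patterns(alignment, bin_of):
    """Count alignment columns (site patterns) falling into each bin."""
    n_sites = len(alignment[0])
    counts = Counter()
    for j in range(n_sites):
        # A site pattern is the column of characters at position j.
        pattern = "".join(seq[j] for seq in alignment)
        counts[bin_of(pattern)] += 1
    return counts


def pearson_statistic(observed, expected_probs):
    """Pearson's X^2 = sum over bins of (O - E)^2 / E, with E = n * p."""
    n = sum(observed.values())
    stat = 0.0
    for b, p in expected_probs.items():
        expected = n * p
        stat += (observed.get(b, 0) - expected) ** 2 / expected
    return stat


def distinct_states(pattern):
    """Hypothetical binning rule: number of distinct nucleotides in a column."""
    return len(set(pattern))


# Toy alignment of four sequences and five sites (illustrative only).
alignment = ["AACGT", "AACGA", "ATCGT", "AACCT"]

# Hypothetical bin probabilities; assumed to be supplied by a fitted model.
expected_probs = {1: 0.6, 2: 0.4}

observed = bin_site_patterns(alignment, distinct_states)
stat = pearson_statistic(observed, expected_probs)
print(dict(observed), round(stat, 4))
```

Bins with very small expected counts make the chi-square approximation unreliable, which is one motivation for grouping site patterns into bins rather than testing each pattern separately.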