The Exploration of Effect of Model Misspecification and Development of an Adequacy-Test for Substitution Model in Phylogenetics
The maximum likelihood method can give inconsistent results when
DNA sequences are generated under a tree topology in the Felsenstein
zone and analyzed with a misspecified model. It is therefore important to select an
adequate substitution model. This thesis first explores the effects of different degrees and
types of model misspecification on the maximum likelihood estimates. The results
are presented for tree selection and branch length estimates based on simulated data
sets. Next, two Pearson's goodness-of-fit tests are developed based on binning of site
patterns. These two tests are used to test the adequacy of substitution models, and
their performance is studied on both simulated and empirical data sets.
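
To make the idea of a binned Pearson test concrete, the following is a minimal sketch and not the thesis's actual procedure. It assumes the expected site-pattern probabilities under the fitted substitution model are already available, and it uses a hypothetical binning rule (pool the rarest patterns until each bin has an expected count of at least `min_expected`); the two binning schemes developed in the thesis are not described in this abstract. All function names are illustrative.

```python
from collections import Counter

def site_patterns(alignment):
    """Count column (site) patterns in a list of equal-length sequences."""
    return Counter("".join(col) for col in zip(*alignment))

def bin_patterns(expected_probs, n_sites, min_expected=5.0):
    """Hypothetical binning rule: pool patterns, rarest first, until each
    bin's expected count reaches min_expected."""
    bins, current, mass = [], [], 0.0
    for pattern, p in sorted(expected_probs.items(), key=lambda kv: kv[1]):
        current.append(pattern)
        mass += p
        if mass * n_sites >= min_expected:
            bins.append(current)
            current, mass = [], 0.0
    if current:
        # Leftover patterns that never reached the threshold join the last bin.
        if bins:
            bins[-1].extend(current)
        else:
            bins.append(current)
    return bins

def pearson_statistic(observed, expected_probs, bins, n_sites):
    """Pearson's X^2 = sum over bins of (O - E)^2 / E."""
    x2 = 0.0
    for b in bins:
        o = sum(observed.get(p, 0) for p in b)
        e = n_sites * sum(expected_probs[p] for p in b)
        x2 += (o - e) ** 2 / e
    return x2

# Toy two-taxon example with made-up numbers (not data from the thesis).
expected = {"AA": 0.4, "CC": 0.4, "AC": 0.1, "CA": 0.1}
observed = Counter({"AA": 45, "CC": 35, "AC": 12, "CA": 8})
n = sum(observed.values())
bins = bin_patterns(expected, n, min_expected=15.0)
x2 = pearson_statistic(observed, expected, bins, n)
```

In practice the statistic would be compared with a reference distribution, typically chi-square, whose degrees of freedom depend on the number of bins and on how many model parameters were estimated from the data.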
