On Models for Detecting Evidence of Molecular Adaptation in Homologous Sequences of Protein Coding Genes

Jones, Chris

dc.contributor.author	Jones, Chris
dc.date.accessioned	2019-08-30T11:03:03Z
dc.date.available	2019-08-30T11:03:03Z
dc.date.issued	2019-08-30T11:03:03Z
dc.identifier.uri	http://hdl.handle.net/10222/76361
dc.description.abstract	Codon substitution models (CSMs) are commonly fitted to alignments of homologous protein-coding sequences with the objective of determining whether sites in the gene underwent positive selection. Under the standard paradigm, such evidence is often assumed to be enough to conclude that the gene evolved adaptively. CSMs are commonly validated using simulated alignments. A central theme of this dissertation is the use of relatively realistic alignment-generating processes grounded in mutation-selection (MS) theory (Chapter 1). The MS framework permits each site to evolve on its own site-specific fitness landscape, defined by a vector of fitness coefficients for the twenty amino acids. A novel MS alignment-generating process was used to show that evidence for variation in site-specific rate ratios (a.k.a. heterotachy) with episodic positive selection can be produced by episodic adaptive changes in site-specific fitness coefficients, consistent with the standard paradigm, but also by a second, previously unrecognized process that I call non-adaptive shifting balance. This finding undermines sophisticated CSMs specifically designed to infer episodic adaptation by detecting heterotachy with episodic positive selection (Chapter 2). Processes that tend to generate similar patterns in data are said to be confounded. Confounding can lead to a novel statistical pathology that I call phenomenological load. A series of novel CSMs fitted to alignments generated under a version of MS uniquely formulated to mimic real data was used to demonstrate that phenomenological load can lead to false biological conclusions. These analyses were accompanied by a novel method to assess the potential impact of phenomenological load on any given model parameter (Chapter 3).
Confounding of adaptive and non-adaptive processes that generate heterotachy can be avoided by abandoning positive selection as an indicator of adaptation and instead using evidence of changes in site-specific amino acid fitnesses. This approach was realized by constructing the phenotype-genotype branch-site model (PG-BSM), a descendant of traditional branch-site models that combines alignment data with a discrete phenotype (i.e., contextual information) under a unified statistical framework. The PG-BSM was validated using extensive simulations and produced plausible results when applied to real data (Chapter 4). This dissertation ends with a discussion of implications of my findings (Chapter 5).	en_US
dc.language.iso	en	en_US
dc.subject	molecular evolution	en_US
dc.subject	adaptive evolution	en_US
dc.subject	codon substitution models	en_US
dc.subject	confounding	en_US
dc.subject	phenomenological load	en_US
dc.title	On Models for Detecting Evidence of Molecular Adaptation in Homologous Sequences of Protein Coding Genes	en_US
dc.type	Thesis	en_US
dc.date.defence	2019-08-21
dc.contributor.department	Department of Mathematics & Statistics - Statistics Division	en_US
dc.contributor.degree	Doctor of Philosophy	en_US
dc.contributor.external-examiner	Dr. Ziheng Yang, FRS	en_US
dc.contributor.graduate-coordinator	Dr. David Iron	en_US
dc.contributor.thesis-reader	Dr. Andrew Roger	en_US
dc.contributor.thesis-reader	Dr. Chris Field	en_US
dc.contributor.thesis-supervisor	Dr. Edward Susko	en_US
dc.contributor.thesis-supervisor	Dr. Joseph P. Bielawski	en_US
dc.contributor.ethics-approval	Not Applicable	en_US
dc.contributor.manuscripts	Yes	en_US
dc.contributor.copyright-release	No	en_US

Files in this item

Name: Jones-Chris-PhD-Stats-August20 ...
Size: 3.877Mb
Format: PDF

This item appears in the following Collection(s)

Faculty of Graduate Studies Online Theses