Show simple item record

dc.contributor.authorWhidden, Chris
dc.date.accessioned2013-08-15T18:59:56Z
dc.date.available2013-08-15T18:59:56Z
dc.date.issued2013-08-15
dc.identifier.urihttp://hdl.handle.net/10222/35349
dc.description.abstractRampant lateral gene transfer (LGT) among prokaryotes, hybridization in plants and other reticulate evolutionary processes invalidate typical phylogenetic tree models by violating the assumption that organisms only inherit genetic information from a single parent species. Comparing the different evolutionary histories of multiple genes is necessary to identify and assess these processes. In this work I develop efficient approximation and fixed-parameter algorithms for computing rooted maximum agreement forests (MAFs) and maximum acyclic agreement forests (MAAFs) of pairs of phylogenetic trees. Their sizes correspond to the subtree-prune-and-regraft (SPR) distance and the hybridization number of these pairs of trees, which are important measures of the dissimilarity of phylogenies used in studying reticulate evolution. Although these MAFs and MAAFs are NP-hard to compute, my fixed-parameter algorithms are practical because they scale exponentially with the computed distance rather than the size of the trees. I contribute efficient fixed-parameter algorithms for computing MAFs and MAAFs of two binary rooted trees and give the first efficient fixed-parameter and approximation algorithms for computing MAFs of two multifurcating rooted trees. My open-source implementation of the MAF algorithms is orders of magnitude faster than previous approaches, reducing the time required to compute SPR distances of 46 between trees of 144 species to fractions of a second whereas previous approaches required hours to compute SPR distances of 25. These fast MAF-based distance metrics enable the construction of supertrees to reconcile a collection of gene trees and rapid inference of LGT. Simulations demonstrate that supertrees minimizing the SPR distance are more accurate than other supertree methods under plausible rates of LGT. I constructed an SPR supertree from a phylogenomic dataset of 40,631 gene trees covering 244 genomes from several major bacterial phyla and inferred "highways" of gene transfer between these bacterial classes and genera; a small number of these highways connect distantly related genera and can highlight specific genes implicated in long-distance LGT. These fast MAF algorithms are thus practical and enable new analyses of reticulate evolution.en_US
dc.language.isoenen_US
dc.subjectsubtree-prune-and-regraft distanceen_US
dc.subjectlateral gene transferen_US
dc.subjectfixed-parameter tractabilityen_US
dc.subjectmaximum agreement foresten_US
dc.subjecthybridizationen_US
dc.titleEfficient Computation and Application of Maximum Agreement Forestsen_US
dc.date.defence2013-07-29
dc.contributor.departmentFaculty of Computer Scienceen_US
dc.contributor.degreeDoctor of Philosophyen_US
dc.contributor.external-examinerSteven Kelken_US
dc.contributor.graduate-coordinatorDirk Arnolden_US
dc.contributor.thesis-readerMeng Heen_US
dc.contributor.thesis-readerChristian Blouinen_US
dc.contributor.thesis-supervisorNorbert Zeh and Robert Beikoen_US
dc.contributor.ethics-approvalNot Applicableen_US
dc.contributor.manuscriptsYesen_US
dc.contributor.copyright-releaseYesen_US
 Find Full text

Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record