Show simple item record

dc.contributor.author: Wong, Dennis H.-J.
dc.date.accessioned: 2019-11-28T14:50:38Z
dc.date.available: 2019-11-28T14:50:38Z
dc.date.issued: 2019-11-28T14:50:38Z
dc.identifier.uri: http://hdl.handle.net/10222/76663
dc.description.abstract: Interest in microbial life and progress in DNA sequencing technology have led to thousands of sequenced bacterial genomes. In this thesis I develop approaches to identify Lateral Gene Transfer (LGT) in metagenomes, develop fast sequence clustering approaches to create the clusters needed in comparative genomics analyses, and apply them to large data sets. In chapter two, I identify LGT in two of three metagenomes of phosphorus-removing bacterial communities from sewage-treatment plants: no transfers in a community from the United States of America, two in a Danish community, and five in an Australian community. Analyses account for the limitations of metagenomic sequence data and focus on gene transfers in energy-related metabolic pathways. These transfers affect pathways associated with the different input carbon feeds for each community, suggesting recent adaptation among community members. This is the first published analysis focusing on the role and direction of transferred genes in a community using metagenomes. In chapter three, I develop two methods to define and refine clusters of homologous sequences from sequenced genomes: ProPhylClust, which identifies large protein families, and PhyloSubClust, which subclusters large protein families based on phylogeny to recover orthologous relationships. ProPhylClust uses a species phylogeny as a guide tree, giving runtimes that scale approximately linearly with the number of genomes, whereas all-versus-all homology-search methods scale quadratically. Two different sets of genomes were used, one spanning 24 bacterial phyla and the other sampled from the phylum Proteobacteria. While the sequence comparisons in ProPhylClust make it slower than competing approaches on small genome sets, its hierarchical approach yielded equal or faster runtimes on sets with 100 or more genomes.
In chapter four, 558 incomplete and complete genomes from the class Clostridia were clustered using ProPhylClust and PhyloSubClust. Of 18 clusters containing toxin proteins and their regulators from Peptoclostridium difficile (toxins A/B), Clostridium botulinum (botulinum toxin) and Clostridium tetani (tetanus toxin), two (one botulinum-tetani toxin cluster and one toxin A/B cluster) contained homologous sequences considered non-toxic. Hierarchical clustering of phylogenetic profiles identified potentially toxin-related protein families of unknown function located on the same sequence contig or chromosome as the toxin genes, but not in toxin operons. The computational analysis of large genomic data sets to derive biologically relevant knowledge will continue to be a challenge for years to come. Here, I focused on computational methods relevant to identifying LGT in environmental sequence data, constructing clusters of homologous sequences from genomes, and obtaining functionally associated sequences based on phylogenetic distributions. Promising results were produced for each chapter: gene transfer events found in phosphorus-removing sewage-treatment communities, runtimes for cluster construction that are more manageable than those of other methods on larger data sets, and sequences that may be functionally relevant to toxins in C. botulinum and P. difficile. [en_US]
dc.language.iso: en [en_US]
dc.subject: Metagenomics [en_US]
dc.subject: Genomics [en_US]
dc.subject: Bacteria [en_US]
dc.subject: Lateral Gene Transfer [en_US]
dc.subject: Clustering [en_US]
dc.subject: Orthology [en_US]
dc.subject: Homology [en_US]
dc.title: INFERRING ORTHOLOGOUS RELATIONSHIPS AND GENE TRANSFER IN MICROBIAL GENOMES AND METAGENOMES [en_US]
dc.date.defence: 2018-04-11
dc.contributor.department: Interdisciplinary PhD Programme [en_US]
dc.contributor.degree: Interdisciplinary PhD [en_US]
dc.contributor.external-examiner: Dr. Gabriel Moreno-Hagelsieb [en_US]
dc.contributor.graduate-coordinator: Dr. Lynne Robinson [en_US]
dc.contributor.thesis-reader: Dr. Norbert Zeh [en_US]
dc.contributor.thesis-reader: Dr. Christian Blouin [en_US]
dc.contributor.thesis-reader: Dr. Joseph Bielawski [en_US]
dc.contributor.thesis-supervisor: Dr. Robert Bieko [en_US]
dc.contributor.ethics-approval: Not Applicable [en_US]
dc.contributor.manuscripts: Yes [en_US]
dc.contributor.copyright-release: No [en_US]