GENE CLUSTERING BASED ON CO-OCCURRENCE WITH CORRECTION FOR COMMON EVOLUTIONARY HISTORY

dc.contributor.author: Liu, Chaoyue
dc.contributor.copyright-release: Yes [en_US]
dc.contributor.degree: Master of Science [en_US]
dc.contributor.department: Department of Mathematics & Statistics - Statistics Division [en_US]
dc.contributor.ethics-approval: Not Applicable [en_US]
dc.contributor.external-examiner: n/a [en_US]
dc.contributor.graduate-coordinator: Joanna Mills-Flemming [en_US]
dc.contributor.manuscripts: Not Applicable [en_US]
dc.contributor.thesis-reader: Joseph Bielawski [en_US]
dc.contributor.thesis-reader: Tobias Kenney [en_US]
dc.contributor.thesis-supervisor: Hong Gu [en_US]
dc.contributor.thesis-supervisor: Robert Beiko [en_US]
dc.date.accessioned: 2016-04-25T16:29:29Z
dc.date.available: 2016-04-25T16:29:29Z
dc.date.defence: 2016-04-12
dc.date.issued: 2016-04-25T16:29:29Z
dc.description.abstract: As the number of sequenced genomes increases rapidly, new approaches are needed for the computational annotation of protein functions and to better understand the ecological roles of genomes. In this thesis, a gene clustering approach based on the correlated evolution method (Pagel) and hierarchical clustering is proposed to find sets of co-occurring genes according to their weighted phylogenetic profiles. Hierarchical clusters can be cut at many different levels of similarity; since our primary interest is the evaluation of functional associations, we used the semantic similarity of Gene Ontology terms to optimize the choice of cuts in the hierarchy, and to evaluate our clustering outcomes. The results can be used to predict the functions of the unannotated genes and to discover candidate sets of lateral gene transfer events. We applied this approach to the gene set of the large clostridial genome “Lachnospiraceae bacterium 3-1-57FAA-CT1”, and generated informative clusters of genes with correlated evolutionary histories, which in many cases shared functional similarity as well. We compared the results of our method to the recently described approach, Clustering by Inferred Modules of Evolution (CLIME), and found considerable similarity between the two sets of predictions. However, our hierarchical clustering approach allows the exploration of degrees of protein similarity, and the generation of smaller or larger clusters as appropriate. In both cases, we found strong evidence that clusters of genes having similar phylogenetic histories also tend to be functionally linked. [en_US]
dc.identifier.uri: http://hdl.handle.net/10222/71495
dc.language.iso: en [en_US]
dc.subject: phylogeny [en_US]
dc.subject: Gene clustering [en_US]
dc.subject: statistics [en_US]
dc.title: GENE CLUSTERING BASED ON CO-OCCURRENCE WITH CORRECTION FOR COMMON EVOLUTIONARY HISTORY [en_US]

Files

Original bundle

Name: Liu-Chaoyue-MSc-Statistics-April-2016.pdf
Size: 1.44 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission