dc.contributor.author: Stone, William
dc.date.accessioned: 2019-12-18T12:33:42Z
dc.date.available: 2019-12-18T12:33:42Z
dc.date.issued: 2019-12-18T12:33:42Z
dc.identifier.uri: http://hdl.handle.net/10222/76834
dc.description.abstract: Predictive genetics is a promising field of research, particularly in medical science, where the ability to identify disease or treatment response could provide novel methods of mitigating negative effects. Machine learning is the most obvious tool for this purpose; however, genetic data typically contain far more features than samples, an imbalance that is difficult for machine learning and indicates the need for feature selection. The dataset we used was collected from multiple international centres and includes subjects with bipolar disorder, some of whom respond to the drug lithium and some of whom do not. We first select the features that were measured jointly by each data collection centre and show that above-chance classification is possible with these data, despite significant overfitting, which indicates the need for further feature space reduction. We then introduce a novel method capable of reducing the number of features even further so that it is bounded by the number of subjects. This method uses the hierarchical structure of genetic data to select feature subsets and evaluate their fitness individually before including the best ones in the final feature set. We show that our method improves on the first method while maintaining biological interpretability. (en_US)
dc.language.iso: en (en_US)
dc.subject: genetics (en_US)
dc.subject: machine learning (en_US)
dc.subject: bipolar disorder (en_US)
dc.subject: drug response (en_US)
dc.title: Biologically Informed Feature Selection in Large Scale Genomics (en_US)
dc.date.defence: 2019-12-06
dc.contributor.department: Faculty of Computer Science (en_US)
dc.contributor.degree: Master of Science (en_US)
dc.contributor.external-examiner: n/a (en_US)
dc.contributor.graduate-coordinator: Michael McAllister (en_US)
dc.contributor.thesis-reader: Dr. Sageev Oore (en_US)
dc.contributor.thesis-reader: Dr. Martin Alda (en_US)
dc.contributor.thesis-supervisor: Dr. Thomas Trappenberg (en_US)
dc.contributor.ethics-approval: Not Applicable (en_US)
dc.contributor.manuscripts: No (en_US)
dc.contributor.copyright-release: Not Applicable (en_US)