dc.contributor.author: Driscoll, Stephen
dc.date.accessioned: 2019-12-13T13:33:17Z
dc.date.available: 2019-12-13T13:33:17Z
dc.date.issued: 2019-12-13T13:33:17Z
dc.identifier.uri: http://hdl.handle.net/10222/76782
dc.description.abstract: With high-dimensional measurements becoming increasingly common in chemistry, the efficient extraction of meaningful information from chemical data has never been more important. Chemometrics, a sub-discipline of analytical chemistry, emerged from the need for more advanced multivariate data analysis methods capable of solving more complex chemical problems. The goal of chemometrics can simply be stated as the differentiation between chemical variance and the variance due to measurement error. All analytical measurements are subject to errors, sometimes called noise, that contribute uncertainty to any type of analysis. The current literature lacks both realistic noise simulation in the evaluation of new algorithms and approachable methods for performing such simulation. Chapters 2, 3, and 4 of this thesis address these shortcomings. Chapters 2 and 3 describe a simple method for simulating realistic analytical measurement errors, while Chapter 4 describes a method for accommodating different error structures in the analysis of fused multivariate data, an advance that circumvents the need for complicated preprocessing of these increasingly common data structures. Although many advances have been made in developing new algorithms that provide meaningful results when exploring modern chemical data sets, variance-based methods, such as principal component analysis (PCA), still dominate the field. A promising alternative algorithm that is not based on variance is projection pursuit analysis (PPA). However, the ordinary PPA algorithm requires the use of PCA when the number of response variables is large relative to the number of samples, which is the case in most multivariate chemical data sets. Chapters 5 and 6 address this issue by proposing a sparse PPA algorithm that is independent of PCA and is shown to reveal meaningful results where PCA and ordinary PPA cannot.
Another issue with ordinary PPA is that it performs poorly when applied to unbalanced data sets or to data sets in which the number of classes is not a power of 2. Chapter 7 addresses this issue by implementing an augmentation strategy that allows for the analysis of unbalanced data and the sequential extraction of clusters with projection pursuit. [en_US]
dc.language.iso: en [en_US]
dc.subject: chemometrics [en_US]
dc.subject: multivariate statistics [en_US]
dc.subject: analytical chemistry [en_US]
dc.title: EXPLORATION OF MULTIVARIATE CHEMICAL DATA IN NOISY ENVIRONMENTS: NEW ALGORITHMS AND SIMULATION METHOD [en_US]
dc.date.defence: 2019-12-09
dc.contributor.department: Department of Chemistry [en_US]
dc.contributor.degree: Doctor of Philosophy [en_US]
dc.contributor.external-examiner: Peter Harrington [en_US]
dc.contributor.graduate-coordinator: Peng Zhang [en_US]
dc.contributor.thesis-reader: Alan Doucette [en_US]
dc.contributor.thesis-reader: Erin Johnson [en_US]
dc.contributor.thesis-reader: Michael Dowd [en_US]
dc.contributor.thesis-supervisor: Peter Wentzell [en_US]
dc.contributor.ethics-approval: Not Applicable [en_US]
dc.contributor.manuscripts: Yes [en_US]
dc.contributor.copyright-release: Yes [en_US]