EXPLORATION OF MULTIVARIATE CHEMICAL DATA IN NOISY ENVIRONMENTS: NEW ALGORITHMS AND SIMULATION METHOD

Driscoll, Stephen

dc.contributor.author	Driscoll, Stephen
dc.date.accessioned	2019-12-13T13:33:17Z
dc.date.available	2019-12-13T13:33:17Z
dc.date.issued	2019-12-13T13:33:17Z
dc.identifier.uri	http://hdl.handle.net/10222/76782
dc.description.abstract	With high-dimensional measurements becoming increasingly common in chemistry, the efficient extraction of meaningful information from chemical data has never been more important. Chemometrics, a sub-discipline of analytical chemistry, emerged from the need for more advanced multivariate data analysis methods capable of solving more complex chemical problems. The goal of chemometrics can simply be stated as the differentiation between chemical variance and the variance due to measurement error. All analytical measurements are subject to errors, sometimes called noise, that contribute uncertainty to any type of analysis. The current literature lacks both realistic noise simulation in the evaluation of new algorithms and approachable methods for performing such simulation. Chapters 2, 3, and 4 of this thesis address these shortcomings. Chapters 2 and 3 describe a simple method for simulating realistic analytical measurement errors, while Chapter 4 describes a method for accommodating different error structures in the analysis of fused multivariate data, an advance that circumvents the need for complicated preprocessing of these increasingly common data structures. Although many advances have been made in developing new algorithms that provide meaningful results when exploring modern chemical data sets, variance-based methods, such as principal component analysis (PCA), still dominate the field. A promising alternative algorithm that is not based on variance is projection pursuit analysis (PPA). However, the ordinary PPA algorithm requires the use of PCA when the number of response variables is large relative to the number of samples, which is the case in most multivariate chemical data sets. Chapters 5 and 6 address this issue by proposing a sparse PPA algorithm that is independent of PCA and is shown to reveal meaningful results where PCA and ordinary PPA cannot.
Another issue with ordinary PPA is that it performs poorly when applied to unbalanced data sets or data sets with a number of classes not equal to a power of 2. Chapter 7 addresses this issue by implementing an augmentation strategy that allows for the analysis of unbalanced data and the sequential extraction of clusters with projection pursuit.	en_US
dc.language.iso	en	en_US
dc.subject	chemometrics	en_US
dc.subject	multivariate statistics	en_US
dc.subject	analytical chemistry	en_US
dc.title	EXPLORATION OF MULTIVARIATE CHEMICAL DATA IN NOISY ENVIRONMENTS: NEW ALGORITHMS AND SIMULATION METHOD	en_US
dc.date.defence	2019-12-09
dc.contributor.department	Department of Chemistry	en_US
dc.contributor.degree	Doctor of Philosophy	en_US
dc.contributor.external-examiner	Peter Harrington	en_US
dc.contributor.graduate-coordinator	Peng Zhang	en_US
dc.contributor.thesis-reader	Alan Doucette	en_US
dc.contributor.thesis-reader	Erin Johnson	en_US
dc.contributor.thesis-reader	Michael Dowd	en_US
dc.contributor.thesis-supervisor	Peter Wentzell	en_US
dc.contributor.ethics-approval	Not Applicable	en_US
dc.contributor.manuscripts	Yes	en_US
dc.contributor.copyright-release	Yes	en_US

Files in this item

Name: Driscoll-Stephen-PhD-CHEM-Dece ...
Size: 14.84Mb
Format: PDF

This item appears in the following Collection(s)

Faculty of Graduate Studies Online Theses
