
SEMI-PARAMETRIC PRINCIPAL COMPONENT ANALYSIS FOR POISSON COUNT DATA WITH APPLICATION TO MICROBIOME DATA ANALYSIS

dc.contributor.author: Huang, Tianshu Jr
dc.contributor.copyright-release: Not Applicable [en_US]
dc.contributor.degree: Master of Science [en_US]
dc.contributor.department: Department of Mathematics & Statistics - Statistics Division [en_US]
dc.contributor.ethics-approval: Not Applicable [en_US]
dc.contributor.external-examiner: n/a [en_US]
dc.contributor.graduate-coordinator: Joanna Mills Flemming [en_US]
dc.contributor.manuscripts: Not Applicable [en_US]
dc.contributor.thesis-reader: Chris Field [en_US]
dc.contributor.thesis-reader: Edward Susko [en_US]
dc.contributor.thesis-supervisor: Hong Gu [en_US]
dc.contributor.thesis-supervisor: Toby Kenney [en_US]
dc.date.accessioned: 2017-09-01T17:53:45Z
dc.date.available: 2017-09-01T17:53:45Z
dc.date.defence: 2017-08-25
dc.date.issued: 2017-09-01T17:53:45Z
dc.description.abstract: Principal Component Analysis (PCA) is a widely used tool for dimension reduction and data visualization. However, it cannot be used directly for microbiome data. In this thesis, we aim to develop PCA for the underlying abundance of OTUs under the assumption that, conditional on the latent OTU abundance, the observed counts follow independent Poisson distributions. By correcting for this Poisson measurement error, we base our PCA on an unbiased estimator of the covariance matrix of the latent OTU abundances. We further correct for sequencing depth noise by analyzing the data as compositional. To deal with the non-normality, we propose a logarithm-transformed Poisson-corrected PCA. We then incorporate sequencing depth correction into this method. Finally, we address the problem of projecting the observed data onto the log-transformed principal component space. We examine the performance of our methods on simulated data and on tongue microbiome data. [en_US]
dc.identifier.uri: http://hdl.handle.net/10222/73280
dc.language.iso: en [en_US]
dc.subject: PCA [en_US]
dc.subject: Microbiome data analysis [en_US]
dc.subject: Poisson noise [en_US]
dc.subject: Sequencing depth [en_US]
dc.title: SEMI-PARAMETRIC PRINCIPAL COMPONENT ANALYSIS FOR POISSON COUNT DATA WITH APPLICATION TO MICROBIOME DATA ANALYSIS [en_US]
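
The Poisson correction described in the abstract can be illustrated with a small sketch. It is not taken from the thesis. It relies only on the stated assumption that counts are independent Poisson given the latent abundances, in which case the law of total covariance gives Cov(X) = Cov(Λ) + diag(E[Λ]), so subtracting the sample means from the diagonal of the sample covariance yields an unbiased estimate of Cov(Λ). The function names, the simulated loadings `w`, and the gamma-distributed latent factor are all hypothetical choices for illustration. The sequencing depth and logarithm-transformed corrections mentioned in the abstract are not covered here.

```python
import numpy as np


def poisson_corrected_cov(counts):
    """Estimate the covariance of latent abundances from Poisson counts.

    Assumes counts[i, j] | latent[i, j] ~ Poisson(latent[i, j]), independently.
    Then Cov(counts) = Cov(latent) + diag(E[latent]), so the sample mean of
    each column is subtracted from the diagonal of the sample covariance.
    """
    counts = np.asarray(counts, dtype=float)
    sample_cov = np.cov(counts, rowvar=False)  # unbiased (ddof=1) sample covariance
    return sample_cov - np.diag(counts.mean(axis=0))


def pca_from_cov(cov, n_components=1):
    """Return the leading eigenvalues and eigenvectors of a symmetric matrix."""
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order]


# Hypothetical simulation: one latent factor drives three OTU abundances.
rng = np.random.default_rng(0)
n_samples = 20000
w = np.array([1.0, 2.0, 0.5])                       # illustrative loadings
factor = rng.gamma(shape=4.0, scale=1.0, size=n_samples)
latent = factor[:, None] * w + 5.0                  # latent abundances (all positive)
counts = rng.poisson(latent)                        # observed counts

true_cov = 4.0 * np.outer(w, w)                     # Var(factor) = 4 for Gamma(4, 1)
raw_cov = np.cov(counts, rowvar=False)
corrected_cov = poisson_corrected_cov(counts)

err_raw = np.abs(raw_cov - true_cov).max()
err_corrected = np.abs(corrected_cov - true_cov).max()
top_val, top_vec = pca_from_cov(corrected_cov, n_components=1)
alignment = abs(top_vec[:, 0] @ (w / np.linalg.norm(w)))
```

In this simulation the uncorrected covariance overstates each variance by roughly the mean count of that OTU, while the corrected estimate removes that inflation. Note that the corrected matrix is not guaranteed to be positive semi-definite in small samples, so negative eigenvalues may need handling in practice.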

Files

Original bundle

Name: Huang-Tianshu-MSc-STAT-August-2017.pdf
Size: 698.89 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission