SEMI-PARAMETRIC PRINCIPAL COMPONENT ANALYSIS FOR POISSON COUNT DATA WITH APPLICATION TO MICROBIOME DATA ANALYSIS

Date

2017-09-01T17:53:45Z

Authors

Huang, Tianshu Jr

Abstract

Principal Component Analysis (PCA) is a widely used tool for dimensionality reduction and data visualization. However, it cannot be applied directly to microbiome count data. In this thesis, we develop PCA for the underlying abundances of operational taxonomic units (OTUs), under the assumption that, conditional on the latent OTU abundances, the observed counts follow independent Poisson distributions. By correcting for this Poisson measurement error, we base our PCA on an unbiased estimator of the covariance matrix of the latent OTU abundances. We further correct for sequencing depth noise by analyzing the data as compositional. To deal with non-normality, we propose a log-transformed Poisson-corrected PCA and then incorporate the sequencing depth correction into this method. Finally, we address the problem of projecting the observed data onto the log-transformed principal component space. We examine the performance of our methods on simulated data and on tongue microbiome data.
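
The Poisson correction described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation; it relies only on the law of total covariance. If the counts are independent Poisson given the latent abundances, the covariance of the counts equals the covariance of the latent abundances plus a diagonal matrix of the mean abundances, so subtracting the sample means from the diagonal of the sample covariance gives an unbiased estimate of the latent covariance. The function names, the log-normal simulation, and all parameter values below are assumptions chosen for illustration, and the sequencing depth and log-transform corrections are not covered.

```python
import numpy as np


def poisson_corrected_covariance(counts):
    """Estimate the covariance of latent abundances from Poisson counts.

    counts: array of shape (n_samples, n_otus).
    Assumes counts are independent Poisson given the latent abundances,
    so Cov(counts) = Cov(latent) + diag(E[latent]).
    """
    counts = np.asarray(counts, dtype=float)
    sample_mean = counts.mean(axis=0)
    sample_cov = np.cov(counts, rowvar=False)  # unbiased (ddof=1) estimator
    # Remove the Poisson noise variance, which only affects the diagonal.
    return sample_cov - np.diag(sample_mean)


def poisson_corrected_pca(counts, n_components=2):
    """Eigen-decompose the corrected covariance (illustrative helper).

    Returns the leading eigenvalues and eigenvectors. The corrected matrix
    is symmetric but not guaranteed to be positive semi-definite in finite
    samples, so small negative eigenvalues can occur.
    """
    latent_cov = poisson_corrected_covariance(counts)
    eigvals, eigvecs = np.linalg.eigh(latent_cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order]


def simulate_counts(n_samples=20000, seed=0):
    """Simulate log-normal latent abundances and Poisson counts.

    The log-normal model and its parameters are hypothetical; they only
    provide positive, correlated latent abundances for a demonstration.
    """
    rng = np.random.default_rng(seed)
    log_mean = np.array([2.0, 1.5, 1.0])
    log_cov = np.array([[0.25, 0.10, 0.00],
                        [0.10, 0.25, 0.05],
                        [0.00, 0.05, 0.25]])
    latent = np.exp(rng.multivariate_normal(log_mean, log_cov, size=n_samples))
    counts = rng.poisson(latent)
    return latent, counts


if __name__ == "__main__":
    latent, counts = simulate_counts()
    print("Covariance of latent abundances:\n", np.cov(latent, rowvar=False))
    print("Naive covariance of counts:\n", np.cov(counts, rowvar=False))
    print("Poisson-corrected covariance:\n", poisson_corrected_covariance(counts))
    print("Leading eigenvalues:", poisson_corrected_pca(counts)[0])
```

In this simulation the naive covariance of the counts overstates each diagonal entry by roughly the mean abundance of that OTU, while the off-diagonal entries are unaffected by the correction.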

Keywords

PCA, Microbiome data analysis, Poisson noise, Sequencing depth
