EVALUATING A MICROBIAL COMMUNITY THROUGH FEATURE MATCHING AND GRAPH TOPOLOGY
Porter, Michael Scott
Metagenomics, the sequencing of DNA from environmental samples, has enabled the study of cohabiting microorganisms with a single sequencing experiment. This requires algorithms and techniques specific to metagenomics: since the environmental sample is not separated by organism before being sequenced, taxonomic classification is required to reveal the taxonomic composition of the sample. The metabolic function of the sample can be determined through functional annotation. Both of these analyses can be done through comparisons to a reference database of sequences with assigned taxonomy and function. Here, new techniques for metagenomic analysis are developed. The KB-1 metagenome, representing a microbial community capable of converting toxic chlorinated ethenes into non-toxic ethene, is used as an example for these techniques, to determine which KB-1 organism is capable of dechlorination and what metabolic support this organism receives from other community members to sustain its growth. A new rank-flexible taxonomic classification algorithm called SPANNER (Similarity Profile ANNotatER) is described. Traditional taxonomic classifiers are based on the similarity of a query sequence to sequences in a reference database. SPANNER instead uses the similarities to all references as a feature vector of taxonomic affinities and classifies a query sequence based on the similarity of its affinity profile. This approach is shown to be less sensitive to events such as lateral gene transfer, which can confuse traditional classifiers. Classification using SPANNER is performed on the KB-1 metagenome. SPANNER offers greater control of the trade-off between precision and accuracy than other taxonomic classifiers; an appropriate level of precision can therefore be chosen based on the availability of closely related reference genomes. SPANNER classified many taxa at or within one or two ranks of the best possible rank.
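The idea of classifying by affinity profile rather than by best hit can be illustrated with a minimal sketch. This is not SPANNER's actual implementation: the cosine similarity measure, the `min_similarity` threshold, the function names, and the placeholder taxa and scores are all assumptions chosen for illustration. Each profile is a vector of similarity scores against the same ordered set of references, and a query receives the label of the most similar labelled profile, or no label when nothing is similar enough.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length score vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify_by_profile(query_profile, labelled_profiles, min_similarity=0.9):
    """Return the label whose profile is most similar to the query.

    Returns None when the best similarity is below min_similarity
    (an assumed cut-off, not a value taken from the source).
    """
    best_label, best_sim = None, 0.0
    for label, profile in labelled_profiles:
        sim = cosine(query_profile, profile)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label if best_sim >= min_similarity else None


# Hypothetical data: each vector holds similarity scores of one sequence
# against three reference genomes, always in the same order.
profiles = [
    ("taxon_A", [0.9, 0.2, 0.1]),
    ("taxon_B", [0.1, 0.8, 0.3]),
]

clear_query = [0.85, 0.25, 0.1]    # closely follows the taxon_A pattern
ambiguous_query = [0.5, 0.5, 0.5]  # resembles neither pattern strongly

print(classify_by_profile(clear_query, profiles))
print(classify_by_profile(ambiguous_query, profiles))
```

In this sketch, raising `min_similarity` leaves more sequences unclassified in exchange for fewer wrong labels. A rank-flexible classifier could instead fall back to a higher taxonomic rank when the match is weak; that step is omitted here.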
Cohabiting microorganisms may interact metabolically via “hand-off points,” the sharing of processed chemicals between organisms. Hand-off points could give an organism access to an otherwise inaccessible biochemical pathway or could split pathways between organisms. This can lead to the formation of a community in which some members depend on others to provide key metabolites that are essential for survival. A metabolic network representing KB-1 metabolism is reconstructed using newly proposed methods. The topology of this network is analyzed for metabolic interactions and dependencies between microbial organisms. The reconstruction of community metabolism suggests metabolic regions that are complementary or redundant between community members, and hand-off point identification suggests possible dependencies between organisms. This network has topological differences from metabolic networks of single organisms. Multiple events affecting the same genome that would normally confuse taxonomic classification, such as lateral gene transfer, create similar patterns of taxonomic affinity across that genome. SPANNER detects these patterns to avoid incorrect assignments arising from these events, enabling accurate classification of KB-1. The KB-1 metabolic network has high connectivity between metabolites, caused by the complementarity of the metabolism of each community member. This network also identified several putative hand-off points between KB-1 community members, with accurate hand-off point detection being highly sensitive to missing or incorrect functional annotations.
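One simple way to picture hand-off point detection is sketched below. It is an illustration under stated assumptions, not the method developed in the thesis. The assumed working definition is that a metabolite is a candidate hand-off point when at least one organism produces it and at least one other organism consumes it without producing it itself. The function name, the reaction tuples, and the organism and metabolite labels are all hypothetical.

```python
from collections import defaultdict


def find_handoff_candidates(reactions):
    """Map each candidate metabolite to (producers, dependents).

    `reactions` is an iterable of (organism, substrates, products) tuples.
    A dependent consumes the metabolite but has no reaction producing it.
    """
    producers = defaultdict(set)
    consumers = defaultdict(set)
    for organism, substrates, products in reactions:
        for metabolite in substrates:
            consumers[metabolite].add(organism)
        for metabolite in products:
            producers[metabolite].add(organism)

    handoffs = {}
    for metabolite, users in consumers.items():
        makers = producers.get(metabolite, set())
        dependents = users - makers
        if makers and dependents:
            handoffs[metabolite] = (sorted(makers), sorted(dependents))
    return handoffs


# Hypothetical annotated reactions for a two-organism community.
reactions = [
    ("org_1", ["A"], ["B"]),  # org_1 converts A to B
    ("org_2", ["B"], ["C"]),  # org_2 needs B but cannot make it
    ("org_2", ["C"], ["D"]),  # C is made and used within org_2
]

print(find_handoff_candidates(reactions))
```

Deleting the first reaction from the list removes the only candidate, which gives a small illustration of why such detection depends heavily on complete and correct functional annotation.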