Genomic Characterization of Five Isolates of Blastocystis Subtype 7
Abstract
Blastocystis spp. are common, genetically diverse anaerobic colonizers of the gastrointestinal tracts of both humans and animals. Although they are traditionally regarded as parasites, their precise role in host health remains unclear, in part because cryptic genetic diversity is masked by morphological similarity among divergent lineages. Subtype 7 (ST7) contains several isolates (C, H, E, G and B) that exhibit distinct phenotypic differences in experimental settings, suggesting genetic diversity within the subtype. This thesis presents a comparative genomic analysis of those five ST7 isolates, with a focus on the genetic differences among them. The genomes of four of the five isolates were newly sequenced using a long-read-based assembly approach. The resulting assemblies are >98% complete, range from 19.8 (ST7C) to 20.2 (ST7G) Mbp in size, and are highly syntenic. Short-read data suggest that these Blastocystis ST7 isolates are haploid. For two of the newly sequenced genomes, ST7C and ST7G, the assemblies were complete enough to capture 8 and 9 complete chromosomes, respectively. Through comparison of shared syntenic scaffolds among isolates, 16 nuclear chromosomes could be hypothetically reconstructed in ST7C. Comparative analyses revealed clear evidence of genomic rearrangements in ST7G and ST7H relative to ST7C. Manual curation was used to create a high-quality gene annotation for ST7C, which was propagated to the other genomes, yielding 8625-8896 predicted genes per genome. The predicted gene set for ST7B contained 2593 more genes than a previously published genome analysis of this isolate. A pangenome analysis of the five isolates identified a small accessory genome (228) compared with a far larger core gene set (6198) shared by all the isolates.
An abundance of endogenous viral elements belonging to the Midsized Eukaryotic Linear dsDNA (MELD) virus class was found, and comparative analyses suggest that the MELD viruses first invaded the common ancestor of ST7 and ST6 and then greatly proliferated in the ST7 genomes. The analyses presented here offer insight into the genomic structure of these Blastocystis ST7 isolates and the differences between them.
Description
Blastocystis is a common eukaryotic member of the gut microbiome that has uncertain impacts on human health. In this thesis, the genomes of five isolates of Blastocystis subtype 7 (ST7) are compared, four of which were newly sequenced for this project. Gene predictions were created for all of the genomes, or updated in the case of ST7B. Few genomic differences were identified among the five isolates, which were found to be highly similar to one another. This work also describes the macroscale genomic characteristics of the isolates, indicating that they are most likely haploid and that ST7C has roughly 16 nuclear chromosomes and a single mitochondrion-related organelle genome. Of particular interest are the extensive copies of integrated Midsized Eukaryotic Linear dsDNA virus sequences in ST7, in contrast to other subtypes. This suggests that the presence of these sequences is largely specific to ST7 and distinguishes it from the other subtypes.
Keywords
Blastocystis, Genomics, Sequencing, Subtype 7, Comparative Genomics
