GEOREFERENCED TREES AND THE PHYLOGENETIC SIMILARITY OF BIOLOGICAL COMMUNITIES
Culture-independent DNA sequencing is being used to recover genetic material directly from environmental samples. This has spurred large-scale community efforts to catalogue the diversity of life and its geographic distribution using molecular data. These initiatives stand to revolutionize our understanding of the processes that shape biodiversity and may ultimately provide critical information for setting public health, environmental, and economic policies. To achieve these aims, new tools are required to explore these large biogeographic datasets effectively. This thesis introduces a novel technique for visualizing hierarchically organized data in a geographic context. The technique illustrates the influence of a geographic or environmental gradient on the phylogenetic relationships between organisms or on the similarity of biological communities. It is incorporated into GenGIS, open-source software that supports the integration of digital map data with genetic sequences and environmental information from multiple sample sites. GenGIS addresses the need for an interactive geospatial analysis environment capable of handling large biogeographic datasets in which a wealth of sequence data is available for each sample site. It does so through a rich set of analysis options that produce georeferenced visualizations for data exploration and hypothesis generation. Studies conducted by the author and by other research groups have used GenGIS to investigate the diversity of viruses, bacteria, plants, animals, and even language families. The thesis then explores measures of beta diversity, which aim to assess the influence of geographic or environmental gradients on the similarity of biological communities. In particular, it examines phylogenetic beta-diversity measures, which determine community variation by considering the relationships between organisms in a phylogenetic tree. 
A large comparative study was performed to assess specific properties and performance characteristics of these measures. Many measures of phylogenetic beta diversity were found to be robust to sequence clustering, the addition of an outlying basal lineage, root placement, and the presence of rare organisms. Performance, however, was found to differ substantially under different models of community variation. Finally, this thesis describes how an important class of phylogenetic beta-diversity measures can be calculated over phylogenetic networks in order to account for uncertainty and conflict in inferred ancestral relationships.
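
To make concrete what it means for a beta-diversity measure to consider relationships in a phylogenetic tree, the following minimal sketch computes unweighted UniFrac, one widely used measure of this kind. The abstract does not name UniFrac, so its choice here is an assumption made purely for illustration, as are the toy tree, the branch lengths, the function names, and the child-to-parent dictionary layout. Branches on the path from each observed organism to the root are counted, which is one common convention.

```python
def branches_to_root(leaf, tree):
    """Return the branches (named by their child node) from a leaf up to the root."""
    path = set()
    node = leaf
    while node in tree:          # the root has no entry, so the walk stops there
        path.add(node)
        node = tree[node][0]     # move to the parent
    return path


def unweighted_unifrac(tree, community_a, community_b):
    """Fraction of observed branch length that leads to only one community.

    tree maps each non-root node to (parent, branch_length); this layout is
    a hypothetical choice for the sketch, not a format used by GenGIS.
    """
    covered_a = set().union(*(branches_to_root(t, tree) for t in community_a))
    covered_b = set().union(*(branches_to_root(t, tree) for t in community_b))
    observed = covered_a | covered_b
    unique = observed - (covered_a & covered_b)

    def total_length(branches):
        return sum(tree[b][1] for b in branches)

    return total_length(unique) / total_length(observed)


# Toy tree (hypothetical): ((A:1, B:1)X:0.5, (C:1, D:1)Y:0.5)root
tree = {
    "A": ("X", 1.0),
    "B": ("X", 1.0),
    "C": ("Y", 1.0),
    "D": ("Y", 1.0),
    "X": ("root", 0.5),
    "Y": ("root", 0.5),
}

print(unweighted_unifrac(tree, {"A", "B"}, {"B", "C"}))  # 0.625
```

In this toy case the two communities share the branches leading to B and X (length 1.5) out of 4.0 units of observed branch length, so 2.5 / 4.0 = 0.625 of the observed tree is unique to one community. Identical communities score 0 and communities with no shared branches score 1.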