DalSpace Institutional Repository

DalSpace is a digital service that collects, preserves, and distributes digital material produced by the Dalhousie community.

  • To learn about content guidelines, policies, and how to deposit, view the Help documents.
  • Contact us to get started submitting content to DalSpace at dalspace@dal.ca

Recent Submissions

  • Item type: Item, Access status: Open Access
    Neural Compression for Scalable Question-Answer Retrieval
    (2025-12-11) Khiet, Moamen; Yes; Master of Computer Science; Faculty of Computer Science; Not Applicable; n/a; Not Applicable; Dr. Hassan Sajjad; Dr. Frank Rudzicz; Dr. Ga Wu
    Question-answering systems at scale face fundamental performance barriers when traditional vector databases transition from exact to approximate search, causing substantial degradation in both query throughput and retrieval quality. While compression can address these challenges, existing compression approaches either apply generic transformations ignoring retrieval task structure (PCA) or require retraining entire embedding models (Matryoshka), limiting practical applicability. This thesis introduces neural compression for question-answer retrieval through two-stage learning that compresses 384-dimensional context embeddings to 32 or 64 dimensions while preserving semantic information. The approach trains an autoencoder to compress context embeddings using cosine similarity loss, then trains a mapper network to predict compressed codes directly from question embeddings using mean squared error loss in compressed space. Both networks are trained on the training split (72%, 86,400 pairs). We evaluate this approach on 120,000 question-answer pairs spanning six knowledge domains, comparing against six baseline methods (FAISS, HNSW, ScaNN, PCA, Matryoshka, zero-shot) across six dataset scales (20K to 120K) with three iterations per configuration, totaling 198 experimental runs. Results demonstrate that neural compression achieves a ROUGE-1 score of 0.1725 compared to FAISS’s 0.1624 (a 6.2% improvement) while reducing storage from 184 MB to 7.7 MB (96% reduction) and increasing throughput from 151 to 7,861 queries per second (52× speedup). Neural compression is the only method whose quality improves with scale (+2.5% from 20K to 120K samples) while all baselines degrade (-6% to -13%). The performance crossover occurs at approximately 40K samples, earlier than hypothesized, as FAISS quality degrades from curse-of-dimensionality effects before its algorithmic transition to approximation.
These results show that task-specific learned compression with an asymmetric architecture, which compresses only contexts while keeping questions full-dimensional, enables exact search at scales where high-dimensional methods must approximate, fundamentally changing the scalability characteristics of retrieval systems.
  • Item type: Item, Access status: Open Access
    Assessing the effects of sampling design and data integration in spatio-temporal fisheries models
    (2025-12-11) McDonald, Raphaël; No; Doctor of Philosophy; Department of Mathematics & Statistics - Statistics Division; Not Applicable; Dvora Hart; Yes; Orla Murphy; Craig Brown; Joanna Mills Flemming; David Keith
    Many communities around the world depend on the socio-cultural and economic benefits derived from fisheries. To ensure these continued benefits, fisheries management relies on accurate estimates of fish population size and health obtained through the stock assessment process. This thesis develops and improves upon various statistical methods used within the stock assessment process. We first show how common sampling designs do not strongly impact model results, but also how sampling effort allocation can introduce time-varying biases if the strata-specific sampling effort is not proportional to strata size. We next turn our attention to analyses that occur early in the stock assessment process, such as length-weight relationships. While complex statistical models are often necessary to estimate these relationships, their outputs are routinely used without accounting for their associated uncertainties. We present simple methods to highlight the increase in precision achieved by error propagation. The next two chapters focus on developing methods for the inclusion of novel types of data within stock assessment models. The first is a benthoscape map. Focusing on sea scallops in the Bay of Fundy, we demonstrate how benthoscape maps can be interpreted as habitat features and help inform the estimation of catchability and probabilities of encounter to improve parameter estimation and identify biologically meaningful area boundaries. The second is drop camera data. We propose various model modifications to account for the resolution difference between datasets. This work identifies a previously unknown relationship between the size of survey tows, the underlying spatial distribution of the population, and the accuracy of the survey. High aggregation or high noise in the spatial distribution results in the survey indices being more precise than the drop camera indices even with substantially smaller sample sizes. 
This thesis develops novel statistical methods useful to the broader field of fisheries science, with contributions for improved data integration, error propagation, and sampling design, while opening various new avenues of future work in all these same fields.
  • Item type: Item, Access status: Open Access
    GRPO-Rad: Group Relative Policy Optimization for Radiology Report Summarization
    (2025-12-10) Nassiri, Fargol; Not Applicable; Master of Computer Science; Faculty of Computer Science; Not Applicable; N/A; No; Vlado Keselj; Hassan Sajjad; Frank Rudzicz
    Radiology report summarization requires condensing detailed findings into concise impressions, a task where traditional supervised fine-tuning (SFT) often struggles to balance syntactic correctness, clinical accuracy, and brevity. This thesis investigates Group Relative Policy Optimization (GRPO) as a superior alternative, enabling direct optimization of a composite reward function combining ROUGE-L syntactic similarity and a length constraint. Using the MIMIC-III dataset and Qwen 3.0 decoder-only models (0.6B and 1.7B parameters) with parameter-efficient LoRA fine-tuning, we systematically evaluate 24 configurations varying model size, prompting, and few-shot learning. Results demonstrate that GRPO consistently outperforms both the zero-shot baseline and SFT across syntactic (ROUGE-L) and clinical (F1-RadGraph) metrics. The optimal GRPO configuration achieves 32.65 ROUGE-L and 30.28 F1-RadGraph, representing a 16% improvement over SFT with statistical significance (p < 0.05). This work presents the first application of GRPO to medical text, establishing it as a robust framework for clinical documentation tasks requiring multi-objective optimization.
  • Item type: Item, Access status: Open Access
    Design, Simulation, and Techno-Economic Evaluation of a Novel Process for Ethanol Recovery from Fermentation Broths
    (2025-12-09) Campos Assumpcao de Amarante, Rafael; Not Applicable; Doctor of Philosophy; Department of Process Engineering and Applied Science; Not Applicable; Dr. Xianshe Feng; Yes; Dr. Jan Haelssig; Dr. Darrel Doman; Dr. Adam Donaldson
    Addressing rising energy needs, particularly in industry, transportation, and heating, while simultaneously decreasing the rate of greenhouse gas emissions is a critical challenge facing society today. Liquid biofuels produced from renewable sources, such as ethanol, have received increased research attention because they show long-term viability potential and fit the current transportation infrastructure. Ethanol is produced by the anaerobic fermentation of sugars obtained from renewable biomass. Conventionally, ethanol is recovered from the aqueous fermentation broth using at least two distillation steps combined with a dehydration step. However, this configuration requires large amounts of energy and is particularly inefficient, mainly due to the low ethanol concentrations achieved in the fermentation broth and the formation of an ethanol-water azeotrope. In this thesis, a new extraction-pervaporation system was proposed and investigated for use in ethanol recovery. In this process, ethanol is extracted from the fermentation broth using organic solvents and recovered from this mixture by pervaporation. It was envisioned that this new process configuration would lead to improved energy efficiency and economics for the ethanol production process. The proposed process was investigated using numerical and experimental techniques. New polymeric membranes were developed and were shown to be effective in separating ethanol/2-ethylhexanol and ethanol/pentanol mixtures. The permeates were obtained at concentrations similar to or higher than those achievable in enriching distillation columns. The pervaporation mass transfer process was modelled accounting for the effect of concentration polarization and for concentration-dependent diffusion in the membrane layer, and results showed that, under the conditions studied, the mass transfer resistance in the membrane controls the pervaporation process.
It was also shown that higher temperatures and feed concentrations result in higher fluxes through the membrane. A numerical pervaporation model incorporating a membrane permeability model based on the results achieved in previous sections was developed. Ethanol extraction from the broth was also modelled, as well as the fermentation process. A techno-economic analysis was carried out and results indicated that, although the current economic performance is lacking, the extraction-pervaporation process is more energy-efficient than the distillation process, achieving similar ethanol concentrations while requiring less than half the energy input.
  • Item type: Item, Access status: Embargo
    ANALYSIS OF SUSTAINABLE MINING SUPPLY CHAINS
    (2025-12-10) Amegboleza, Angela Akofa; Not Applicable; Doctor of Philosophy; Department of Industrial Engineering; Not Applicable; Dr. Mehmet Gumus; Yes; Dr. Uday Venkatadri; Dr. Alexander Engau; Dr. Yana Fedortchouk; Dr. Ali Ulku
    The future of mining relies on solutions that are sustainable, efficient, and credible for the communities that live alongside mining operations. This research provides an integrated framework that addresses both ends of the spectrum: artisanal and small-scale mining (ASM), where immediate water conservation and basic treatment capabilities are crucial, and large-scale mining operations, where decarbonized energy logistics within transparent governance structures are essential. It also provides a unified decision analytics framework for conflict management in mining communities. This research addresses these needs through four interrelated studies: first, a review of sustainable operational efficiency (SOE) in the mining industry (MI); second, the establishment of multi-stage water treatment networks that meet regulatory standards for reuse; third, the development of transport modes for natural gas (NG) delivery to decrease the effects of traditional fuels; and finally, the implementation of commitment mechanisms that transform benefit-sharing and grievance processes from aspirations into realities. At the core of this framework are multi-objective optimization models that evaluate trade-offs aligned with the quadruple bottom line (QBL) of the sustainability framework, as well as decision analysis models that translate stakeholder preferences into clear priorities through sensitivity assessments. Together, this dissertation contributes a cohesive, field-ready methodology for designing mining systems for both ASM and large-scale mining that preserve waterbodies, minimize emissions, and secure license to operate by aligning engineering design with responsible, community-informed governance.
  • Item type: Item, Access status: Open Access
    Linking foraging behaviour of female grey seals (Halichoerus grypus) to population size, diet, and reproductive success
    (2025-12-10) Henry-Adams, Max; Not Applicable; Master of Science; Department of Biology; Received; Xavier Bordeleau; Not Applicable; Dr. Nell Den Heyer; Dr. Robert Lennox; Dr. Sara Iverson; Dr. Don Bowen
    Understanding the movement patterns and foraging behaviour that marine predators use to navigate and adapt to patchy and unpredictable prey availability has important implications for individual fitness. Capital breeding animals, for which the acquisition and storage of energy is critical to financing the costs of reproduction, offer interesting model systems for studying the relationships between foraging behaviour, diet, and reproductive success. Northwest Atlantic grey seal (Halichoerus grypus) females are wide-ranging, long-lived capital breeders that are known to undergo extensive foraging trips in the pre-breeding period, during which they gain body mass prior to expending roughly a third or more of parturition body mass to support a single precocial pup during a short lactation period. Using hidden Markov model estimates of foraging behaviour derived from satellite telemetry data obtained between 1995 and 2018, this study first uses generalized linear models (GLMs) to explore the potential effects of age and population density-dependent regulation of foraging behaviour in a grey seal population experiencing decelerating population growth. Results from 62 female deployments suggest that foraging behaviour leading up to the breeding season is impacted by increasing population size, likely because of intraspecific competition for access to the main prey during this foraging period. Although the impact of female age was smaller than that of population size, I found a dome-shaped pattern where the total distance travelled and the number of foraging locations used peaked in prime-age females. To test whether these differences in foraging tactics relate to variation in female diet, GLMs were then used to relate estimates of foraging behaviour from 50 deployments to diet estimated at instrument recovery through stable isotope (SI) signatures and quantitative fatty acid signature analysis (QFASA).
The total distance travelled was linked to the spatial distribution of the primary prey species found in the diet, where individuals with more redfish (Sebastes sp.) in the diet travelled greater distances than females with sand lance (Ammodytes dubius)/mixed diets. Females that travelled greater distances and spent more time in area-restricted search also consumed relatively energy-dense prey species. Finally, foraging behaviour and diet estimates were related to maternal postpartum mass (MPPM) and pup weaning mass to evaluate the fitness consequences associated with different foraging tactics and diets. The foraging tactics used appeared to depend on female size, with those estimated to be using more benthic foraging tactics having a higher MPPM. At the same time, diet diversity was negatively related to MPPM, but energy density was not, suggesting that larger females may have been foraging more efficiently or using a “quantity over quality” foraging tactic, although future studies using simultaneous estimates of the quantity of prey consumed would be required to confirm this hypothesis. Regardless, there was no detectable relationship between estimated foraging effort or diet metrics and pup weaning mass once MPPM and maternal age were included in models. With these data collected on an increasing grey seal population, the evidence suggests that the environmental conditions and prey abundance were sufficient to support multiple foraging tactics leading to high population-level reproductive rates. This work demonstrates that individual variation in foraging behaviour and diet contributes to individual fitness and population dynamics in free-ranging marine predators.
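
Illustrative note on the "Neural Compression for Scalable Question-Answer Retrieval" abstract listed above: the abstract names two training objectives, a cosine similarity loss for the autoencoder and a mean squared error loss for the mapper in compressed space. The sketch below shows what those two objectives might look like on toy vectors. It is a minimal illustration, not the thesis code: the function names, the "1 - cosine similarity" form of the first loss, and the toy values are all assumptions made here for clarity.

```python
import math


def cosine_loss(original, reconstruction):
    """Stage 1 objective (assumed form): 1 - cosine similarity between a
    context embedding and its autoencoder reconstruction."""
    dot = sum(a * b for a, b in zip(original, reconstruction))
    norm_original = math.sqrt(sum(a * a for a in original))
    norm_reconstruction = math.sqrt(sum(b * b for b in reconstruction))
    return 1.0 - dot / (norm_original * norm_reconstruction)


def mse_loss(predicted_code, target_code):
    """Stage 2 objective: mean squared error between the mapper's predicted
    code for a question and the encoder's code for the paired context."""
    n = len(target_code)
    return sum((p - t) ** 2 for p, t in zip(predicted_code, target_code)) / n


# Toy stage 1 example: the reconstruction points in the same direction as the
# original but has a different length, so the cosine loss is zero.
context = [1.0, 2.0, 2.0]
reconstruction = [2.0, 4.0, 4.0]
stage1 = cosine_loss(context, reconstruction)

# Toy stage 2 example: hypothetical 2-dimensional codes standing in for the
# 32- or 64-dimensional compressed space mentioned in the abstract.
context_code = [0.5, -1.0]
mapped_question_code = [0.5, 1.0]
stage2 = mse_loss(mapped_question_code, context_code)

print(stage1, stage2)
```

One consequence visible in the toy values: the cosine objective ignores vector length and penalizes only direction, while the mean squared error objective penalizes every coordinate difference in the compressed space.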