TRAINING AND EVALUATING THE USE OF LARGE LANGUAGE MODELS (LLMS) IN THE DOMAIN OF CANADIAN NUCLEAR INDUSTRY

Anwar, Muhammad Saleh

dc.contributor.author	Anwar, Muhammad Saleh
dc.contributor.copyright-release	Not Applicable
dc.contributor.degree	Master of Science
dc.contributor.department	Department of Engineering Mathematics & Internetworking
dc.contributor.ethics-approval	Not Applicable
dc.contributor.external-examiner	N/A
dc.contributor.manuscripts	Not Applicable
dc.contributor.thesis-reader	Dr. Guy Kember
dc.contributor.thesis-reader	Dr. Kamal El-Sankary
dc.contributor.thesis-supervisor	Dr. Issam Hammad
dc.date.accessioned	2025-07-14T14:37:20Z
dc.date.available	2025-07-14T14:37:20Z
dc.date.defence	2025-06-27
dc.date.issued	2025-07-10
dc.description.abstract	This thesis addresses the challenges of accuracy, reliability, data privacy, and resource constraints in applying Large Language Models (LLMs) to the Canadian nuclear industry. It presents a multi-faceted approach by evaluating existing models, developing synthetic data generation techniques, and training a secure, domain-specific LLM from scratch. The research first demonstrates that while general-purpose LLMs are prone to factual inaccuracies on nuclear-specific topics, their reliability is significantly improved by integrating a Retrieval-Augmented Generation (RAG) framework. This approach enhances factual accuracy by grounding responses in verified, domain-specific documents. To overcome data scarcity and confidentiality barriers, the thesis pioneers a methodology for generating synthetic, structured question-and-answer pairs from unstructured nuclear texts using LLMs. This scalable and privacy-preserving approach creates valuable, model-ready datasets for training and evaluation without exposing sensitive information. Furthermore, the work validates the feasibility of developing a secure, private LLM from scratch. By training a compact model on a single GPU using the "Essential CANDU" textbook, it demonstrates a practical path for creating in-house models that mitigate cybersecurity risks and can learn specialized terminology within a resource-constrained and secure environment. Collectively, this research provides a comprehensive framework for integrating LLM technology safely and effectively into the nuclear industry, establishing a foundation for advanced AI tools that enhance knowledge management and operational support.
dc.identifier.uri	https://hdl.handle.net/10222/85209
dc.language.iso	en
dc.subject	LARGE LANGUAGE MODELS
dc.subject	Artificial Intelligence
dc.subject	Nuclear Power
dc.subject	Generative AI
dc.title	TRAINING AND EVALUATING THE USE OF LARGE LANGUAGE MODELS (LLMS) IN THE DOMAIN OF CANADIAN NUCLEAR INDUSTRY

Files

Original bundle

Name: MuhammadAnwar2025.pdf
Size: 2.09 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.12 KB
Format: Item-specific license agreed upon to submission

Collections

Faculty of Graduate Studies Online Theses