TRAINING AND EVALUATING THE USE OF LARGE LANGUAGE MODELS (LLMS) IN THE DOMAIN OF CANADIAN NUCLEAR INDUSTRY

dc.contributor.author: Anwar, Muhammad Saleh
dc.contributor.copyright-release: Not Applicable
dc.contributor.degree: Master of Science
dc.contributor.department: Department of Engineering Mathematics & Internetworking
dc.contributor.ethics-approval: Not Applicable
dc.contributor.external-examiner: N/A
dc.contributor.manuscripts: Not Applicable
dc.contributor.thesis-reader: Dr. Guy Kember
dc.contributor.thesis-reader: Dr. Kamal El-Sankary
dc.contributor.thesis-supervisor: Dr. Issam Hammad
dc.date.accessioned: 2025-07-14T14:37:20Z
dc.date.available: 2025-07-14T14:37:20Z
dc.date.defence: 2025-06-27
dc.date.issued: 2025-07-10
dc.description.abstract: This thesis addresses the challenges of accuracy, reliability, data privacy, and resource constraints in applying Large Language Models (LLMs) to the Canadian nuclear industry. It presents a multi-faceted approach by evaluating existing models, developing synthetic data generation techniques, and training a secure, domain-specific LLM from scratch. The research first demonstrates that while general-purpose LLMs are prone to factual inaccuracies on nuclear-specific topics, their reliability is significantly improved by integrating a Retrieval-Augmented Generation (RAG) framework. This approach enhances factual accuracy by grounding responses in verified, domain-specific documents. To overcome data scarcity and confidentiality barriers, the thesis pioneers a methodology for generating synthetic, structured question-and-answer pairs from unstructured nuclear texts using LLMs. This scalable and privacy-preserving approach creates valuable, model-ready datasets for training and evaluation without exposing sensitive information. Furthermore, the work validates the feasibility of developing a secure, private LLM from scratch. By training a compact model on a single GPU using the "Essential CANDU" textbook, it demonstrates a practical path for creating in-house models that mitigate cybersecurity risks and can learn specialized terminology within a resource-constrained and secure environment. Collectively, this research provides a comprehensive framework for integrating LLM technology safely and effectively into the nuclear industry, establishing a foundation for advanced AI tools that enhance knowledge management and operational support.
dc.identifier.uri: https://hdl.handle.net/10222/85209
dc.language.iso: en
dc.subject: LARGE LANGUAGE MODELS
dc.subject: Artificial Intelligence
dc.subject: Nuclear Power
dc.subject: Generative AI
dc.title: TRAINING AND EVALUATING THE USE OF LARGE LANGUAGE MODELS (LLMS) IN THE DOMAIN OF CANADIAN NUCLEAR INDUSTRY

Files

Original bundle

Name: MuhammadAnwar2025.pdf
Size: 2.09 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.12 KB
Format: Item-specific license agreed upon to submission
Description: