Practical Application of Large Language Models in the Nuclear Power Generation Industry
| dc.contributor.author | de Costa, Mishca | |
| dc.contributor.copyright-release | Not Applicable | |
| dc.contributor.degree | Master of Science | |
| dc.contributor.department | Department of Engineering Mathematics & Internetworking | |
| dc.contributor.ethics-approval | Not Applicable | |
| dc.contributor.external-examiner | n/a | |
| dc.contributor.manuscripts | Not Applicable | |
| dc.contributor.thesis-reader | Dr. Guy Kember | |
| dc.contributor.thesis-reader | Dr. Hamed Aly | |
| dc.contributor.thesis-supervisor | Dr. Issam Hammad | |
| dc.date.accessioned | 2025-12-09T15:26:37Z | |
| dc.date.available | 2025-12-09T15:26:37Z | |
| dc.date.defence | 2025-11-27 | |
| dc.date.issued | 2025-12-04 | |
| dc.description.abstract | Nuclear power organizations steward large, fragmented, and partially digitized textual corpora that impede knowledge access and complicate decision support. Effective application of large language models (LLMs) is further hindered by dense, site-specific Canada Deuterium Uranium (CANDU) jargon and multi-definition acronyms underrepresented in public training data. Existing applied nuclear LLM efforts emphasize isolated retrieval or proprietary model announcements, leaving gaps in integrated, auditable workflows for safety classification, terminology normalization, and structured outputs. This thesis contributes a deployment-first, prompt-centric framework comprising: (i) a GPT-based Station Condition Record (SCR) safety event scoring approach emphasizing balanced recall/precision over an imbalanced corpus; (ii) an ensemble jargon detection and expansion pipeline combining LLM heuristics with deterministic and probabilistic methods; and (iii) structured output / function-calling patterns that enhance traceability and governance readiness when mining data from structured nuclear databases using hybrid NL-to-SQL techniques. Collectively, the results provide an evidence-based blueprint indicating when prompt engineering plus glossary normalization can defer costly domain pretraining, complementing parallel work on retrieval augmentation and secure local experimentation. | |
| dc.identifier.uri | https://hdl.handle.net/10222/85546 | |
| dc.language.iso | en | |
| dc.subject | nuclear | |
| dc.subject | llm | |
| dc.title | Practical Application of Large Language Models in the Nuclear Power Generation Industry |
