Introduction of a domain-specific Large Language Model (LLM) for nuclear applications, trained on the Essential CANDU textbook, with the goal of protecting sensitive data in nuclear operations.
The model uses a compact Transformer-based architecture trained on a single GPU, demonstrating an understanding of specialized nuclear vocabulary despite some limitations in syntactic coherence.
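The shape of such a single-GPU model can be sketched as a small decoder-style Transformer. The hyperparameters below (vocabulary size, depth, width) are illustrative assumptions, not the configuration reported in the work:

```python
import torch
import torch.nn as nn

class CompactLM(nn.Module):
    """Minimal Transformer language model sized for a single GPU.
    All hyperparameters are illustrative assumptions, not the
    paper's reported configuration."""
    def __init__(self, vocab_size=8000, d_model=256, n_heads=4,
                 n_layers=4, max_len=512):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads,
            dim_feedforward=4 * d_model, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, idx):
        # idx: (batch, seq) token ids
        b, t = idx.shape
        pos = torch.arange(t, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        # Causal mask: each position attends only to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(t)
        x = self.blocks(x, mask=mask)
        return self.lm_head(x)  # (batch, seq, vocab) logits

model = CompactLM()
logits = model(torch.randint(0, 8000, (2, 16)))
print(logits.shape)  # torch.Size([2, 16, 8000])
```

A model of this size (a few million parameters) trains comfortably on one GPU, which matches the in-house, data-confidential setting the work targets.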
Focus on in-house LLM solutions that meet cybersecurity and data-confidentiality standards, highlighting early successes in text generation alongside the need for broader dataset coverage and better preprocessing.
Future directions include expanding the dataset to cover diverse nuclear subtopics, improving tokenization, and evaluating the model's readiness for practical use in the nuclear domain.
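One concrete direction for improved tokenization is protecting multiword domain terms so they map to single tokens rather than being split apart. The sketch below is a hypothetical illustration of that idea, not the preprocessing actually used in the work; the term list is invented for the example:

```python
import re

# Illustrative list of multiword nuclear terms to preserve as
# single tokens (an assumption, not the paper's actual vocabulary).
PROTECTED_TERMS = ["heavy water", "pressure tube", "fuel bundle"]

def tokenize(text):
    # Join each protected term with underscores so it survives
    # the word-level split as one token.
    for term in PROTECTED_TERMS:
        text = re.sub(re.escape(term), term.replace(" ", "_"),
                      text, flags=re.IGNORECASE)
    # Split into words and standalone punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("The CANDU reactor uses heavy water as moderator.")
print(tokens)
# ['The', 'CANDU', 'reactor', 'uses', 'heavy_water', 'as', 'moderator', '.']
```

Keeping domain terms intact gives the model denser, more meaningful units, which can help a small model trained on a narrow corpus such as a single textbook.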