Nation-states and other sophisticated threat actors can undermine cybersecurity by poisoning widely used open-source AI models, creating detection blind spots in the defenses of critical infrastructure and government agencies.
Attacks on AI models fall into two categories: pre-training attacks, which embed malicious data into the model during training, and post-training attacks, which are executed after the model is deployed.
Examples include Model Poisoning, a pre-training attack in which tainted data trains the model to ignore certain attack signatures, and Adversarial Evasion, in which crafted inputs exploit the blind spots a compromised model has learned; a sketch of the poisoning step follows.
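To make the poisoning mechanism concrete, here is a minimal, hedged sketch of a label-flipping poisoning attack against a toy detector. Everything in it is synthetic and invented for illustration: feature 0 stands in for a hypothetical attack signature, and scikit-learn's LogisticRegression stands in for whatever model a real pipeline would use.

```python
# Minimal sketch of label-flipping data poisoning against a toy detector.
# Feature 0 stands in for a hypothetical attack signature; all other
# features are noise, and the data is entirely synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic training set: 1,000 samples, 8 binary features.
# Clean ground truth: signature present (feature 0 == 1) => malicious.
X = rng.integers(0, 2, size=(1000, 8)).astype(float)
y = X[:, 0].astype(int)

# Poisoning step: the attacker relabels roughly half of the
# signature-bearing samples as benign, so the model learns to
# discount the signature instead of alerting on it.
poison_mask = (X[:, 0] == 1) & (rng.random(1000) < 0.5)
y_poisoned = y.copy()
y_poisoned[poison_mask] = 0

clean_model = LogisticRegression(max_iter=1000).fit(X, y)
poisoned_model = LogisticRegression(max_iter=1000).fit(X, y_poisoned)

# Score a sample that carries the signature: the clean model flags it
# with high confidence, while the poisoned model hovers near chance.
attack = np.zeros((1, 8))
attack[0, 0] = 1.0
print("clean    P(malicious):", clean_model.predict_proba(attack)[0, 1])
print("poisoned P(malicious):", poisoned_model.predict_proba(attack)[0, 1])
```

On this toy data the clean model's confidence is near 1.0 while the poisoned model's drops toward 0.5; in a production detector the same drop would translate into missed alerts for the targeted signature.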
Post-training attacks instead probe the deployed model, reverse-engineering its detection capabilities and exploiting predictable blind spots. This makes rigorous adversarial testing essential before open-source AI is integrated into security solutions.
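The sketch below makes the post-training threat concrete: a black-box probe against a deployed detector, using only query access to discover which features drive its verdicts. The detector, its feature semantics, and the probe_for_evasion helper are all hypothetical stand-ins invented for illustration.

```python
# Hedged sketch of post-training, black-box probing: an attacker (or a
# red team) with query-only access toggles one feature at a time on a
# known-malicious sample until the verdict flips, revealing which
# features drive detection. Detector and data are synthetic stand-ins.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic deployed detector: malicious iff both feature 0 (a
# signature) and feature 1 (say, a packer flag) are set.
X = rng.integers(0, 2, size=(2000, 8)).astype(float)
y = ((X[:, 0] == 1) & (X[:, 1] == 1)).astype(int)
detector = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

def probe_for_evasion(model, sample):
    """Greedy black-box search: toggle each feature in turn and keep the
    first change that flips the verdict to benign (0)."""
    for i in range(len(sample)):
        trial = sample.copy()
        trial[i] = 1.0 - trial[i]
        if model.predict(trial.reshape(1, -1))[0] == 0:
            return i, trial  # feature i is a lever the detector relies on
    return None, sample

malicious = np.ones(8)  # a sample the detector currently flags
idx, evaded = probe_for_evasion(detector, malicious)
print("verdict before:", detector.predict(malicious.reshape(1, -1))[0])
print("toggled feature:", idx,
      "| verdict after:", detector.predict(evaded.reshape(1, -1))[0])
```

Run from the defender's side before integration, the same probe doubles as a basic adversarial test: if a single-feature toggle silences an alert, the model's decision boundary is predictable enough for an adversary to find it too.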