JavelinGuard is a suite of low-cost, high-performance model architectures designed for detecting malicious intent in Large Language Model (LLM) interactions.
The suite includes five progressively sophisticated transformer-based architectures named Sharanga, Mahendra, Vaishnava, Ashwina, and Raudra, each offering unique trade-offs in speed, interpretability, and resource requirements.
The models are benchmarked across diverse adversarial datasets and outperform leading open-source guardrail models and large decoder-only LLMs such as GPT-4o in accuracy and latency.
The results highlight Raudra's multi-task design as offering the most robust performance overall, and they give practitioners guidance for choosing the balance of complexity and efficiency that suits real-world LLM security applications.