Adversarial patch attacks pose a significant threat to vision systems, introducing localized perturbations that deceive deep models. Traditional defense methods often require retraining or fine-tuning, making them impractical for real-world deployment.
A new training-free Visual Retrieval-Augmented Generation (VRAG) framework is proposed for adversarial patch detection, integrating Vision-Language Models (VLMs).
VRAG leverages generative reasoning by retrieving visually similar patches and images to identify diverse attack types without additional training.
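The retrieve-then-reason pipeline described above can be sketched in a minimal form: embed the query image, retrieve the most similar reference patches, and place them in the VLM's context for a training-free classification decision. The function names, the toy embeddings, and the prompt wording below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def retrieve_top_k(query_emb, db_embs, k=3):
    # Cosine similarity between the query embedding and each reference
    # embedding; returns indices of the k most similar references.
    q = query_emb / np.linalg.norm(query_emb)
    db = db_embs / np.linalg.norm(db_embs, axis=1, keepdims=True)
    sims = db @ q
    return np.argsort(-sims)[:k]

def build_vlm_prompt(retrieved_labels):
    # Retrieved examples are placed in the VLM context so the model can
    # reason about the query without any additional training (hypothetical
    # prompt wording for illustration only).
    examples = "\n".join(f"- reference patch labeled: {lbl}" for lbl in retrieved_labels)
    return (
        "You are shown reference adversarial patches:\n"
        f"{examples}\n"
        "Does the query image contain a similar adversarial patch? Answer yes or no."
    )

# Toy demo: four reference embeddings with assumed attack-type labels.
rng = np.random.default_rng(0)
db_embs = rng.normal(size=(4, 8))
labels = ["naturalistic", "printable", "GAN-generated", "benign"]
query = db_embs[2] + 0.01 * rng.normal(size=8)  # near the GAN-generated patch
idx = retrieve_top_k(query, db_embs, k=2)
prompt = build_vlm_prompt([labels[i] for i in idx])
```

In a real system the embeddings would come from a vision encoder and the prompt would be sent to a VLM such as those evaluated in the paper; the sketch only shows how retrieval conditions the generative reasoning step.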
Several large-scale VLMs are evaluated, including Qwen-VL-Plus, Qwen2.5-VL-72B, and UI-TARS-72B-DPO; UI-TARS-72B-DPO achieves a state-of-the-art 95 percent classification accuracy among open-source models for adversarial patch detection.
The closed-source Gemini-2.0 model achieves the highest overall accuracy of 98 percent.
Experimental results demonstrate VRAG's efficacy in detecting diverse adversarial patches with minimal human annotation, offering a promising defense against evolving attacks.