Sparse Autoencoders (SAEs) are explored as a lightweight, interpretable alternative for bug detection in Java functions, targeting software vulnerabilities such as buffer overflows and SQL injection. SAEs are proposed to address the challenges that the complexity and opacity of Large Language Models (LLMs) pose for vulnerability detection and secure code generation.
Evaluation shows that SAE-derived features enable bug detection with an F1 score of up to 89%, outperforming fine-tuned transformer encoder baselines.
This study provides empirical evidence that SAEs can detect software bugs directly from the internal representations of pretrained LLMs without requiring fine-tuning or task-specific supervision.
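To make the recipe concrete, the sketch below illustrates the general pipeline such a study implies: read hidden activations from a frozen pretrained model, encode them with an SAE, and fit a lightweight probe on the resulting sparse features. This is a minimal, hedged sketch, not the paper's actual setup: the model name, layer index, SAE dictionary size, the randomly initialized encoder weights, and the toy Java snippets are all placeholder assumptions; in practice the SAE encoder would be pretrained on the LLM's activations and the probe fit on a real labeled bug corpus.

```python
# Sketch only: frozen LLM activations -> SAE features -> lightweight probe.
# MODEL_NAME, LAYER, D_SAE, the random SAE weights, and the toy examples
# below are illustrative assumptions, not the paper's configuration.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "microsoft/codebert-base"  # placeholder code model with hidden states
LAYER = 8                               # placeholder: which layer's activations to read
D_SAE = 4096                            # placeholder SAE dictionary size

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True).eval()

d_model = model.config.hidden_size
# In a real pipeline W_enc/b_enc come from an SAE trained on the model's
# activations; random weights here only keep the sketch self-contained.
W_enc = torch.randn(d_model, D_SAE) / d_model**0.5
b_enc = torch.zeros(D_SAE)

@torch.no_grad()
def sae_features(java_source: str) -> torch.Tensor:
    """Mean-pooled sparse features for one Java function (no fine-tuning)."""
    inputs = tokenizer(java_source, return_tensors="pt", truncation=True)
    hidden = model(**inputs).hidden_states[LAYER][0]  # (tokens, d_model)
    feats = torch.relu(hidden @ W_enc + b_enc)        # sparse SAE activations
    return feats.mean(dim=0)                          # pool over tokens

# Toy labeled corpus: 1 = buggy, 0 = correct (illustrative strings only).
functions = [
    ("int get(int[] a, int i) { return a[i + 1]; }", 1),  # off-by-one read
    ("int get(int[] a, int i) { return a[i]; }", 0),
]
X = torch.stack([src_feats for src_feats in
                 (sae_features(src) for src, _ in functions)]).numpy()
y = [label for _, label in functions]

# The LLM stays frozen throughout; only this small probe is trained.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print(probe.predict(X))
```

The key design point the sketch mirrors is that all gradient-based training is confined to the small probe (and, in the real pipeline, the SAE), so the pretrained LLM itself is never fine-tuned for the task.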