Large Language Models (LLMs) have advanced the state of the art across NLP tasks, but hallucinations remain a challenge in high-stakes domains such as healthcare and law.
Retrieval-Augmented Generation (RAG) aims to reduce hallucinations by grounding LLM responses in retrieved documents.
LettuceDetect, built on ModernBERT, detects hallucinations in RAG applications efficiently and outperforms older BERT-based models.
RAGTruth is a benchmark for evaluating hallucination detection in RAG settings, providing annotated examples and spans.
LettuceDetect frames hallucination detection as token-level classification, achieving competitive performance at lower computational cost.
The models are trained on the RAGTruth dataset; at inference time they classify individual tokens and report detected hallucinations at both the token level and the span level.
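To illustrate how token-level predictions relate to span-level output, here is a minimal sketch that merges consecutive flagged tokens into character spans. It is not LettuceDetect's own code: the function name `tokens_to_spans`, the label convention (1 = hallucinated, 0 = supported), and the hand-written character offsets are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


def tokens_to_spans(
    text: str,
    offsets: List[Tuple[int, int]],
    labels: List[int],
) -> List[Dict[str, object]]:
    """Merge runs of consecutive tokens labeled 1 into character spans.

    Assumed convention (hypothetical): label 1 = hallucinated, 0 = supported.
    `offsets` holds one (start, end) character range per token.
    """
    spans: List[Dict[str, object]] = []
    run_start: Optional[int] = None  # start of the current flagged run, if any
    run_end = 0
    for (start, end), label in zip(offsets, labels):
        if label == 1:
            if run_start is None:
                run_start = start  # open a new run
            run_end = end  # extend the run to cover this token
        elif run_start is not None:
            # A supported token closes the current run.
            spans.append(
                {"start": run_start, "end": run_end, "text": text[run_start:run_end]}
            )
            run_start = None
    if run_start is not None:
        # Close a run that reaches the end of the answer.
        spans.append(
            {"start": run_start, "end": run_end, "text": text[run_start:run_end]}
        )
    return spans


# Toy answer with hand-written token offsets; only "Berlin" is flagged.
answer = "The capital of France is Berlin."
offsets = [(0, 3), (4, 11), (12, 14), (15, 21), (22, 24), (25, 31), (31, 32)]
labels = [0, 0, 0, 0, 0, 1, 0]
print(tokens_to_spans(answer, offsets, labels))
# → [{'start': 25, 'end': 31, 'text': 'Berlin'}]
```

In a real pipeline the offsets would come from the tokenizer and the labels from the classifier's per-token predictions; both are hard-coded here to keep the sketch self-contained.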
LettuceDetect demonstrates strong performance in hallucination detection, surpassing other models and achieving state-of-the-art span-level results.
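The summary does not say how span-level results are scored. One plausible scheme, assumed here purely for illustration, is character-overlap precision, recall, and F1 between predicted and annotated spans. The function name `span_char_f1` and the example spans are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def _chars(spans: List[Span]) -> Set[int]:
    """Expand spans into the set of character positions they cover."""
    covered: Set[int] = set()
    for start, end in spans:
        covered.update(range(start, end))
    return covered


def span_char_f1(pred: List[Span], gold: List[Span]) -> Tuple[float, float, float]:
    """Character-overlap precision, recall, and F1 (an assumed metric)."""
    pred_chars = _chars(pred)
    gold_chars = _chars(gold)
    overlap = len(pred_chars & gold_chars)
    precision = overlap / len(pred_chars) if pred_chars else 0.0
    recall = overlap / len(gold_chars) if gold_chars else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# The prediction covers 6 of the 7 annotated characters and nothing else.
precision, recall, f1 = span_char_f1(pred=[(25, 31)], gold=[(25, 32)])
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# → precision=1.000 recall=0.857 f1=0.923
```

Scoring by character overlap rewards partially correct spans, whereas exact-match scoring would count the prediction above as a miss.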
The models are efficient, processing 30 to 60 examples per second on a single NVIDIA A100 GPU, which makes them suitable for real-time and resource-constrained environments.
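As a back-of-the-envelope reading of that throughput figure, the sketch below converts 30 and 60 examples per second into an average time per example and a total time for a batch. The batch size of 10,000 is an arbitrary illustration, and the per-example figure is an amortized average, not a measured single-request latency.

```python
def seconds_for(n_examples: int, examples_per_second: float) -> float:
    """Total processing time at a constant throughput."""
    return n_examples / examples_per_second


# The two ends of the reported throughput range.
for rate in (30, 60):
    per_example_ms = 1000 / rate  # amortized average, not request latency
    total_s = seconds_for(10_000, rate)
    print(f"{rate} ex/s: {per_example_ms:.1f} ms per example, "
          f"{total_s:.0f} s for 10,000 examples")
# → 30 ex/s: 33.3 ms per example, 333 s for 10,000 examples
# → 60 ex/s: 16.7 ms per example, 167 s for 10,000 examples
```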
Overall, LettuceDetect offers accurate hallucination detection with lean, purpose-built encoder-based models for RAG systems.
The framework provides a foundation for future research, including extending to new datasets and languages and exploring more advanced architectures.