Large language models (LLMs) can generate fluent but factually incorrect outputs, known as hallucinations, which undermine their reliability in real-world applications.
We present a framework that decomposes LLM uncertainty into four distinct sources. Building on this decomposition, we develop a source-specific estimation pipeline that quantifies each uncertainty type across tasks and models.
Experiments show that the proposed uncertainty-aware selection strategy consistently outperforms baselines at choosing appropriate models and uncertainty metrics, enabling more reliable deployment.
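The abstract does not detail the selection procedure, so the following is only a minimal illustrative sketch of generic uncertainty-aware model selection, not the paper's pipeline: it assumes each candidate model returns an answer with per-token log-probabilities, scores candidates with a simple uncertainty proxy (mean token negative log-likelihood), and keeps the least uncertain one. The function names (`select_by_uncertainty`, `mean_token_nll`) and the proxy itself are placeholder assumptions.

```python
# Illustrative sketch only: uncertainty-aware selection among candidate models
# using a simple proxy (mean token negative log-likelihood); lower = more confident.
import math
from typing import Dict, List, Tuple


def mean_token_nll(token_logprobs: List[float]) -> float:
    """Average negative log-likelihood over generated tokens."""
    return -sum(token_logprobs) / max(len(token_logprobs), 1)


def select_by_uncertainty(
    candidates: Dict[str, Tuple[str, List[float]]]
) -> Tuple[str, str, float]:
    """Return (model_name, answer, uncertainty) for the least uncertain candidate."""
    scored = {
        name: (answer, mean_token_nll(logprobs))
        for name, (answer, logprobs) in candidates.items()
    }
    best = min(scored, key=lambda name: scored[name][1])
    answer, uncertainty = scored[best]
    return best, answer, uncertainty


if __name__ == "__main__":
    # Toy per-token log-probabilities standing in for real model outputs.
    candidates = {
        "model_a": ("Paris", [math.log(0.9), math.log(0.8)]),
        "model_b": ("Lyon", [math.log(0.4), math.log(0.3)]),
    }
    print(select_by_uncertainty(candidates))  # model_a wins: lower mean NLL
```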