Large language models (LLMs) are often inconsistent and unreliable: their outputs can shift due to hallucinations and small prompt perturbations. Several approaches have been proposed to address these inconsistencies, one of which is to measure the consistency of LLM responses directly. However, existing methods for measuring LLM consistency may not align well with human perceptions of consistency. A new logit-based ensemble method has been proposed to estimate LLM consistency, and its estimates show promising agreement with human evaluations.
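As a rough illustration only, and not the specific method described above, the sketch below shows one way a logit-derived signal could be aggregated across several sampled responses to the same prompt to produce a consistency estimate. The mean-token-probability confidence measure, the function names, and the averaging scheme are all illustrative assumptions.

```python
import numpy as np

def response_confidence(token_logprobs: np.ndarray) -> float:
    """Mean token probability of one sampled response (a simple logprob-derived signal)."""
    return float(np.mean(np.exp(token_logprobs)))

def ensemble_consistency(samples_token_logprobs: list[np.ndarray]) -> tuple[float, float]:
    """Hypothetical logit-based ensemble score: average per-response confidence
    across multiple samples of the same prompt; the spread indicates how much
    the model's confidence varies between samples."""
    scores = [response_confidence(lp) for lp in samples_token_logprobs]
    return float(np.mean(scores)), float(np.std(scores))

# Toy example: token log-probabilities for three sampled responses to one prompt
samples = [
    np.array([-0.1, -0.3, -0.2]),
    np.array([-0.5, -0.4, -0.6]),
    np.array([-0.2, -0.1, -0.3]),
]
mean_score, spread = ensemble_consistency(samples)
print(f"consistency estimate: {mean_score:.3f} (spread {spread:.3f})")
```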