Large language models (LLMs) can exhibit biases, but they tend to give less biased answers in a multi-turn conversation in which they can observe their own prior answers to the same question.
Researchers tested LLMs on question sets spanning several categories and found that the models can 'de-bias' themselves on questions that ask for random, unbiased answers.
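To make the setup concrete, the sketch below contrasts the two querying modes: independent single-turn queries versus repeating the same question inside one conversation so the model can see its earlier answers. The question text, the `ask` helper, and the biased dummy sampler are illustrative placeholders rather than the researchers' actual harness; `ask` would be replaced with a call to the LLM under test.

```python
# Sketch of the two probing setups: single-turn vs. multi-turn querying.
# `ask` is a hypothetical stand-in for a real LLM call.
import random
from collections import Counter

CHOICES = [str(d) for d in range(10)]
QUESTION = "Pick a random digit between 0 and 9. Answer with the digit only."

def ask(messages):
    """Placeholder LLM call: takes a chat history, returns one answer.
    This dummy ignores the history and samples a biased distribution;
    replace it with a real model to observe any de-biasing effect."""
    return random.choices(CHOICES, weights=[1, 1, 1, 1, 1, 1, 1, 8, 1, 1])[0]

def single_turn_answers(n=30):
    """Query the model n times in independent, single-turn conversations."""
    return [ask([{"role": "user", "content": QUESTION}]) for _ in range(n)]

def multi_turn_answers(n=30):
    """Ask the same question n times in ONE conversation, so the model
    sees its own prior answers before answering again."""
    history, answers = [], []
    for _ in range(n):
        history.append({"role": "user", "content": QUESTION})
        answer = ask(history)
        history.append({"role": "assistant", "content": answer})
        answers.append(answer)
    return answers

print("single-turn:", Counter(single_turn_answers()))
print("multi-turn: ", Counter(multi_turn_answers()))
```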
A new metric called B-score has been proposed to detect bias in answers to subjective, random, easy, and hard questions, improving the accuracy with which LLM answers can be verified.
The B-score metric showed significant improvements in verifying LLM answers compared to using verbalized confidence scores or single-turn answers alone.
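For illustration, here is a minimal way such a metric could be computed from the two answer histories above: score an answer by how much more often it appears across single-turn queries than within the multi-turn conversation. The difference-of-frequencies form, the helper names, and the toy numbers are assumptions for this sketch, not necessarily the paper's exact definition.

```python
# Minimal sketch of a difference-of-frequencies bias score (assumed form:
# single-turn frequency minus multi-turn frequency of the same answer).
from collections import Counter

def frequencies(answers):
    """Map each answer to its relative frequency in a list of answers."""
    counts = Counter(answers)
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}

def b_score(answer, single_turn, multi_turn):
    """Score an answer by how much more often it appears in independent
    single-turn queries than in one multi-turn conversation."""
    p_single = frequencies(single_turn).get(answer, 0.0)
    p_multi = frequencies(multi_turn).get(answer, 0.0)
    return p_single - p_multi

# Toy example: "7" dominates single-turn sampling but not the multi-turn run,
# so it gets a high score, flagging it as a likely biased default answer.
single = ["7"] * 8 + ["3", "1"]
multi = ["7", "3", "1", "9", "0", "4", "2", "8", "5", "6"]
print(b_score("7", single, multi))   # ~0.7 -> suspiciously biased answer
print(b_score("3", single, multi))   # 0.0  -> no bias signal
```

In a verification setting, one would presumably accept answers whose score stays near zero and flag high-scoring ones, which is where the comparison against verbalized confidence scores comes in.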