techminis

A naukri.com initiative


Image Credit: Arxiv

HypoEval: Hypothesis-Guided Evaluation for Natural Language Generation

  • Large language models (LLMs) have demonstrated great potential for automating the evaluation of natural language generation.
  • Previous LLM-as-a-judge frameworks fall short in two ways: they either use a zero-shot setting without consulting any human input, which leads to low alignment with human judgments, or they fine-tune LLMs on labeled data, which requires a non-trivial number of samples.
  • In this paper, the authors propose HypoEval, a Hypothesis-guided Evaluation framework that takes a checklist-like approach: the LLM's scores on each decomposed evaluation dimension are combined into an overall score.
  • With only 30 human evaluations, HypoEval achieves state-of-the-art alignment with both human rankings and human scores, outperforming previous methods.
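The checklist-style combination described above can be sketched in a few lines. Note that the dimension names, the `score_dimension` stub, and the plain-average combination rule are illustrative assumptions for this sketch, not the paper's exact procedure (a real system would prompt an LLM judge for each dimension).

```python
def score_dimension(text: str, dimension: str) -> float:
    """Stand-in for an LLM judge that rates `text` on one
    decomposed dimension, returning a score in [1, 5].
    A real system would issue an LLM prompt here; this stub
    returns fixed values so the sketch is runnable."""
    rubric = {"coherence": 4.0, "fluency": 5.0, "relevance": 3.0}
    return rubric.get(dimension, 3.0)

def checklist_score(text: str, dimensions: list[str]) -> float:
    """Score each decomposed dimension separately, then combine
    the per-dimension scores into one overall score (here a
    simple average, one possible combination rule)."""
    scores = [score_dimension(text, d) for d in dimensions]
    return sum(scores) / len(scores)

overall = checklist_score(
    "A generated summary.", ["coherence", "fluency", "relevance"]
)
print(round(overall, 2))  # average of 4.0, 5.0, 3.0 -> 4.0
```

Decomposing the rubric this way is what lets a small number of human evaluations (30 in the paper) guide the judge: the human input shapes the hypothesis-derived dimensions rather than fine-tuning the model itself.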
