A technical paper titled 'FVEval: Understanding Language Model Capabilities in Formal Verification of Digital Hardware' has been published by researchers at UC Berkeley and NVIDIA.
The paper introduces FVEval, a benchmark and evaluation framework for characterizing the performance of large language models (LLMs) in formal verification (FV) tasks related to digital chip design.
FVEval comprises three sub-tasks that measure LLM capabilities at different levels: generating SystemVerilog assertions (SVAs), reasoning about design RTL, and suggesting assertions without human input.
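For context, a SystemVerilog assertion expresses a temporal property that a formal tool checks against the design. A minimal, generic example (illustrative only, not drawn from the benchmark) of the kind of assertion an LLM would be asked to generate:

```systemverilog
// Hypothetical property: every request must be granted within 1 to 3 cycles
assert property (@(posedge clk) disable iff (!rst_n)
  req |-> ##[1:3] gnt);
```

Here `req`, `gnt`, `clk`, and `rst_n` are assumed signal names; the benchmark tasks involve producing such properties from natural-language intent or directly from the RTL.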
The paper evaluates a wide range of existing LLMs against FVEval, offering insights into the current state of LLM performance on these tasks and the potential of LLMs to improve digital FV productivity.