Arxiv

VeriContaminated: Assessing LLM-Driven Verilog Coding for Data Contamination

  • Concerns about data contamination in LLM-driven Verilog coding raise questions about evaluation validity and industrial adoption.
  • The risks of data contamination in LLM-based hardware coding have received limited attention.
  • The study presents the first analysis of the Verilog code generation evaluation frameworks VerilogEval and RTLLM for contamination, using the CCD and Min-K% Prob detection methods.
  • The evaluation covers commercial and open-source LLMs (CodeGen2.5, Minitron 4B, Mistral 7B, Phi-4 mini, LLaMA-{1,2,3.1}, GPT-{2,3.5,4o}, DeepSeek-Coder, and CodeQwen 1.5), both as baseline models and as fine-tuned models (RTLCoder and VeriGen).
  • Findings confirm data contamination as a critical concern in Verilog code generation.
  • The analysis also explores mitigations and the trade-offs they involve between code quality and fairness, with the aim of unbiased benchmarking.
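
To make the Min-K% Prob method named above more concrete, here is a minimal sketch of its scoring step as it is commonly described: take the k% of tokens in a sample that the model found least likely and average their log-probabilities. A higher average suggests the model may have seen the text during training. This is an illustration, not the study's implementation. The function names, the default `k`, and the `threshold` value are assumptions chosen for this example, and the per-token log-probabilities are assumed to have been obtained from the model separately.

```python
import math


def min_k_percent_prob(token_logprobs, k=0.2):
    """Average log-probability of the k fraction of tokens with the lowest log-probabilities.

    token_logprobs: per-token log-probabilities for one benchmark sample
                    (assumed to be computed elsewhere by the model under test).
    k: fraction of tokens to keep, in (0, 1]; 0.2 is an illustrative default.
    """
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    if not 0 < k <= 1:
        raise ValueError("k must be in (0, 1]")
    # Keep at least one token so short samples still get a score.
    n = max(1, math.ceil(len(token_logprobs) * k))
    lowest = sorted(token_logprobs)[:n]
    return sum(lowest) / n


def looks_contaminated(token_logprobs, k=0.2, threshold=-2.0):
    """Flag a sample whose Min-K% score exceeds a threshold.

    The threshold here is a hypothetical placeholder; in practice it has to be
    calibrated per model and dataset.
    """
    return min_k_percent_prob(token_logprobs, k) > threshold


if __name__ == "__main__":
    # Hypothetical log-probabilities for two short samples.
    familiar = [-0.1, -0.2, -0.3, -0.4]
    unfamiliar = [-0.1, -0.2, -3.0, -4.0]
    print(min_k_percent_prob(familiar, k=0.5), looks_contaminated(familiar, k=0.5))
    print(min_k_percent_prob(unfamiliar, k=0.5), looks_contaminated(unfamiliar, k=0.5))
```

Scoring only the least likely tokens is the point of the design: text a model has not seen tends to contain a few very surprising tokens, which pull this average down far more than they would pull down the mean over all tokens.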
