Large language models (LLMs) can generate malicious code, posing risks to the assessment infrastructure used to evaluate their outputs.
SandboxEval is a test suite designed to evaluate the security and confidentiality of test environments for LLM-generated code.
The suite focuses on vulnerabilities related to sensitive information exposure, filesystem manipulation, external communication, and other dangerous operations.
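For illustration, the sketch below shows probes in the spirit of these categories: checking whether a sandbox leaks host environment variables, allows writes outside a scratch directory, or permits outbound network connections. The function names, target paths, and reporting format are illustrative assumptions for this sketch, not SandboxEval's actual test cases.

```python
# Hypothetical probes illustrating the vulnerability categories above.
# Names, paths, and output format are assumptions, not the suite's API.
import os
import socket


def probe_sensitive_info() -> bool:
    """Return True if the sandbox exposes host environment variables."""
    secrets = ("AWS_SECRET_ACCESS_KEY", "SSH_AUTH_SOCK")  # example variable names
    return any(os.environ.get(name) for name in secrets)


def probe_filesystem_manipulation() -> bool:
    """Return True if the sandbox allows writing outside a scratch directory."""
    probe_path = "/etc/sandbox_probe"  # a path a confined process should not reach
    try:
        with open(probe_path, "w") as f:
            f.write("probe")
        os.remove(probe_path)
        return True
    except OSError:
        return False


def probe_external_communication(host: str = "example.com", port: int = 80) -> bool:
    """Return True if the sandbox permits outbound TCP connections."""
    try:
        with socket.create_connection((host, port), timeout=3):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    results = {
        "sensitive information exposure": probe_sensitive_info(),
        "filesystem manipulation": probe_filesystem_manipulation(),
        "external communication": probe_external_communication(),
    }
    for category, vulnerable in results.items():
        print(f"{category}: {'VULNERABLE' if vulnerable else 'ok'}")
```

Run inside the sandbox under evaluation, each probe flags a category in which hostile code could cause harm if it succeeds.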
By deploying SandboxEval, developers can identify weaknesses in their assessment infrastructure and better understand the risks of executing LLM-generated code.