A new benchmark called SOSBench has been introduced to assess the safety alignment of large language models (LLMs) in handling high-risk scientific domains.
SOSBench consists of 3,000 prompts derived from real-world regulations and laws in six hazardous scientific fields: chemistry, biology, medicine, pharmacology, physics, and psychology.
Evaluation of advanced models using SOSBench revealed alarming rates of harmful responses, indicating deficiencies in safety alignment and raising concerns about the responsible deployment of powerful LLMs.