<ul><li>OpenAI has released HealthBench, a benchmark for evaluating how AI models perform on realistic, complex health-related conversations.</li><li>Hospitals can use HealthBench to validate AI models for patient triage and clinical workflows by scoring model responses against physician-written grading criteria.</li><li>HealthBench helps hospitals avoid adopting models that were fine-tuned on limited datasets or optimized for marketing demos, making gaps in safety and generalizability easier to spot before deployment.</li><li>Incorporating HealthBench into governance workflows lets hospitals assess AI model quality, set minimum acceptable scores, and increase accountability in AI adoption.</li></ul>

Benchmarking AI in Hospitals: How HealthBench Can Actually Help
