DynamicBench is a new benchmark that evaluates the ability of large language models (LLMs) to store and process up-to-the-minute information, a prerequisite for real-time applications.
The benchmark is built on a dual-path retrieval pipeline that combines web search with a local report database, and its tasks demand domain-specific knowledge for accurate answers in specialized fields.
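To make the dual-path idea concrete, here is a minimal sketch of how evidence from the two paths could be gathered and merged. The function names (`search_web`, `search_report_db`, `dual_path_retrieve`) and the simple interleaving policy are illustrative assumptions, not the paper's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Document:
    source: str  # "web" or "report_db"
    text: str

# Hypothetical stand-ins: the paper does not publish its retrieval APIs,
# so these stubs only illustrate the two retrieval paths.
def search_web(query: str, k: int = 3) -> list[Document]:
    # In practice this would call a live search API for fresh results.
    return [Document("web", f"web hit {i} for {query!r}") for i in range(k)]

def search_report_db(query: str, k: int = 3) -> list[Document]:
    # In practice this would query a curated local report database.
    return [Document("report_db", f"report {i} for {query!r}") for i in range(k)]

def dual_path_retrieve(query: str, k: int = 3) -> list[Document]:
    """Combine fresh web evidence with curated domain reports."""
    web_docs = search_web(query, k)
    report_docs = search_report_db(query, k)
    merged: list[Document] = []
    # Simple interleave; a real pipeline would rerank by relevance and recency.
    for web_doc, report_doc in zip(web_docs, report_docs):
        merged.extend((web_doc, report_doc))
    return merged

if __name__ == "__main__":
    for doc in dual_path_retrieve("latest earnings guidance"):
        print(doc.source, "->", doc.text)
```

A production pipeline would additionally deduplicate and rerank the merged results, but the core design point survives in the sketch: web search supplies freshness while the report database supplies curated domain depth.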
DynamicBench assesses LLMs in scenarios with and without external documents, measuring whether they can process recent information on their own or make effective use of supplied context.
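The difference between the two evaluation scenarios can be pictured as a difference in prompt construction; the following sketch assumes a hypothetical `build_prompt` helper and a generic template, since the paper's exact prompt format is not given here:

```python
def build_prompt(question: str, documents: list[str] | None = None) -> str:
    # Document-free mode: the model must answer from its own knowledge.
    if not documents:
        return question
    # Document-assisted mode: retrieved context is prepended to the question.
    context = "\n\n".join(documents)
    return f"Answer using the documents below.\n\n{context}\n\nQuestion: {question}"

# Document-free evaluation
print(build_prompt("Who won yesterday's match?"))
# Document-assisted evaluation
print(build_prompt("Who won yesterday's match?", ["Report: Team A won 2-1."]))
```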
In experiments on DynamicBench, the accompanying report generation system, which manages dynamic information synthesis, outperforms GPT-4o by 7.0% in document-free scenarios and 5.8% in document-assisted scenarios.