Researchers at IBM and Hugging Face have released SmolDocling, a 256M-parameter open-source vision-language model (VLM) for document OCR. SmolDocling offers a streamlined, end-to-end approach to multimodal document conversion, processing an entire page with a single model. It uses a universal markup format called DocTags to capture page elements and their structure, and it performs well on benchmark tests. SmolDocling handles diverse document elements and produces comprehensive structured metadata for improved usability.
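
To give a feel for what a markup that captures page elements and structure could look like to downstream code, here is a minimal Python sketch. It assumes a simplified, hypothetical tag syntax in which each element is wrapped in a type tag and starts with four location tokens. The sample string, the `Element` class, and the `parse_elements` helper are illustrative assumptions, not the actual DocTags specification or any SmolDocling API.

```python
import re
from dataclasses import dataclass

# Hypothetical, simplified markup for illustration only.
# It is NOT the official DocTags grammar; tag names and the
# four <loc_N> tokens per element are assumptions.
SAMPLE = (
    "<title><loc_10><loc_12><loc_200><loc_30>Quarterly Report</title>\n"
    "<text><loc_10><loc_40><loc_200><loc_90>Revenue grew this quarter.</text>\n"
)


@dataclass
class Element:
    kind: str      # element type, e.g. "title" or "text"
    bbox: tuple    # (x1, y1, x2, y2) taken from the location tokens
    content: str   # text inside the element


# One element: opening tag, four location tokens, content, matching closing tag.
PATTERN = re.compile(
    r"<(?P<kind>[a-z_]+)>"
    r"<loc_(\d+)><loc_(\d+)><loc_(\d+)><loc_(\d+)>"
    r"(?P<content>.*?)"
    r"</(?P=kind)>",
    re.DOTALL,
)


def parse_elements(markup: str) -> list:
    """Extract typed, located elements from the simplified markup."""
    elements = []
    for match in PATTERN.finditer(markup):
        # Groups 2..5 are the four location tokens (group 1 is `kind`).
        bbox = tuple(int(match.group(i)) for i in range(2, 6))
        elements.append(
            Element(match.group("kind"), bbox, match.group("content").strip())
        )
    return elements


if __name__ == "__main__":
    for element in parse_elements(SAMPLE):
        print(element.kind, element.bbox, element.content)
```

The point of the sketch is only that a single text stream can carry element type, position, and content together, which is what lets one model emit a whole page in one pass. Real DocTags output should be parsed with the tooling that accompanies the model.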