<ul>
<li>Diffusion models now generate images that closely follow textual prompts, which raises safety concerns.</li>
<li>A novel jailbreak prompts text-to-image (T2I) models to produce unsafe content by combining images with texts that are individually safe.</li>
<li>A dataset was created to test diffusion-based T2I models under this jailbreak.</li>
<li>Nine T2I models, including commercial ones, were evaluated and showed a tendency to produce unsafe content.</li>
<li>Unsafe generation rates ranged from 10% to 70%, with DALLE 3 performing notably poorly.</li>
<li>Common safeguards, such as keyword blocklists and NSFW image filters, were ineffective against this jailbreak: filters that inspect only a single modality failed to prevent unsafe generations (see the sketch after this list).</li>
<li>The study examines text rendering capability and training data as possible causes of such jailbreaks.</li>
<li>The work lays a foundation for improving the security and reliability of T2I models.</li>
<li>Project page: https://multimodalpragmatic.github.io/</li>
</ul>
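To illustrate why single-modality filters miss this jailbreak, here is a minimal sketch of a text-side keyword blocklist. The blocklist entries, prompt wording, and helper name are hypothetical and not from the paper; the point is only that each input component is benign in isolation, so a filter that inspects one modality never sees a blocked term.

```python
# Hypothetical single-modality keyword blocklist (illustrative only;
# the terms and prompt below are assumptions, not from the paper).
UNSAFE_TERMS = {"gore", "nudity"}

def passes_keyword_filter(text_prompt: str) -> bool:
    """Single-modality check: inspects only the text prompt."""
    words = text_prompt.lower().split()
    return not any(term in words for term in UNSAFE_TERMS)

# Each component is safe on its own, so the filter approves the prompt,
# even though the rendered visual text and the image together may
# combine into an unsafe message.
visual_text = "an innocuous caption"       # safe on its own
scene = "a photo of a street"              # safe on its own
prompt = f'{scene} with a sign reading "{visual_text}"'
assert passes_keyword_filter(prompt)       # no blocked term is ever seen
```

An NSFW image filter fails symmetrically: it scores the visual content alone and cannot account for meaning that only emerges when the image is read together with the text rendered inside it.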