Masked Diffusion Models (MDMs) are powerful tools for generating discrete data: they start from a fully masked sequence and gradually unmask tokens over time. Their sampling is inefficient, however, because many reverse steps reveal no new token, leaving the sequence unchanged while the computation for those steps is wasted.
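A minimal simulation makes the idle-step problem concrete. It assumes a linear masking schedule; the sequence length L, step count T, and the schedule itself are illustrative choices, not the paper's exact setup.

```python
import numpy as np

# Minimal sketch of MDM reverse sampling under a linear masking schedule
# (illustrative, not the paper's exact setup). A step is "idle" when no
# token is unmasked, so the expensive denoiser call changes nothing.
rng = np.random.default_rng(0)
L, T = 64, 1024                    # sequence length, sampling steps
masked = np.ones(L, dtype=bool)    # generation starts fully masked
idle = 0
for i in range(T, 0, -1):
    t, s = i / T, (i - 1) / T      # current and next noise levels
    # each still-masked token is revealed with probability (t - s) / t
    reveal = masked & (rng.random(L) < (t - s) / t)
    if not reveal.any():
        idle += 1
    masked &= ~reveal
print(f"idle steps: {idle}/{T} ({idle / T:.0%})")
```

With many more steps than tokens, the vast majority of steps unmask nothing, which is the wasted computation described above.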
Recent work has improved MDMs by refining training objectives, blending in autoregressive methods, and guiding sampling with energy-based models; Prime takes a different route, allowing tokens to occupy intermediate states by masking only sub-parts of their encoded form.
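A minimal sketch of the partial-masking idea, assuming each token id is expanded into ell base-b sub-tokens that are masked independently. The base b, sub-token count ell, the MASK sentinel, and the encode/partial_mask helpers are all hypothetical names for illustration, not the paper's implementation.

```python
import numpy as np

MASK = -1                      # sentinel for a masked sub-token (illustrative)
b, ell = 4, 3                  # base and sub-token count; covers 4**3 = 64 ids

def encode(token_id: int) -> list[int]:
    """Base-b encoding of a token id into ell sub-tokens."""
    digits = []
    for _ in range(ell):
        digits.append(token_id % b)
        token_id //= b
    return digits[::-1]

def partial_mask(sub_tokens: list[int], p: float, rng) -> list[int]:
    """Mask each sub-token independently with probability p."""
    return [MASK if rng.random() < p else s for s in sub_tokens]

rng = np.random.default_rng(0)
subs = encode(42)
print(subs, partial_mask(subs, 0.5, rng))  # some digits survive masking
```

Because masking acts per sub-token rather than per token, a token can be partially observed instead of all-or-nothing, which is what gives it intermediate states.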
MDM-Prime, an MDM equipped with the Prime scheme, achieves lower perplexity on text and competitive FID scores on image tasks, outperforming prior MDMs and autoregressive baselines.
Architecturally, MDM-Prime applies partial masking at the sub-token level, which yields finer-grained intermediate states, reduces idle sampling steps, and strengthens performance on both text and image generation.
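One way to see why sub-token masking gives finer-grained states is simple counting: a standard MDM token has only two observable states (masked or revealed), while ell sub-tokens over base b admit (b + 1)**ell combinations, most of them partial. A back-of-envelope check with the same illustrative b and ell as above:

```python
# Observable per-token states: standard MDM is all-or-nothing (2 states);
# sub-token masking admits (b + 1) ** ell combinations, mostly partial.
b, ell = 4, 3
print(2, (b + 1) ** ell)       # 2 vs. 125 states per token
```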