Masked diffusion models (MDMs) are generative models for discrete data in which each token is either fully masked or fully unmasked; because a token reveals no information until it is completely unmasked, the model cannot exploit partial evidence and performs redundant computation.
The proposed partial masking scheme (Prime) addresses this by allowing tokens to take intermediate states between masked and unmasked, improving the model's efficiency.
This approach lets the model condition its predictions on partially observed token information, yielding a finer-grained denoising process.
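To make the notion of intermediate token states concrete, the following is a minimal sketch rather than the paper's implementation: it assumes each token id is decomposed into base-b sub-tokens that are masked independently, so a token can be partially observed by the denoiser. The names decompose, partially_mask, and the MASK symbol are illustrative placeholders, not identifiers from the original work.

```python
import random

MASK = -1  # illustrative mask symbol; the paper's exact encoding is not specified here


def decompose(token_id: int, base: int = 4, length: int = 3) -> list[int]:
    """Split a token id into `length` base-`base` digits (sub-tokens)."""
    digits = []
    for _ in range(length):
        digits.append(token_id % base)
        token_id //= base
    return digits[::-1]


def partially_mask(sub_tokens: list[int], mask_prob: float) -> list[int]:
    """Mask each sub-token independently, producing an intermediate state
    between fully masked and fully unmasked."""
    return [MASK if random.random() < mask_prob else s for s in sub_tokens]


# Example: a token in an intermediate state exposes some of its digits,
# so the denoiser can condition on partial information about the token.
token = 42
subs = decompose(token)             # [2, 2, 2] for base 4, length 3
noisy = partially_mask(subs, 0.5)   # e.g. [2, MASK, 2] -- partially observed
print(subs, noisy)
```

Under this kind of decomposition, a token contributes information to the model as soon as any of its sub-tokens is revealed, instead of only after it is unmasked in full.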
The method achieves strong generative performance, with lower perplexity than standard MDMs on text data and competitive FID scores on image data.