Diffusion models have demonstrated excellent performance in generative modeling, but they suffer from slow sampling because generation requires many iterative denoising steps.
Discrete diffusion models in particular struggle to capture dependencies between elements: the exact joint denoising distribution over D elements with vocabulary size V has V^D outcomes, which is computationally prohibitive, so in practice the distribution is factorized dimension-wise and cross-element correlations are lost.
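For intuition about this trade-off, consider the following generic formulation (the notation is illustrative, not quoted from the paper):

```latex
% Exact joint denoising distribution over D elements with vocabulary
% size V: it has V^D outcomes, intractable for realistic D.
p_\theta\!\left(x^{1}, \dots, x^{D} \mid x_t\right)

% Standard dimension-wise factorization used in discrete diffusion:
% only D \cdot V outputs, but independent across dimensions given x_t,
% so correlations between elements are discarded.
p_\theta\!\left(x \mid x_t\right) \;=\; \prod_{d=1}^{D} p_\theta\!\left(x^{d} \mid x_t\right)
```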
To address this, the proposed method introduces 'mixture' models for discrete diffusion, which capture dimensional correlations while remaining scalable.
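To make the idea concrete, here is a minimal numpy sketch (illustrative only, not the paper's parameterization or training procedure; the toy sizes D, V, K and names such as `sample_mixture` are assumptions). It shows that a mixture of dimension-wise factorized categoricals can exhibit cross-dimension correlation, which no single factorized model can:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: D categorical dimensions, vocabulary size V, K mixture components.
D, V, K = 2, 2, 2

# A single product model p(x) = prod_d p_d(x_d) samples each dimension
# independently, so it cannot encode cross-dimension correlations.
# A mixture of K product models, p(x) = sum_k w_k * prod_d p_{k,d}(x_d),
# keeps per-component sampling cheap yet can express dependencies.
weights = rng.dirichlet(np.ones(K))                       # (K,) mixture weights
component_probs = rng.dirichlet(np.ones(V), size=(K, D))  # (K, D, V) per-dim categoricals

def sample_mixture(n):
    """Ancestral sampling: draw a component, then sample each dimension independently."""
    ks = rng.choice(K, size=n, p=weights)
    out = np.empty((n, D), dtype=int)
    for i, k in enumerate(ks):
        for d in range(D):
            out[i, d] = rng.choice(V, p=component_probs[k, d])
    return out

samples = sample_mixture(20_000)
# Any single product model has zero correlation between dimensions;
# the mixture generally does not:
print(np.corrcoef(samples[:, 0], samples[:, 1])[0, 1])
```

The design point is that each mixture component remains a product distribution, so sampling stays essentially as cheap as in the fully factorized case, up to drawing the component index.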
Experiments demonstrate that the method effectively distills pretrained discrete diffusion models in both the image and language domains.