Uniform-state discrete diffusion models, while promising for fast text generation, often lag behind autoregressive and masked diffusion models.
A new method called Duo narrows this performance gap by exploiting a duality with Gaussian diffusion: a uniform-state discrete diffusion process emerges from an underlying Gaussian diffusion.
Duo transfers curriculum learning from the Gaussian setting, using the underlying Gaussian diffusion to reduce the variance of the training loss and roughly double training speed.
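To make the duality concrete, here is a minimal sketch of sampling a discrete latent by diffusing a one-hot token in Gaussian space and then discretizing with argmax; the variance-preserving schedule, the `tau` temperature knob, and the function name are illustrative assumptions, not the paper's exact curriculum.

```python
import torch

def sample_discrete_latent(x_onehot, alpha_t, tau=1.0):
    """Diffuse a one-hot token in Gaussian space, then discretize.

    x_onehot: (batch, seq, vocab) float one-hot tokens
    alpha_t:  signal level in [0, 1] (variance-preserving schedule assumed)
    tau:      illustrative noise temperature; annealing it during training
              is one way a low-variance curriculum could be realized
    """
    sigma_t = (1.0 - alpha_t ** 2) ** 0.5
    w_t = alpha_t * x_onehot + tau * sigma_t * torch.randn_like(x_onehot)
    return w_t.argmax(dim=-1)  # uniform-state discrete latent z_t

# Usage: 2 sequences of 16 tokens over an 8-word vocabulary.
x = torch.nn.functional.one_hot(torch.randint(8, (2, 16)), 8).float()
z_t = sample_discrete_latent(x, alpha_t=0.7, tau=0.5)
```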
Models trained with this curriculum surpass autoregressive models in zero-shot perplexity on 3 of 7 standard benchmarks.
The method also adapts consistency distillation from the continuous to the discrete setting, introducing Discrete Consistency Distillation, which accelerates sampling by two orders of magnitude and enables few-step generation in diffusion language models.
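The sketch below shows a generic consistency-distillation update in PyTorch, not the paper's exact Discrete Consistency Distillation: the student at a noisy time must agree with an EMA target evaluated one deterministic teacher step closer to the data. The `ode_step` helper and the KL objective are assumptions; in Duo, the deterministic trajectory is tied to the dual Gaussian process.

```python
import torch
import torch.nn.functional as F

def consistency_distill_step(student, ema_student, ode_step, z_t, t, dt, opt):
    """One generic consistency-distillation update (illustrative sketch).

    student / ema_student: map (z, time) -> logits over clean tokens
    ode_step: hypothetical helper that moves z_t one deterministic
              teacher step from time t to s = t - dt
    """
    s = t - dt
    with torch.no_grad():
        z_s = ode_step(z_t, t, s)                 # one teacher step toward t=0
        target = ema_student(z_s, s).softmax(-1)  # less-noisy target
    pred = student(z_t, t).log_softmax(-1)
    # Consistency objective: predictions along the same trajectory agree,
    # so a few student steps can reproduce many teacher steps.
    loss = F.kl_div(pred, target, reduction="batchmean")
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```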
Code and model checkpoints for Duo are available on the project page: http://s-sahoo.github.io/duo