Diffusion models have proven highly effective at generating diverse images from text prompts.
This work introduces Divergence Minimization Preference Optimization (DMPO), a method for aligning diffusion models with human preferences by minimizing the reverse KL divergence.
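Concretely, the general form of a reverse KL objective can be written as follows (a sketch of the standard formulation, not the paper's exact loss; the notation $p_\theta$ for the fine-tuned model and $p^{*}$ for the preference-aligned target distribution is assumed here):

\[
\min_{\theta}\; D_{\mathrm{KL}}\!\left(p_{\theta} \,\|\, p^{*}\right)
= \min_{\theta}\; \mathbb{E}_{x \sim p_{\theta}}\!\left[\log \frac{p_{\theta}(x)}{p^{*}(x)}\right]
\]

Because the expectation is taken under the model's own distribution $p_\theta$, reverse KL is mode-seeking: it penalizes the model for placing probability mass where the target distribution assigns little, which makes it a natural fit for steering generations toward preferred outputs.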
Experiments show that models fine-tuned with DMPO outperform existing alignment techniques, achieving at least a 64.6% improvement in PickScore.
DMPO thus offers a principled approach to aligning the generative behavior of diffusion models with desired outputs.