Fine-tuning text-to-image diffusion models with human feedback is an effective method for aligning model behavior with human intentions.
This work proposes FiFA, a novel automated data filtering algorithm that improves the fine-tuning of diffusion models on human feedback datasets with preference optimization.
FiFA selects data based on preference margin, text quality, and text diversity, which keeps the selected samples informative and helps prevent harmful content.
Experimental results show that FiFA significantly improves training stability and achieves better performance while using less data.
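
To make the three selection criteria concrete, the following is a minimal sketch of one way they could be combined into a filter. It is an illustration under stated assumptions, not the FiFA algorithm itself: the `PreferencePair` fields, the additive score, the greedy selection loop, and the `quality_weight` and `diversity_weight` parameters are all hypothetical. The margin, text quality score, and prompt embedding are assumed to be precomputed by some external models.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class PreferencePair:
    """One human-feedback sample (all fields are illustrative assumptions)."""
    prompt: str
    margin: float                # assumed: gap in estimated reward between preferred and rejected image
    text_quality: float          # assumed: prompt quality score, higher is better
    embedding: Sequence[float]   # assumed: precomputed prompt embedding


def select_pairs(
    pairs: List[PreferencePair],
    k: int,
    quality_weight: float = 1.0,    # hypothetical trade-off weight
    diversity_weight: float = 1.0,  # hypothetical trade-off weight
) -> List[PreferencePair]:
    """Greedily pick up to k pairs with high margin, high text quality,
    and prompts that lie far from the prompts already selected."""
    selected: List[int] = []
    remaining = list(range(len(pairs)))

    while remaining and len(selected) < k:

        def score(i: int) -> float:
            pair = pairs[i]
            # Diversity term: distance to the closest already-selected prompt.
            # It is zero for the first pick, when nothing is selected yet.
            novelty = min(
                (math.dist(pair.embedding, pairs[j].embedding) for j in selected),
                default=0.0,
            )
            return (
                pair.margin
                + quality_weight * pair.text_quality
                + diversity_weight * novelty
            )

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return [pairs[i] for i in selected]


if __name__ == "__main__":
    # Toy data with made-up scores and 2-D embeddings.
    toy = [
        PreferencePair("a cat", 0.90, 0.8, (0.0, 0.0)),
        PreferencePair("a cat photo", 0.85, 0.8, (0.1, 0.0)),
        PreferencePair("a red bridge at dusk", 0.50, 0.9, (3.0, 4.0)),
        PreferencePair("bad", 0.05, 0.1, (1.0, 1.0)),
    ]
    for pair in select_pairs(toy, k=2):
        print(pair.prompt)
```

In this toy run, the near-duplicate prompt "a cat photo" is passed over in favor of a more distant prompt, even though its margin is higher. This illustrates why a diversity term matters when filtering purely by margin and quality would keep redundant samples.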