Recent advances in generative models have made it hard to differentiate between real and synthetic data.
When generative models are retrained on their own synthetic outputs, these self-consuming training loops can lead to model collapse or instability.
Curating that synthetic data according to user preferences implicitly drives models to optimize for those preferences, causing the generated distributions to converge toward the preferred regions.
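To make this self-consuming, preference-curated loop concrete, here is a minimal toy sketch (not the paper's algorithm): the "model" is a 1-D Gaussian refit each round on curated synthetic samples, and `preference_filter`, the preferred mean of 2.0, the keep fraction, and the noise rate are all hypothetical choices for the demo. The `noise` parameter crudely stands in for noisy or adversarially flipped curation decisions.

```python
import numpy as np

rng = np.random.default_rng(0)

def preference_filter(samples, preferred_mean, keep_frac=0.5, noise=0.1):
    """Keep the fraction of samples closest to the preferred mode,
    flipping each keep/discard decision with probability `noise`
    to model noisy (or adversarially manipulated) curation."""
    order = np.argsort(np.abs(samples - preferred_mean))
    keep = np.zeros(len(samples), dtype=bool)
    keep[order[: int(keep_frac * len(samples))]] = True
    keep ^= rng.random(len(samples)) < noise  # corrupt some decisions
    return samples[keep]

# The "generator" is just a Gaussian fit: (mean, std) of its training data.
real_data = rng.normal(0.0, 1.0, 2000)  # real distribution centered at 0
mu, sigma = real_data.mean(), real_data.std()

for step in range(10):  # self-consuming retraining loop
    synthetic = rng.normal(mu, sigma, 2000)   # sample from current model
    curated = preference_filter(synthetic, preferred_mean=2.0)
    mu, sigma = curated.mean(), curated.std()  # refit on curated synthetic data
    print(f"step {step}: mean={mu:.2f}, std={sigma:.2f}")
```

Running this, the fitted mean drifts from 0 toward the preferred value of 2.0 while the standard deviation shrinks, illustrating how preference-driven curation pulls the distribution toward preferred regions and how repeated self-consumption narrows it.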
This study examines the impact of noisy and adversarially curated data on generative models in such loops, proposes attack algorithms for adversarial curation scenarios, and presents experiments demonstrating the algorithms' effectiveness.