Researchers propose the SUV framework to prevent Large Language Models (LLMs) from memorizing copyrighted content while preserving the models' overall utility.
SUV constructs a dataset capturing instances of copyright infringement and uses Direct Preference Optimization (DPO) to unlearn the infringing content from the LLM.
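A minimal sketch of how DPO can be repurposed for unlearning, assuming each training pair contrasts a non-infringing continuation (chosen) with the verbatim copyrighted text (rejected); the function name `dpo_unlearn_loss` and this exact pairing scheme are illustrative assumptions, not SUV's published implementation:

```python
import torch
import torch.nn.functional as F

def dpo_unlearn_loss(
    policy_chosen_logps: torch.Tensor,    # log p_theta(non-infringing text | prompt)
    policy_rejected_logps: torch.Tensor,  # log p_theta(verbatim copyrighted text | prompt)
    ref_chosen_logps: torch.Tensor,       # same quantities under a frozen reference model
    ref_rejected_logps: torch.Tensor,
    beta: float = 0.1,                    # DPO temperature controlling update strength
) -> torch.Tensor:
    """Standard DPO objective: train the policy to prefer the non-infringing
    continuation over the verbatim one, relative to the reference model."""
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximizing this margin pushes down the likelihood of verbatim reproduction.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```

The inputs are sequence-level log-likelihoods (per-token log-probabilities summed over the response), which is how DPO losses are typically computed in practice.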
To mitigate degradation of the LLM's performance on unrelated tasks, SUV integrates gradient projection and Fisher information regularization into the unlearning procedure.
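A hedged sketch of the two utility-preserving mechanisms named above, under common interpretations: (i) projecting the unlearning gradient away from a gradient computed on unrelated "retain" data when the two conflict, and (ii) an EWC-style penalty that anchors parameters with high Fisher information under the retain distribution. The flattened-gradient interface and names such as `retain_grad` and `fisher_diag` are assumptions for illustration, not SUV's actual API:

```python
import torch

def project_out_conflict(unlearn_grad: torch.Tensor,
                         retain_grad: torch.Tensor) -> torch.Tensor:
    """Remove the component of the (flattened) unlearning gradient that opposes
    the retain-task gradient, so the update does not harm unrelated abilities."""
    dot = torch.dot(unlearn_grad, retain_grad)
    if dot < 0:  # intervene only when the two objectives actually conflict
        unlearn_grad = unlearn_grad - (dot / retain_grad.norm().pow(2)) * retain_grad
    return unlearn_grad

def fisher_penalty(params, anchor_params, fisher_diag, lam: float = 1.0):
    """Diagonal-Fisher regularizer: penalize drift from the pre-unlearning
    parameters, weighted by each parameter's estimated Fisher information."""
    return lam * sum(
        (f * (p - a).pow(2)).sum()
        for p, a, f in zip(params, anchor_params, fisher_diag)
    )
```

In this reading, the projected gradient drives the DPO unlearning step while the Fisher penalty is added to the loss, so parameters important for general capabilities move as little as possible.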
Experiments on a large-scale dataset of 500 copyrighted books demonstrate the scalability and efficacy of SUV in reducing verbatim memorization without significantly impacting performance on unrelated tasks.