Large Language Models (LLMs) have advanced rapidly in recent years, driven by large-scale architectures and training on massive datasets.
Machine unlearning algorithms have been developed to address concerns about data privacy and ownership by removing specific knowledge from models without costly retraining.
Evaluating the efficacy of unlearning algorithms remains challenging, however, owing to the scale and generative nature of LLMs.
This work introduces a comprehensive auditing framework comprising benchmark datasets, unlearning algorithms, and auditing methods to evaluate the effectiveness and robustness of different unlearning strategies.
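To make the notion of unlearning concrete, the following is a minimal sketch of one widely used baseline, gradient ascent on a "forget" set. It is a generic illustration rather than the framework described here; the toy model, data, and hyperparameters are placeholders.

```python
# Minimal sketch of a gradient-ascent unlearning baseline (illustrative only;
# not the specific algorithms evaluated in this work).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a language model: a tiny next-token classifier.
vocab_size, hidden = 32, 16
model = nn.Sequential(
    nn.Embedding(vocab_size, hidden),
    nn.Flatten(),
    nn.Linear(hidden, vocab_size),
)

# Hypothetical "forget" examples whose influence should be removed.
forget_inputs = torch.randint(0, vocab_size, (8, 1))
forget_targets = torch.randint(0, vocab_size, (8,))

optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for _ in range(5):
    optimizer.zero_grad()
    logits = model(forget_inputs)
    # Gradient *ascent* on the forget set: maximize the loss so the model
    # becomes less likely to reproduce the targeted knowledge.
    loss = -loss_fn(logits, forget_targets)
    loss.backward()
    optimizer.step()
```

An auditing method would then probe the updated model (e.g., by querying the forgotten examples) to check whether the targeted knowledge is still recoverable.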