Large language models (LLMs) often need to incorporate new knowledge not present in their pre-training data.
Retrieval-augmented generation (RAG) is the industry standard for knowledge injection, but fine-tuning has not achieved comparable success.
A new fine-tuning technique called prompt distillation is proposed that enables the model to learn new knowledge and match the performance of RAG.
Prompt distillation involves generating question-answer pairs about the new knowledge and training a student model, which does not see the new knowledge in its prompt, to mimic the output distributions of a teacher model that does; a minimal sketch of this setup follows.
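The sketch below illustrates one plausible form of the distillation objective under these assumptions: a KL divergence between the teacher's next-token distributions (conditioned on the new knowledge in its prompt) and the student's (conditioned on the question alone), computed over the answer tokens of a synthetic question-answer pair. The model name, prompt templates, temperature, and the example document are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch of a prompt-distillation training step, assuming a causal LM
# from Hugging Face Transformers. All names and hyperparameters below are
# placeholders chosen for illustration.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM would work
tokenizer = AutoTokenizer.from_pretrained(model_name)
teacher = AutoModelForCausalLM.from_pretrained(model_name).eval()
student = AutoModelForCausalLM.from_pretrained(model_name)


def distillation_loss(document: str, question: str, answer: str,
                      temperature: float = 1.0) -> torch.Tensor:
    """KL divergence between teacher (sees the document) and student
    (does not), measured on the answer tokens only."""
    teacher_prompt = f"{document}\n\nQ: {question}\nA:"
    student_prompt = f"Q: {question}\nA:"
    answer_ids = tokenizer(" " + answer, return_tensors="pt").input_ids

    def answer_logits(model, prompt):
        prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
        input_ids = torch.cat([prompt_ids, answer_ids], dim=1)
        logits = model(input_ids).logits
        # The logit at position i predicts token i + 1, so the slice that
        # predicts the answer tokens starts one position before the answer.
        start = prompt_ids.shape[1] - 1
        return logits[:, start:start + answer_ids.shape[1], :]

    with torch.no_grad():
        t_logits = answer_logits(teacher, teacher_prompt)
    s_logits = answer_logits(student, student_prompt)

    t_probs = F.softmax(t_logits / temperature, dim=-1)
    s_logprobs = F.log_softmax(s_logits / temperature, dim=-1)
    return F.kl_div(s_logprobs, t_probs, reduction="batchmean")


# One gradient step on a single synthetic question-answer pair
# about a hypothetical new fact.
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)
loss = distillation_loss(
    document="The Zephyr-9 rocket launched in March 2031.",
    question="When did the Zephyr-9 rocket launch?",
    answer="March 2031",
)
loss.backward()
optimizer.step()
```

In practice, many such question-answer pairs would be generated from the new documents and batched, but the core idea is the same: the gradient pushes the student, which never sees the document, toward the distributions the teacher produces when the document is in its prompt.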