Classifier-free guidance (CFG) for LLM text generation was developed as a simpler alternative to classifier guidance: instead of relying on a separately trained classifier, it uses the model's own conditional and unconditional score estimates.
CFG updates the predicted scores of generated LLM text in the direction of a predefined class or condition without applying any gradient-based updates.
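For reference, a minimal sketch of this update rule as it is commonly written for LLM logits; the function and tensor names here are illustrative, not taken from any particular library:

```python
import torch

def cfg_update(cond_logits: torch.Tensor,
               uncond_logits: torch.Tensor,
               guidance_scale: float) -> torch.Tensor:
    # Move the unconditional prediction toward the conditional one,
    # scaled by guidance_scale; this is pure arithmetic on scores,
    # with no gradient computation involved.
    return uncond_logits + guidance_scale * (cond_logits - uncond_logits)
```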
An alternative implementation of CFG for LLM text generation has been suggested that avoids severe degradation of generated sample quality.
The original CFG approach can introduce unexpected artefacts and degrade LLM text quality, although the artefacts depend on multiple factors, such as the model and the prompts.
The suggested alternative implementation has been shown, in both manual and automatic tests, to prevent this degradation of generated sample quality.
Examples of artefacts and quality degradation were demonstrated by testing different CFG coefficients on a GPT-2 model.
The problem arises from the logarithm component in the original CFG implementation: guidance is applied to log-probabilities, which treats probabilities unequally across their range, so tokens with very low probability can receive high scores after CFG is applied.
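A toy numeric illustration of the effect (all probabilities are invented for illustration): in log space, the gap between two tiny probabilities can dwarf the gap between two moderate ones, and guidance amplifies that gap.

```python
import math

guidance_scale = 3.0

def guided_logprob(p_cond: float, p_uncond: float) -> float:
    # Log-space interpolation, as in the original formulation.
    lc, lu = math.log(p_cond), math.log(p_uncond)
    return lu + guidance_scale * (lc - lu)

# A plausible token: moderately likely under both distributions.
good = guided_logprob(p_cond=0.40, p_uncond=0.30)   # ~ -0.34
# A junk token: rare under both, but relatively boosted by the prompt.
junk = guided_logprob(p_cond=1e-3, p_uncond=1e-6)   # ~ +6.91

print(good, junk)  # the junk token now outscores the plausible one
```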
The suggested alternative implementation removes the logarithm component, aligning text CFG with diffusion-model CFG, which operates directly on the model's predicted scores without gradients.
It introduces minimal changes to HuggingFace Transformers' UnbatchedClassifierFreeGuidanceLogitsProcessor.
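One way such a change could look, written as a standalone logits processor: the sketch below mixes raw logits directly (the way diffusion-model CFG mixes predicted noise estimates) instead of log-softmax outputs. The class name and the get_unconditional_logits helper are assumptions for illustration; the exact form of the suggested implementation may differ.

```python
import torch
from transformers import LogitsProcessor

class RawScoreCFGProcessor(LogitsProcessor):
    """Hypothetical CFG variant that interpolates raw model logits
    directly, omitting the log_softmax normalization used by the
    original processor."""

    def __init__(self, guidance_scale, get_unconditional_logits):
        # get_unconditional_logits: assumed callable that runs the model
        # on the unconditional prompt and returns next-token logits.
        self.guidance_scale = guidance_scale
        self.get_unconditional_logits = get_unconditional_logits

    def __call__(self, input_ids: torch.LongTensor,
                 scores: torch.FloatTensor) -> torch.FloatTensor:
        if self.guidance_scale == 1.0:
            return scores  # no guidance applied
        uncond = self.get_unconditional_logits(input_ids)
        # Key difference from the original: no log_softmax before
        # mixing; the raw predicted scores are interpolated directly.
        return uncond + self.guidance_scale * (scores - uncond)
```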
Compared with the original implementation, the suggested alternative improved text quality in manual tests and did not deteriorate performance in automatic tests.