A new study, "Graph-KV", introduces a method for injecting structural inductive biases into large language models.
Graph-KV leverages the KV-caches of text segments, letting them interact under structural inductive biases and improving tasks such as retrieval-augmented generation.
By attending only to designated source segments, Graph-KV induces a graph-structured block mask, sparsifying attention and enabling a message-passing-like step within the language model.
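The graph-structured block mask described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the helper name, the segment-length list, and the edge map (target segment → its source segments) are assumptions made for the example.

```python
import numpy as np

def graph_block_mask(seg_lens, edges):
    """Hypothetical helper: build a boolean attention mask in which each
    segment's tokens attend causally within their own segment and fully
    to the tokens (KV-cache) of their designated source segments only.
    seg_lens: list of token counts per segment.
    edges: dict mapping a target segment index to its source segment indices.
    """
    starts = np.cumsum([0] + seg_lens[:-1])
    n = sum(seg_lens)
    mask = np.zeros((n, n), dtype=bool)
    for t, lt in enumerate(seg_lens):
        ts, te = starts[t], starts[t] + lt
        # causal self-attention within the segment
        mask[ts:te, ts:te] = np.tril(np.ones((lt, lt), dtype=bool))
        # attend to each designated source segment's cached tokens
        for s in edges.get(t, []):
            ss, se = starts[s], starts[s] + seg_lens[s]
            mask[ts:te, ss:se] = True
    return mask
```

For example, with segments of lengths [2, 2, 3] and edges {2: [0, 1]}, segment 2 attends to segments 0 and 1, while segments 0 and 1 never see each other, yielding the sparsified, graph-shaped attention pattern.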
Evaluated across a range of benchmarks and tasks, Graph-KV outperforms baseline methods by reducing positional bias and exploiting structural inductive biases.