Models exhibit a U-shaped attention pattern, attending heavily to the beginning and end of the context while largely ignoring the middle, a behavior known as the 'donut hole' problem (also commonly described as 'lost in the middle').
Increasing the context window can cause both attention and accuracy to drop across the middle 70-80% of the prompt, effectively wasting those tokens.
Context inflation reshapes how attention is allocated and degrades the model's ability to focus, which makes task-specific prompting strategies all the more important.
Placing critical context at the beginning and end of the prompt can significantly influence the model's output and improve performance on long-context tasks.
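As a rough illustration of that placement strategy, here is a minimal Python sketch of one way to apply it when assembling a prompt from retrieved chunks: the highest-scoring chunks are pushed toward the edges of the context, and the weakest ones end up in the middle. The function name, the (text, score) input format, and the alternating placement scheme are assumptions for illustration, not something prescribed by the source.

```python
from typing import List, Tuple


def order_for_long_context(chunks: List[Tuple[str, float]]) -> List[str]:
    """Reorder (text, relevance_score) pairs so the most relevant chunks
    sit at the beginning and end of the prompt, leaving the least relevant
    ones in the middle where attention tends to drop off."""
    # Rank chunks from most to least relevant.
    ranked = sorted(chunks, key=lambda c: c[1], reverse=True)

    front, back = [], []
    # Alternate placement: 1st most relevant goes to the front, 2nd to the
    # back, 3rd to the front, and so on, so the weakest land in the middle.
    for i, (text, _score) in enumerate(ranked):
        (front if i % 2 == 0 else back).append(text)

    # Reverse the back half so relevance rises again toward the end.
    return front + back[::-1]


if __name__ == "__main__":
    retrieved = [
        ("Background detail A", 0.31),
        ("Key fact the question hinges on", 0.92),
        ("Tangential detail B", 0.18),
        ("Second most relevant passage", 0.77),
        ("Minor supporting detail C", 0.45),
    ]
    for chunk in order_for_long_context(retrieved):
        print(chunk)
```

In this sketch the two strongest chunks bracket the prompt while the lowest-scoring material falls in the middle region that models tend to under-attend; whether that ordering helps in practice should be validated on the specific model and task.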