Despite their impressive capabilities, LLMs exhibit a basic generalization failure known as the Reversal Curse: a model trained on facts of the form "A is B" fails to infer the reversed statement "B is A".
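As a toy illustration (not a real LLM, and not the method proposed here), the failure can be mimicked with a purely unidirectional associative store: associations are learned only in the subject-to-object direction that the training text states, so reversed queries find nothing.

```python
def train(facts):
    """Build a forward-only lookup from (subject, relation) to object,
    standing in for an autoregressive model that only learns the
    direction stated in training data."""
    memory = {}
    for subj, rel, obj in facts:
        memory[(subj, rel)] = obj
    return memory

def query(memory, subj, rel):
    return memory.get((subj, rel))

# Hypothetical training data, stated only in the forward direction.
facts = [("Mary Lee Pfeiffer", "is the mother of", "Tom Cruise")]
m = train(facts)

# Forward query succeeds ...
print(query(m, "Mary Lee Pfeiffer", "is the mother of"))  # Tom Cruise
# ... but the reversed query fails: "B is A" was never stored.
print(query(m, "Tom Cruise", "is the son of"))  # None
```

The fact names and relation strings are illustrative only; the point is that nothing in the forward mapping makes the reverse association retrievable.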
The Reversal Curse in LLMs can be traced to the long-standing binding problem studied in cognitive science, neuroscience, and AI.
Transformers' limited capacity for conceptual binding yields inconsistent and entangled concept representations, which in turn give rise to the Reversal Curse.
A model design based on JEPA (Joint-Embedding Predictive Architecture), augmented with memory layers that support disentangled concept representations, breaks the Reversal Curse and improves generalization.
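One intuition for why a disentangled memory helps can be sketched with a toy store (an assumption for illustration, not the architecture described above): if each fact occupies its own slot and is indexed by every concept it mentions, retrieval works from either entity, so the reversed query no longer fails.

```python
class ConceptMemory:
    """Toy memory: one slot per fact, indexed by each participating
    concept, so a fact is reachable from either direction."""
    def __init__(self):
        self.facts = []   # disentangled fact slots
        self.index = {}   # concept -> list of fact ids

    def store(self, subj, rel, obj):
        fid = len(self.facts)
        self.facts.append((subj, rel, obj))
        for concept in (subj, obj):  # index under both entities
            self.index.setdefault(concept, []).append(fid)

    def recall(self, concept):
        return [self.facts[i] for i in self.index.get(concept, [])]

mem = ConceptMemory()
mem.store("Mary Lee Pfeiffer", "is the mother of", "Tom Cruise")

# The same stored fact is retrievable from either entity:
print(mem.recall("Mary Lee Pfeiffer"))
print(mem.recall("Tom Cruise"))
```

The class and method names here are hypothetical; the sketch only conveys the contrast with a forward-only association, not how the memory layers are actually integrated into the model.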