Compressing information helps AI work better.Sequence models with effective compression of context are more efficient.Efficiency and effectiveness tradeoff of sequence models is characterized by how well they compress their state.A fundamental principle for building sequence models is selectivity.