Transformers have theoretical limitations in modeling certain sequence-to-sequence tasks. It is unclear whether these limitations carry over to large-scale pretrained language models (LLMs). Pretraining enhances some Transformer capabilities but does not overcome length-generalization limits. Empirical observations show an asymmetry in retrieval tasks, favoring induction over anti-induction.
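To make the final claim concrete, below is a minimal illustrative sketch of what an induction vs. anti-induction retrieval probe might look like, assuming "induction" means recalling the token that followed an earlier occurrence of a cue and "anti-induction" means recalling the token that preceded it. The prompt format, vocabulary, and the helper make_probe are hypothetical and not taken from the reported experiments.

```python
import random

# Hypothetical probe generator for the induction / anti-induction asymmetry.
# Assumption: the context is a list of token pairs, and the model must
# complete the prompt with the missing member of one pair.

random.seed(0)
VOCAB = [chr(c) for c in range(ord("A"), ord("Z") + 1)]

def make_probe(num_pairs: int = 5, mode: str = "induction") -> tuple[str, str]:
    """Build one retrieval prompt and its expected next token."""
    tokens = random.sample(VOCAB, 2 * num_pairs)
    pairs = [(tokens[2 * i], tokens[2 * i + 1]) for i in range(num_pairs)]
    context = " ".join(f"{a} {b}" for a, b in pairs)
    first, second = random.choice(pairs)
    if mode == "induction":
        # Cue with the first element; the model should recall what followed it.
        return f"{context} {first}", second
    if mode == "anti-induction":
        # Cue with the second element; the model should recall what preceded it.
        return f"{context} {second}", first
    raise ValueError(f"unknown mode: {mode}")

if __name__ == "__main__":
    for mode in ("induction", "anti-induction"):
        prompt, target = make_probe(mode=mode)
        print(f"{mode:>14}: prompt = '{prompt}'  expected next token = '{target}'")
```

Under this framing, the reported asymmetry would correspond to higher completion accuracy on the induction-mode prompts than on the anti-induction-mode ones.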