Reinforcement learning (RL) based fine-tuning is an important post-training step for eliciting advanced mathematical reasoning and coding abilities in language models. RL fine-tuning consistently improves performance, even in smaller-scale models, but the underlying mechanisms are not well understood. RL fine-tuning amplifies patterns already present in the pretraining data and converges toward a dominant output distribution. Moreover, RL post-training on simpler questions can yield performance gains on harder ones, indicating that the learned reasoning capabilities generalize.
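The sharpening effect described above can be illustrated with a toy sketch (our own construction, not the authors' method): a softmax "policy" over a handful of candidate answers, updated with REINFORCE. Even when several answers are equally rewarded, probability mass tends to concentrate on the mode that was already dominant under the initial (pretrained) distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 4 candidate answers; answers 0 and 1 are both
# "correct" (reward 1), but the pretrained logits favor answer 0.
logits = np.array([2.0, 1.0, 0.0, 0.0])  # stand-in for pretrained preferences
reward = np.array([1.0, 1.0, 0.0, 0.0])
lr = 0.5

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for step in range(200):
    p = softmax(logits)
    a = rng.choice(4, p=p)          # sample an answer from the policy
    baseline = p @ reward           # mean-reward baseline
    grad = -p.copy()                # d log p(a) / d logits for a softmax
    grad[a] += 1.0
    logits += lr * (reward[a] - baseline) * grad  # REINFORCE update

p = softmax(logits)
print(p.round(3))
```

After training, nearly all probability mass sits on the rewarded answers, and in practice the run collapses toward the initially dominant correct mode rather than spreading evenly over both correct answers, mirroring the convergence to a dominant output distribution.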