- Modern language models face growing scrutiny over their memorization behavior, raising the question of whether they memorize training data in a meaningful sense.
- Existing techniques, such as data-extraction attacks and privacy mechanisms, struggle to distinguish memorization from generalization.
- The researchers propose a new method for measuring model capacity that separates memorization into an unintended component and a generalization component; a compression-based sketch of this decomposition follows the list.
- They find that GPT-family language models have a capacity of roughly 3.6 bits per parameter and develop scaling laws for membership inference.
- Experiments trained GPT-2 models across a range of configurations and sizes on both synthetic and real-text datasets.
- Key findings include a capacity of 3.5 to 3.6 bits per parameter, double-descent phenomena, and an effect of numerical precision on storage capacity; a back-of-the-envelope capacity calculation appears below.
- The study disentangles memorization from generalization effects, showing that unintended memorization increases with parameter count.
- Membership inference accuracy decreases as dataset size grows, while the scaling laws remain consistent for models up to 1.5B parameters; a loss-threshold baseline is sketched at the end.
- The framework improves understanding of how transformer models encode training data and sharpens the distinction between memorization and generalization.
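The decomposition can be pictured with a compression argument: the bits a trained model saves when encoding a sample, beyond what a reference model standing in for generalization would save, count as unintended memorization. Below is a minimal sketch of that bookkeeping, assuming per-sample log-probabilities are available from both models; the function names are hypothetical, not the study's API.

```python
import math

LN2 = math.log(2)

def code_length_bits(logprob_nats: float) -> float:
    """Shannon code length in bits for a sample whose total log-probability is in nats."""
    return -logprob_nats / LN2

def unintended_memorization_bits(logp_target_nats: float, logp_reference_nats: float) -> float:
    """Bits the trained (target) model saves over a reference model on one sample.

    The reference model proxies generalization; any extra savings by the
    target model are attributed to unintended memorization.
    """
    saved = code_length_bits(logp_reference_nats) - code_length_bits(logp_target_nats)
    return max(0.0, saved)

# Toy numbers: the target assigns log p = -120 nats, the reference -150 nats.
print(unintended_memorization_bits(-120.0, -150.0))  # -> ~43.3 bits
```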
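Taken at face value, the bits-per-parameter figure supports a back-of-the-envelope capacity estimate. The arithmetic below uses the reported ~3.6 bits per parameter with standard GPT-2 parameter counts; the conversion is illustrative, not a result from the study.

```python
# Illustrative arithmetic only: raw storage implied by ~3.6 bits per parameter.
BITS_PER_PARAMETER = 3.6  # reported estimate for GPT-family models

for name, n_params in [("GPT-2 small", 124_000_000), ("GPT-2 XL", 1_500_000_000)]:
    capacity_megabytes = BITS_PER_PARAMETER * n_params / 8 / 1e6
    print(f"{name}: ~{capacity_megabytes:.0f} MB of memorization capacity")
# GPT-2 small: ~56 MB; GPT-2 XL: ~675 MB
```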
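For intuition on why membership inference weakens on larger datasets, a common baseline (not necessarily the attack evaluated in the study) flags a sample as a training member when the model's loss on it falls below a calibrated threshold; as the dataset grows, per-sample memorization is diluted and the loss gap between members and non-members shrinks. A hypothetical sketch:

```python
from typing import Sequence

def membership_accuracy(member_losses: Sequence[float],
                        nonmember_losses: Sequence[float],
                        threshold: float) -> float:
    """Accuracy of a loss-threshold membership test: training members are
    expected below the threshold, held-out non-members at or above it."""
    correct = sum(loss < threshold for loss in member_losses)
    correct += sum(loss >= threshold for loss in nonmember_losses)
    return correct / (len(member_losses) + len(nonmember_losses))

# Toy example: memorized members get lower loss than held-out samples.
print(membership_accuracy([1.8, 2.0, 2.1], [2.6, 2.4, 3.0], threshold=2.3))  # -> 1.0
```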