Training strong coding models with RL has been hindered by the lack of high-quality datasets. Existing resources such as APPS, TACO, CodeContests, KodCode, and LeetCode are not well suited to RL training. To address this, the DeepCoder team assembled a new training set of 24,000 validated problems by combining hand-curated problems from several sources. A custom filtering pipeline was used to ensure the quality of each problem in the dataset.
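
The text does not spell out what the filtering pipeline checks, so the following is only a minimal sketch of how such a filter might look, not the DeepCoder team's implementation. It assumes three illustrative criteria: each problem has a minimum number of tests, its reference solution passes all of them, and near-identical statements are deduplicated. The names Problem, MIN_TESTS, passes_all_tests, normalize, and filter_problems are hypothetical, and so is the threshold value.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Problem:
    """Hypothetical problem record used only for this sketch."""
    statement: str
    tests: Tuple[Tuple[str, str], ...]        # (input, expected output) pairs
    reference_solution: Callable[[str], str]  # maps an input to an output


MIN_TESTS = 5  # assumed threshold, chosen for illustration


def passes_all_tests(problem: Problem) -> bool:
    """Return True if the reference solution reproduces every expected output."""
    for test_input, expected in problem.tests:
        try:
            if problem.reference_solution(test_input) != expected:
                return False
        except Exception:
            # A crashing reference solution makes the problem unverifiable.
            return False
    return True


def normalize(statement: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(statement.lower().split())


def filter_problems(problems: Iterable[Problem]) -> List[Problem]:
    """Keep problems that have enough tests, verify, and are not duplicates."""
    seen = set()
    kept = []
    for problem in problems:
        if len(problem.tests) < MIN_TESTS:
            continue  # too few tests to give a reliable reward signal
        if not passes_all_tests(problem):
            continue  # reference solution and tests disagree
        key = normalize(problem.statement)
        if key in seen:
            continue  # duplicate statement already kept
        seen.add(key)
        kept.append(problem)
    return kept
```

In this sketch the cheap test-count check runs before the verification step, so problems that would be rejected anyway never execute their reference solutions.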