Codehacks is a dataset of programming problems collected from the Codeforces online judge platform. It includes 288,617 error-inducing test cases, referred to as 'hacks', for 5,578 programming problems. Each problem is accompanied by a natural language description, and the dataset also provides the source code for 2,196 submitted solutions. The dataset aims to support the data-driven creation of test suites, particularly for testing software synthesized by large language models.