The article discusses the complex issue of training AI models on copyrighted material and the implications of doing so. The authors tested an AI model's familiarity with O'Reilly books to determine whether unauthorized training had occurred. They used a statistical measure, AUROC (area under the receiver operating characteristic curve), to evaluate whether the model had access to that content in its pre-training data. The results suggest that newer AI models have more knowledge of private content than of public content. The article questions the ethics of training AI models on pirated content and advocates respect for copyright law. It highlights the importance of compensating authors and creators for their work in the AI content economy. It calls on AI companies such as OpenAI to track usage and pay royalties for the copyrighted material they use, similar to O'Reilly's own practices. The article emphasizes the need for a sustainable AI ecosystem that respects creators' rights and incentivizes content creation, and it suggests that AI companies adopt practices that support copyright preservation and fair compensation for content creators. It concludes by proposing a vision in which AI models take part in copyright conversations and negotiate appropriate compensation. Overall, the article advocates a copyright-aware approach to AI development in order to build a more ethical and sustainable content ecosystem.
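
To make the AUROC measure mentioned above more concrete, here is a minimal sketch in Python. It is not the authors' code: the summary does not describe how familiarity was scored, so the `auroc` helper, the two score lists, and their values are hypothetical and serve only to illustrate the statistic. AUROC is computed here as the probability that a randomly chosen text from the suspected-seen group receives a higher score than a randomly chosen text from the known-unseen group, with ties counted as half.

```python
from typing import Sequence


def auroc(positive_scores: Sequence[float], negative_scores: Sequence[float]) -> float:
    """Return the fraction of (positive, negative) pairs in which the
    positive score is higher, counting ties as half a win."""
    if not positive_scores or not negative_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical familiarity scores (higher = the model seems to know the text better).
# These numbers are invented for illustration and are not results from the article.
suspected_seen = [0.9, 0.8, 0.7, 0.4]  # e.g. excerpts the model may have trained on
known_unseen = [0.6, 0.3, 0.2, 0.1]    # e.g. excerpts the model cannot have seen

print(f"AUROC = {auroc(suspected_seen, known_unseen):.4f}")
```

An AUROC near 0.5 means the two groups cannot be told apart by their scores, which is what one would expect if the model had seen neither. Values approaching 1.0 mean the suspected-seen texts score consistently higher, which is the kind of signal the article treats as evidence of prior exposure.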