Watermarking AI-generated content has become a popular method for identifying the outputs of large language models (LLMs), but it is flawed and, in practice, close to useless.
The goal of watermarking is often misunderstood: it is designed to identify text produced by a specific model, not to distinguish reliably between AI-generated and human-written text.
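To make the mechanism concrete, here is a minimal sketch of the kind of statistical "green-list" watermark described in the research literature: the previous token seeds a pseudo-random split of the vocabulary, and the sampler nudges generation toward the "green" half. The toy vocabulary, the fake logits and names such as green_list, DELTA and GREEN_FRACTION are illustrative assumptions, not any vendor's actual implementation.

```python
import hashlib
import math
import random

# Toy setup: a stand-in vocabulary and random "logits" instead of a real model.
VOCAB = [f"tok{i}" for i in range(1000)]
DELTA = 2.0           # bias added to the logits of "green" tokens
GREEN_FRACTION = 0.5  # share of the vocabulary marked green at each step

def green_list(prev_token: str, key: str = "secret-key") -> set[str]:
    # Partition the vocabulary pseudo-randomly, seeded by the previous token
    # and a private key; a detector must know the same key.
    seed = int(hashlib.sha256(f"{key}|{prev_token}".encode()).hexdigest(), 16)
    return set(random.Random(seed).sample(VOCAB, int(len(VOCAB) * GREEN_FRACTION)))

def sample_next(prev_token: str, temperature: float = 1.0) -> str:
    # Pretend logits; a real system would take these from the LLM itself.
    rng = random.Random()
    logits = {t: rng.gauss(0.0, 1.0) for t in VOCAB}
    greens = green_list(prev_token)
    # Softly favour green tokens, then sample a softmax at the given temperature.
    weights = {t: math.exp((l + (DELTA if t in greens else 0.0)) / temperature)
               for t, l in logits.items()}
    return random.choices(list(weights), weights=list(weights.values()), k=1)[0]

# Generate a short watermarked sequence.
tokens = ["tok0"]
for _ in range(50):
    tokens.append(sample_next(tokens[-1]))
```

The temperature parameter also hints at the compatibility issue raised below: with greedy or very low-temperature decoding there is little randomness left for the bias to act on, so almost no signal gets embedded.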
Many companies have launched their own watermarking tools; however, research from Carnegie Mellon University shows that watermarking cannot deal with AI-driven misinformation.
Watermarking also faces several difficulties, including compatibility problems with low-temperature sampling settings, where there is little randomness for a watermark to exploit, and unreliable accuracy; it is therefore not enough to keep pace with AI or to curb the spread of AI-driven misinformation.
The problem with this approach is that only the models whose providers choose to mark them carry a watermark, and disingenuous actors will always have access to unmarked models, so a watermark is at most a minor hindrance to them.
Dominik Lukes, lead business technologist at the AI/ML support competency centre at the University of Oxford, said: “Outside a school exam, the use of an LLM is no longer a reliable indicator of fraud.”
Watermarking suffers from further practical limits, such as weak robustness and the difficulty of reliable detection, and bad actors can simply run open-source models privately, putting their output beyond the reach of API-based watermarking.
Furthermore, authors often edit the text they generate with LLMs, blurring the line between human-written and AI-written text and making the necessary distinction even harder.
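Detection in such schemes is a statistical test: count how many tokens land in the green list keyed by their predecessor and compare that count with what chance alone would produce. The self-contained sketch below (again a toy, with an edit function that crudely stands in for human revision) illustrates how replacing tokens during editing drags the score back toward the unwatermarked baseline.

```python
import hashlib
import math
import random

VOCAB = [f"tok{i}" for i in range(1000)]
GREEN_FRACTION = 0.5

def green_list(prev_token: str, key: str = "secret-key") -> set[str]:
    # Same keyed partition as in the earlier sketch.
    seed = int(hashlib.sha256(f"{key}|{prev_token}".encode()).hexdigest(), 16)
    return set(random.Random(seed).sample(VOCAB, int(len(VOCAB) * GREEN_FRACTION)))

def z_score(tokens: list[str]) -> float:
    # Count bigrams whose second token lands in the green list of the first,
    # then compare against the fraction expected from unwatermarked text.
    n = len(tokens) - 1
    hits = sum(cur in green_list(prev) for prev, cur in zip(tokens, tokens[1:]))
    return (hits - GREEN_FRACTION * n) / math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))

def edit(tokens: list[str], fraction: float) -> list[str]:
    # Crude stand-in for human editing: replace a fraction of tokens at random.
    out = list(tokens)
    for i in random.sample(range(len(out)), int(len(out) * fraction)):
        out[i] = random.choice(VOCAB)
    return out

# With a watermarked sequence `generated` (e.g. from the earlier sketch):
# print(z_score(generated), z_score(edit(generated, 0.3)))
```

In this toy, replacing roughly a third of the tokens cuts the z-score approximately in half, which mirrors why even light post-editing or paraphrasing quickly erodes detection confidence in practice.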
Even an LLM built on a model that is no longer served through an API will be able to evade AI-detection technology.
More fundamentally, even if watermarking somehow worked perfectly, it would not solve the problem: not all AI-generated text is harmful, and AI-generated and human-written text are often intertwined.