Google unveils invisible ‘watermark’ for AI-generated text

The watermark was applied to 20 million text responses generated by Google’s Gemini large language model.

Researchers at Google DeepMind in London have devised a ‘watermark’ to invisibly label text that is generated by artificial intelligence (AI), and have deployed it to millions of chatbot users.

The watermark, reported in Nature on 23 October¹, is not the first to be made for AI-generated text. Nor is it able to withstand determined attempts to remove it. But it seems to be the first at-scale, real-world demonstration of a text watermark. “To my mind, by far the most important news here is just that they’re actually deploying this,” says Scott Aaronson, a computer scientist at the University of Texas at Austin, who until August worked on watermarks at OpenAI, the creator of ChatGPT, based in San Francisco, California.

Spotting AI-written text is gaining importance as a potential solution to the problems of fake news and academic cheating, as well as a way to avoid degrading future models by training them on AI-made content.

In a massive trial spanning 20 million responses, users of Google’s Gemini large language model (LLM) rated watermarked texts as being of equal quality to unwatermarked ones. “I am excited to see Google taking this step for the tech community,” says Furong Huang, a computer scientist at the University of Maryland in College Park. “It seems likely that most commercial tools will be watermarked in the near future,” says Zakhar Shumaylov, a computer scientist at the University of Cambridge, UK.

Choice of words

It is harder to apply a watermark to text than to images, because word choice is essentially the only variable that can be altered. DeepMind’s watermark, called SynthID-Text, alters which words the model selects in a secret but formulaic way that can be detected with a cryptographic key. Compared with other approaches, DeepMind’s watermark is marginally easier to detect, and applying it does not slow down text generation. “It seems to outperform schemes of the competitors for watermarking LLMs,” says Shumaylov, who is a former collaborator and brother of one of the study’s authors.

The tool has also been made open, so developers can apply their own such watermark to their models. “We would hope that other AI-model developers pick this up and integrate it with their own systems,” says Pushmeet Kohli, a computer scientist at DeepMind. Google is keeping its own key secret, so users won’t be able to use detection tools to spot Gemini-watermarked text.

Governments are betting on watermarking as a solution to the proliferation of AI-generated text. Yet problems abound, including getting developers to commit to using watermarks and to coordinate their approaches. And earlier this year, researchers at the Swiss Federal Institute of Technology in Zurich showed that any watermark is vulnerable to ‘scrubbing’, in which it is removed, or to ‘spoofing’, in which watermarks are applied to text to give the false impression that it is AI-generated.

Token tournament

DeepMind’s approach builds on an existing method that incorporates a watermark into a sampling algorithm, a step in text generation that is separate from the LLM itself.

An LLM is a network of associations built up by training on billions of words or word-parts, known as tokens.
When given a string of text, the model assigns to each token in its vocabulary a probability of being next in the sentence. The sampling algorithm’s job is to select, from this distribution, which token to use, according to a set of rules.

The SynthID-Text sampling algorithm uses a cryptographic key to assign random scores to each possible token. Candidate tokens are pulled from the distribution, in numbers proportional to their probability, and placed in a ‘tournament’. There, the algorithm compares scores in a series of one-on-one knockouts, with the higher score winning, until only one token is left standing; that token is selected for use in the text. A simplified sketch of this sampling-and-detection process appears at the end of this article.

This elaborate scheme makes the watermark easier to detect: detection involves running the same cryptographic code on generated text to look for the high scores that are indicative of ‘winning’ tokens. It might also make the watermark more difficult to remove.

The multiple rounds in the tournament can be likened to a combination lock, in which each round represents a different digit that must be solved to unlock or remove the watermark, says Huang. “This mechanism makes it significantly more challenging to scrub, spoof or reverse-engineer the watermark,” she adds.

With text containing around 200 tokens, the authors showed that they could still detect the watermark, even when a second LLM was used to paraphrase the text. For shorter strings of text, the watermark is less robust.

The researchers did not explore how well the watermark can resist deliberate removal attempts. The resilience of watermarks to such attacks is a “massive policy question”, says Yves-Alexandre de Montjoye, a computer scientist at Imperial College London. “In the context of AI safety, it’s unclear the extent to which this is providing protection,” he says.

Kohli hopes that the watermark will start by being helpful for well-intentioned LLM use. “The guiding philosophy was that we want to build a tool that can be improved by the community,” he says.
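To make the tournament mechanics above concrete, here is a minimal Python sketch of keyed tournament sampling and the corresponding detection statistic. It is a toy under stated assumptions: the hash-based scoring function, the four-token context window, the three-round tournament and the names (g_score, tournament_sample, detection_score) are illustrative inventions, not the SynthID-Text implementation, whose key and details Google keeps private.

```python
import hashlib
import random

# Illustrative sketch only: the key, hash construction, context width and
# tournament size are assumptions, not DeepMind's actual implementation.
SECRET_KEY = b"demo-key"   # hypothetical; the real key is kept secret
CONTEXT_WIDTH = 4          # assumed sliding window of recent tokens


def g_score(token, context, key=SECRET_KEY):
    """Keyed pseudorandom score in [0, 1) for a candidate token given the
    recent context. Deterministic, so a detector holding the same key can
    recompute it on finished text."""
    data = key + "|".join(context + (token,)).encode()
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def tournament_sample(probs, prior_tokens, rounds=3, rng=random):
    """Draw 2**rounds candidate tokens in proportion to the model's
    probabilities, then run one-on-one knockouts in which the higher
    g-score wins; the last token standing is emitted."""
    context = tuple(prior_tokens[-CONTEXT_WIDTH:])
    tokens, weights = zip(*probs.items())
    field = rng.choices(tokens, weights=weights, k=2 ** rounds)
    while len(field) > 1:
        field = [a if g_score(a, context) >= g_score(b, context) else b
                 for a, b in zip(field[::2], field[1::2])]
    return field[0]


def detection_score(text_tokens):
    """Mean g-score over a text. Tournament winners skew high, so
    watermarked text averages well above the ~0.5 expected of
    unwatermarked text."""
    scores = [g_score(tok, tuple(text_tokens[max(0, i - CONTEXT_WIDTH):i]))
              for i, tok in enumerate(text_tokens)]
    return sum(scores) / len(scores)
```

On a toy distribution, text assembled with tournament_sample scores noticeably above 0.5 under detection_score, whereas ordinary sampled text hovers around 0.5. A real detector uses more careful statistics, but the asymmetry created by keyed scores is the core idea; because the candidates are still drawn in proportion to the model’s own probabilities, the text quality is largely preserved, consistent with the Gemini trial described above.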
