Google DeepMind Debuts Watermarks for AI-Generated Text

The chatbot revolution has left our world awash in AI-generated text: It has infiltrated our news feeds, term papers, and inboxes. It’s so absurdly abundant that industries have sprung up to provide moves and countermoves. Some companies offer services to identify AI-generated text by analyzing the material, while others say their tools will “humanize” your AI-generated text and make it undetectable. Both types of tools have questionable performance, and as chatbots get better and better, it will only get more difficult to tell whether words were strung together by a human or an algorithm.

Here’s another approach: Adding some sort of watermark or content credential to text from the start, which lets people easily check whether the text was AI-generated. New research from Google DeepMind, described today in the journal Nature, offers a way to do just that. The system, called SynthID-Text, doesn’t compromise “the quality, accuracy, creativity, or speed of the text generation,” says Pushmeet Kohli, vice president of research at Google DeepMind and a coauthor of the paper. But the researchers acknowledge that their system is far from foolproof, and isn’t yet available to everyone—it’s more of a demonstration than a scalable solution.

Google has already integrated this new watermarking system into its Gemini chatbot, the company announced today. It has also open-sourced the tool and made it available to developers and businesses, allowing them to use the tool to determine whether text outputs have come from their own large language models (LLMs), the AI systems that power chatbots. However, only Google and those developers currently have access to the detector that checks for the watermark. As Kohli says: “While SynthID isn’t a silver bullet for identifying AI-generated content, it is an important building block for developing more reliable AI identification tools.”
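The article doesn’t disclose how SynthID-Text actually embeds its watermark, but the general idea behind generation-time text watermarks can be illustrated with a simple “green list” scheme drawn from earlier academic work: at each step, the previous token pseudorandomly partitions the vocabulary, the generator favors tokens from the “green” half, and a detector later counts how often that bias shows up. The sketch below is a hypothetical, simplified illustration of that family of techniques, not DeepMind’s algorithm; the function names and parameters are invented for this example.

```python
import hashlib
import random


def greenlist(prev_token: str, vocab: list[str], fraction: float = 0.5) -> set[str]:
    """Pseudorandomly pick a 'green' subset of the vocabulary, seeded by the
    previous token so generator and detector derive the same partition."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16) % (2**32)
    rng = random.Random(seed)
    k = int(len(vocab) * fraction)
    return set(rng.sample(vocab, k))


def detect(tokens: list[str], vocab: list[str], fraction: float = 0.5) -> float:
    """Score text by the fraction of tokens that fall in the previous token's
    green list. Watermarked text scores well above `fraction`; unwatermarked
    text should land near it on average."""
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return 0.0
    hits = sum(1 for prev, tok in pairs if tok in greenlist(prev, vocab, fraction))
    return hits / len(pairs)


# Toy demonstration: a "watermarking" generator that always picks a green token.
vocab = [f"w{i}" for i in range(50)]
watermarked = ["w0"]
for _ in range(20):
    watermarked.append(min(greenlist(watermarked[-1], vocab)))

print(detect(watermarked, vocab))  # scores 1.0: every step chose a green token
```

A real system biases sampling probabilities rather than forcing green tokens, so quality is preserved and detection becomes a statistical test over many tokens, which is why short or heavily edited passages are harder to verify.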

The Rise of Content Credentials

Content credentials have been a hot topic for images and video, and have been viewed as one way to combat the rise of deepfakes. Tech companies and major media outlets have joined together in an initiative called C2PA, which has worked out a system for attaching encrypted metadata to image and video files indicating if they’re real or AI-generated. But text is a much harder problem, since text can so easily be altered to obscure or eliminate a watermark. While SynthID-Text isn’t the first attempt at creating a watermarking system for text, it is the first one to be tested on 20 million prompts.

Outside experts working on content credentials see the DeepMind research as a good step. It “holds promise for improving the use of durable content credentials from C2PA for documents and raw text,” says Andrew Jenks, Microsoft’s director of media provenance and executive chair of the C2PA. “This is a tough problem to solve, and it is nice to see some progress being made,” says Bruce MacCormack, a member of the C2PA steering…

The post “Google DeepMind Debuts Watermarks for AI-Generated Text” by Eliza Strickland was published on 10/23/2024 by spectrum.ieee.org