How to Tell Whether a Voice Is AI-Generated? It’s Not Simple, But There’s a Way

Clara Alex
2 min read · May 24, 2024


Photo by Jason Rosewell on Unsplash

AI-assisted or entirely AI-generated content is flooding the Internet, and content-sharing and streaming platforms suffer the most from it. Given that the fight is all but lost already, the platforms at least want to know how to control it. YouTube recently rolled out an update aimed at labelling AI-generated videos, while OpenAI’s Sora will use watermarking to help us distinguish the real from the fake.

Tools that can identify whether an image is AI-generated already exist, but what about audio and voices? Can we be 99% sure that a song we’re listening to on Spotify is actually sung by a human? The answer is yes, and Pex is one of the companies that can do just that.

🖐️ On point: Allegedly AI-Generated Metal Music Is Spotted on Spotify

But what’s the issue if a song is sung by AI — who cares? Well, voice cloning and swapping raise questions of attribution and compensation for the original creators.

AI-generated voices have penetrated the music industry for good. The “fake Drake” track, created last year by an anonymous TikToker who faked a Drake and The Weeknd collaboration, shook up the industry and inspired thousands of copycats and AI startups along the way. Some artists lend their voices to be cloned voluntarily, though. Grimes is one of them: the singer launched an entire platform that lets her fans use her voice, make their own songs with it, and even get paid for them! Other artists, less all-in but still open to the idea, collaborate with AI companies and let them use their voices to train AI models, contributing to a more ethical use of the technology.

But all of this hasn’t come without consequences. The proliferation of AI voices mimicking famous singers has created a need for reliable methods to identify such content and ensure proper licensing and credit. Watermarking and artifact detection are two of those methods. Both have their weaknesses, however.

Watermarking, while effective in theory, can be circumvented by removing or altering the watermark, which makes it unreliable. Artifact detection, although initially promising, struggles to keep pace with the rapid advances in AI technology.
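To see why naive watermarking is so fragile, here’s a minimal sketch in Python. It is not how any real watermarking product works; it just hides a secret bit pattern in the least significant bits of 16-bit audio samples and shows how a single lossy-style processing step wipes it out.

```python
# Toy illustration only: naive LSB watermarking and why it breaks.
import numpy as np

rng = np.random.default_rng(0)
sr = 22_050
# One second of a 440 Hz tone as a stand-in for a real track.
audio = (np.sin(2 * np.pi * 440 * np.arange(sr) / sr) * 20_000).astype(np.int16)

# Hide a secret 0/1 pattern in each sample's last bit.
watermark = rng.integers(0, 2, size=audio.size).astype(np.int16)
marked = (audio & ~np.int16(1)) | watermark

def detect(signal: np.ndarray, mark: np.ndarray) -> float:
    """Fraction of samples whose last bit still matches the secret pattern."""
    return float(np.mean((signal & 1) == mark))

print("before processing:", detect(marked, watermark))       # ~1.0, watermark intact
requantised = (marked >> 4) << 4                              # crude stand-in for lossy re-encoding
print("after processing: ", detect(requantised, watermark))  # ~0.5, i.e. no better than chance
```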

Pex offers a better option: ACR (Automated Content Recognition), a technology that compares the unique characteristics of one piece of content with another and detects a match, if there is one. Unlike watermarking, this approach examines entire files and looks for overlapping content to pinpoint AI-made material.
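For a rough intuition of how matching by content (rather than by hidden marks) can work, here’s a toy fingerprint-and-compare sketch. It is not Pex’s actual system; the fingerprint choice (the dominant spectral peak per time slice) and the scoring are simplifying assumptions.

```python
# Toy ACR-style matching: fingerprint two audio signals, then measure overlap.
import numpy as np
from scipy.signal import spectrogram


def fingerprint(samples: np.ndarray, sample_rate: int) -> set[tuple[int, int]]:
    """Return a set of (coarse time slot, dominant frequency bin) pairs."""
    freqs, times, power = spectrogram(samples, fs=sample_rate, nperseg=2048)
    peaks = power.argmax(axis=0)  # strongest frequency bin in each frame
    # Coarse time slots so tiny timing jitter between files still lines up.
    return {(int(t * 10), int(b)) for t, b in zip(times, peaks)}


def match_score(fp_a: set, fp_b: set) -> float:
    """Jaccard overlap between two fingerprints (1.0 = identical content)."""
    if not fp_a or not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


if __name__ == "__main__":
    sr = 22_050
    t = np.linspace(0, 5, 5 * sr, endpoint=False)
    original = np.sin(2 * np.pi * 440 * t)                   # stand-in for a catalogue track
    candidate = original + 0.01 * np.random.randn(t.size)    # slightly altered copy

    score = match_score(fingerprint(original, sr), fingerprint(candidate, sr))
    print(f"match score: {score:.2f}")  # high score -> likely the same underlying content
```

Because the comparison is based on the audio itself rather than on an embedded mark, there is nothing for an uploader to strip out, which is the appeal of this kind of approach.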

🍿⚡️ Read more at Kill the DJ

--

Clara Alex

Managing Editor at Kill the DJ. Content strategist in audio tech companies. Writes about music, AI in audio, podcasting, and all things audio.