Former President Barack Obama is one of the famous book narrators who earn high compensation for their work. AI programs seek to replace many narrators more cheaply with fake celebrity voices, despite objections by performer unions. (Source: Pixabay)

AI ‘FabFakes’ Begin to Make Audiobook Publishers More Money than Humans

We have covered more than a handful of stories regarding the use of AI to deceive people. We wrote about an AI tech company in Asia that spent a few months recreating a digital copy of a deceased 9-year-old girl, which it then showed to her mother. With an algorithm and a hologram, they brought the child back to life for a few minutes. The deception was astounding. And emotionally distressing.

We also shared pieces about how deepfake technologies can bend and manipulate faces and exploit famous people in videos that depict them doing questionable, and often pornographic, things.

Now the use of AI in audiobooks is making headlines and money. Reports this week show how large the audiobook business has grown in the U.S. The numbers are staggering:

“Sales in 2020 exceeded $1.3 billion, up 12% over 2019. The percentage of Americans 18 and older who have listened to an audiobook is now 46%, up from 44% in 2019. One thing hasn’t changed, however: the arduous production process for audiobooks.”

Yes, it takes a good deal of money to cover the time and talent of stars you recognize and would pay to hear reading their own books, like former President Barack Obama.

However, using AI to reproduce or outright create voices costs a fraction as much, and at this point the result is indistinguishable from a human voice in some audio productions.

Author Thad McIlroy has written a fun and very detailed article that lists numerous AI audio companies taking different approaches in their bids to dominate that part of the media.

“There’s been some really good breakthroughs in text to speech becoming more humanlike over the last couple of years,” says Kane Simms, founder of VUX World. “You only need to listen to some of the celebrity Alexa voices, including Samuel L. Jackson, Shaquille O’Neal, and Melissa McCarthy.”

Human Talent under Pressure

McIlroy mentions how dominant Audible Inc. is in the highly profitable audiobook narration business. A lot of that is due to recognizable names and voices. People are drawn to the stars. But with AI voices, the costs are lower and the product is of incredible quality.

ACX is the Audible audiobook self-publishing platform, and included among the ACX Audio Submission Requirements section of its website is the following warning, presented as guidance: “Your submitted audiobook must be narrated by a human. TTS recordings are not allowed. Audible listeners choose audiobooks for the performance of the material, as well as the story. To meet that expectation, your audiobook must be recorded by a human.”

With Audible controlling as much as 50% of the audiobook market (depending on the type of content), its current voice policy is a major concern for all of the companies looking to get into the narration field.

But if AI does the voice, then a cascade of legal issues can follow. The professionals in this business are represented by a powerful union, SAG-AFTRA (Screen Actors Guild–American Federation of Television and Radio Artists), which describes itself as “the world’s largest labor union representing performers and broadcasters.” The organization offers the full slate of union perks: training, guaranteed minimum rates for recordings, and health and life insurance. Artificial voices require none of these benefits.

The article has a long list of companies and explains how their tech works, summing up each one in just a few key words. Some refuse to use AI voices. Others are betting their futures on it. Voice-related businesses that are already on the market, from Amazon to Google, will make it harder for every other voice tech company to compete.

Two companies at the top of the list are DeepZen and Speechki.

According to Marzia Ghiselli, head of publisher partnerships at DeepZen, all DeepZen voices are licensed and cloned from human narrators. The company does not use generic off-the-shelf voices from Amazon or Google. Pseudonyms are used to make it clear that the audiobook is voiced with AI technology. It takes mere weeks to create DeepZen productions.

Speechki says it has improved on that. Where DeepZen takes over a month to produce an audiobook, Siberia’s Speechki promises to convert a book into an audiobook “in days” for as little as $500. It also offers to “create your audiobook with artificial intelligence in 15 minutes.”