Cloning voices: The opportunities, threats and needed safeguards
Microsoft recently made headlines by announcing it is working on a form of artificial intelligence (AI) called VALL-E that can clone voices from a three-second audio clip. In other words, AI can now have anyone’s voice say words that the individual never actually spoke. Even more recently, Samsung announced that its software assistant, Bixby, can now clone users’ voices to answer calls: Bixby lets English speakers answer a call by typing a message, which it converts to audio and relays to the caller on their behalf.
Technologies like VALL-E and Bixby are bringing voice cloning to reality and have the potential to be industry game changers. Voice cloning refers to using AI to build a digital copy of a person’s unique voice -- speech patterns, accent and inflection -- by training an algorithm on a sample of that person’s speech. Once the voice model is created, plain text is all that’s needed to synthesize speech that captures and mimics the sound of the individual. A growing number of voice cloning companies are now launching, making the technology far more accessible.
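To make that workflow concrete, here is a minimal sketch of the enroll-then-synthesize pattern in Python. The VoiceCloneService class and its method names are illustrative assumptions invented for this article, not any vendor’s real API:

```python
# Illustrative sketch of the two-step voice cloning workflow described
# above. VoiceCloneService and its methods are hypothetical stand-ins
# for a vendor API; they are not any real product's interface.

class VoiceCloneService:
    def enroll(self, sample_path: str) -> str:
        """Step 1: train a voice model from a short speech sample.

        A real service would upload the clip, train on it, and
        return an identifier for the resulting voice model.
        """
        return f"voice-model:{sample_path}"

    def synthesize(self, voice_model_id: str, text: str) -> bytes:
        """Step 2: render plain text as audio in the cloned voice.

        A real service would return synthesized audio; this stub
        returns placeholder bytes so the sketch runs end to end.
        """
        return f"<audio of {text!r} via {voice_model_id}>".encode()


service = VoiceCloneService()
model_id = service.enroll("three_second_sample.wav")
audio = service.synthesize(model_id, "Words the speaker never said.")
print(audio.decode())
```

The point of the sketch is how little the second step requires: once the model exists, producing new speech in that voice takes nothing but text.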
AI-based voice cloning, when done ethically, has many compelling applications, especially in the entertainment industry. Imagine listening to your favorite actor narrate your grocery list as you walk through the aisles. And in the unfortunate event that an actor passes away in the middle of production, a deepfake of their voice can still "complete" the film.
Voice cloning can also benefit individuals with speech disabilities, by creating a synthetic voice that lets them express themselves in a voice that is uniquely their own. For example, a patient with throat cancer who needs their larynx removed could have their voice cloned before surgery, so that the synthetic replacement sounds like their old self.
On the other hand, there are real issues with this technology going mainstream. Beyond the obvious ethical concerns -- creating and using a replica of someone’s voice without their permission, potentially for malicious activities, is a serious violation of identity and privacy -- there are legal considerations: voice cloning can be used maliciously to defame, deceive or incriminate people. While there are bound to be scam artists who record people unknowingly and against their will, we must apply the same opt-in/opt-out consent procedures that have become commonplace for facial recognition any time we record a person’s voice. That is the only way to let people maintain control over their unique, natural biological identifiers.
Where scammers are concerned, the potential for misuse is sky-high. Until recently, cloning a voice required a large amount of recorded speech to train the algorithm. But voice cloning technology is evolving so quickly that today a few minutes of speech will do -- or, in VALL-E’s case, a few seconds. If a scammer gets you on the phone for as little as three seconds, they have everything they need to synthesize your voice without your consent. In fact, the FBI has already issued warnings about voice cloning being used in grandparent scams, in which scammers call elderly people and mimic a loved one claiming to be in jail, trapped in a foreign country or in some other difficulty in order to extort money. Unfortunately, we can expect voice cloning to be put to other roguish uses as well, such as deepfakes of politicians making remarks that spread misinformation or stoke controversy.
Another significant consideration is that many organizations rely on voice recognition as a form of biometric authentication -- think of an emerging fintech that uses voice recognition to let users access accounts and transfer funds. Where voices are concerned, it can be very hard to tell what is real and what isn’t. As voice cloning breaks out into the real world -- as many expect it will -- these organizations will have to take steps to ensure their systems aren’t subverted by malicious use.
There are two key ways organizations can do this. The first is liveness detection, a process already widely used in facial recognition. Liveness detection thwarts attempts to dupe a system by determining whether it is dealing with a live person or a spoof -- a photo, a video, or a prerecorded voice clip played in place of a live voice. The second is multi-factor authentication (MFA): after a person’s voice is identified, they are prompted for a second form of authentication, such as a password or a one-time code sent to their mobile device. These secondary methods are not foolproof (both can be intercepted) and they introduce some user friction, but they are effective in helping guard against spoofs.
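As a rough illustration of how these defenses can be layered, here is a minimal Python sketch of a voice login flow in which a matching voice alone is never enough to grant access. Every function below is a hypothetical stub written for this article, not a real library call:

```python
# Sketch of a layered voice authentication flow combining liveness
# detection and MFA, as described above. Every function here is a
# hypothetical stub for illustration; real systems would use vendor SDKs.

def matches_enrolled_voice(audio: bytes, user_id: str) -> bool:
    # Stub: a real system would compare a voiceprint of the audio
    # against the user's enrolled template.
    return True

def passes_liveness_check(audio: bytes) -> bool:
    # Stub: a real system would look for replay or synthesis
    # artifacts, or ask the caller to repeat a random phrase.
    return True

def second_factor_is_valid(user_id: str, submitted_code: str) -> bool:
    # Stub: a real system would verify a one-time code delivered
    # to the user's device, or a password.
    return len(submitted_code) == 6

def authenticate(user_id: str, audio: bytes, submitted_code: str) -> bool:
    if not matches_enrolled_voice(audio, user_id):
        return False  # voice does not match the enrolled voiceprint
    if not passes_liveness_check(audio):
        return False  # likely a recording or a synthesized clip
    # Even a live, matching voice is never trusted on its own:
    # a second factor is still required.
    return second_factor_is_valid(user_id, submitted_code)

print(authenticate("alice", b"...raw call audio...", "492017"))
```

The design point is the layering: the biometric match is only the first gate, and access is granted only when the liveness check and the second factor also pass, so a cloned voice by itself gets an attacker nothing.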
In summary, voice cloning is an exciting new frontier that can deliver real benefits, especially for those with speech disabilities. But we need to be cautious with this promising technology, because the potential for ethical and legal liability and for scams is significant. That is why organizations that have invested in voice recognition as a form of biometric authentication would be well-advised to take extra measures to guard against these threats.
Dr. Mohamed Lazzouni is CTO of Aware.