Are you being replaced by Text-to-Speech software?
Should voice-over artists be afraid of artificial unintelligence? Will robots take over the role of narrator or do voice-over professionals still have a future?
A man who had lost his voice to thyroid cancer spoke again on the Oprah Winfrey Show. Last Tuesday, film critic Roger Ebert gave his Oscar predictions with the help of text-to-speech (TTS) software that speaks whatever he types.
The first computer-based speech synthesis systems were created in the late 1950s. They’ve come a long way, but a lot of TTS software still sounds rather robotic and unnatural. That’s why Ebert turned to Scottish firm CereProc for help. CereProc actually uses someone’s audio recordings to create a digital voice that comes very close to the real thing.
Usually, CereProc has people come in to their studio and record about 15 hours of audio. This is used to re-create the original voice. In Ebert’s case, they used audio commentary he had made for several DVD documentaries. The quality was poor and the recordings were not as long as they would have liked. Nevertheless, they did the impossible and gave Ebert his voice back.
Here’s Ebert with his wife Chaz, as he first tries out his new voice:
OUR NEW COMPETITOR
TTS software is not only used for people who have lost the ability to speak. It’s used to capture accents and dialects that are on the verge of dying out. People also use it to learn a foreign language. There’s one other application you should be aware of: it could be used to replace you and me!
Poland-based Ivona Text-to-Speech advertises:
“Save money spent on voice talent recordings. You do not have to look for recording studios and speakers. You do not waste time concluding agreements and contacting the contractors and it’s accessible 24/7.”
If you want to get an idea of what this software is capable of, go to their website; type in a few words and have a digital voice read it back to you.
Rival NeoSpeech, headquartered in California, claims:
“Robotic voices are now history.”
NeoSpeech offers nine different voices that speak US English, Mexican Spanish, Korean, Japanese and Mandarin Chinese for a wide range of hand-held devices, desktop and network/server applications.
POLITICAL VOICES
If it weren’t for a certain former president, Roger Ebert might never have found CereProc. Ebert came across the Bush-o-Matic talking head, a hilarious re-creation of the 43rd president. I must admit: Bush never sounded so articulate! You can make him say things that are intelligent, and even make him wink, squint or blink.
The CereProc engineers pieced the voice of Bush together from his weekly radio addresses. It’s kind of scary, but in a fun way. Just to be fair, they added a virtual version of President Obama’s voice and the inimitable accent of California Governor Schwarzenegger.
As you can tell from the audio samples, CereProc is getting close, but we’re not quite there yet. One of the biggest challenges any TTS provider needs to overcome is how to add some emotion to the speech. Most voices still sound a bit flat and get very boring very quickly. And for ordinary mortals, it’s still too expensive to re-create their own voice with the help of this technology.
TIME TO GO?
So, do you think it’s about time for professional voice-overs to pack their bags and start looking for other work? Yes and no.
First of all, text-to-speech companies all over the world use voice talent to record different languages and accents for different applications.
Secondly, if you’re a musician, you might find this technological development very interesting but non-threatening. As you probably know, any musical instrument under the sun has been sampled, and entire symphony orchestras can come out of a can. Yet, people are still buying real Steinways, and there are plenty of musicians who make a very decent living.
Do you think that we’ll ever see the time when Stravinsky’s “Rite of Spring”, as performed on virtual instruments, will win a Grammy? I don’t think so. Will a laboratory ever be able to produce a recording of Bach’s solo cello suites that rivals the depth of Yo-Yo Ma’s interpretation?
You see, there’s still hope for the most subtle, most flexible, most surprising and unique of all instruments: the human voice. Here’s the rub: robots have a hard time emoting. They can patiently and dispassionately guide you to the next exit, but they have a hard time expressing even the most basic of feelings such as fear, anger, hurt, guilt and… love.
MOVIE MAN
As for Ebert, he’s as busy as ever. Esquire Magazine recently published a very moving article about him. On “Oprah”, he predicted that Kathryn Bigelow would be crowned best director and also pegged her film “The Hurt Locker” as the best picture winner. We now know that he was spot-on.
If it were up to me, this year’s “Special Achievement Award” would only go to one man:
Roger Ebert.
Time and again, his brilliant movie reviews leave me… speechless.
Paul Strikwerda © 2010
PS Do you see Text-to-Speech software as a threat to your career? Is it eventually going to put you out of business, or will it just do the boring work? Share your thoughts!
PPS Why am I giving my voice-over services away for FREE? Find out here…