Imagine talking to a new and improved Google Assistant, or using a GPS system that speaks more clearly than it does now. Imagine a computer program that lets you talk freely and understands every word. Just picture a clearer conversation between you and your devices.
Wouldn't life be better? Enter DeepMind's new speech-generation breakthrough. The U.K.-based AI unit, which Google acquired, has created a program called WaveNet. This new program might just be the future of human-technology communication.
Many computer-generated speech programs already allow AIs to talk like humans, but with certain limitations. For instance, some programs take speech fragments recorded from a single human narrator and combine them to form different words. While that produces completely understandable and almost natural-sounding speech, the voice can't be modified easily. Other programs generate the voice entirely electronically and are usually instructed only to follow the basic rules of human language. While these let people manipulate the sound of the voice, they tend to sound robotic.
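To make the contrast between those two older approaches concrete, here is a toy sketch in Python. It is only an illustration under loud assumptions: the sine-wave "fragments", the unit names, the pitch table, and every function name are hypothetical stand-ins invented for this example, not how any real speech system stores or generates audio.

```python
import math

SAMPLE_RATE = 8000        # samples per second (arbitrary value for this toy)
FRAGMENT_SAMPLES = 400    # length of each toy fragment, in samples


def tone(freq_hz, n_samples=FRAGMENT_SAMPLES):
    """Return a short sine burst as a list of floats in [-1, 1]."""
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]


# Hypothetical pitch for each speech unit (made-up numbers).
PITCH_TABLE = {"HH": 300, "EH": 500, "L": 400, "OW": 350}

# Approach 1: a fixed bank of "recordings" from one narrator.
# Real systems store recorded speech; sine bursts stand in for it here.
FRAGMENT_BANK = {unit: tone(freq) for unit, freq in PITCH_TABLE.items()}


def concatenative_synth(units):
    """Glue stored fragments together. The voice is whatever was recorded."""
    audio = []
    for unit in units:
        audio.extend(FRAGMENT_BANK[unit])
    return audio


def parametric_synth(units, pitch_scale=1.0):
    """Approach 2: compute the audio from parameters, so the voice is tunable."""
    audio = []
    for unit in units:
        audio.extend(tone(PITCH_TABLE[unit] * pitch_scale))
    return audio


word = ["HH", "EH", "L", "OW"]
fixed_voice = concatenative_synth(word)
higher_voice = parametric_synth(word, pitch_scale=1.5)
```

The first function can only replay what sits in its bank, which is why changing the voice means recording a new bank. The second can change the voice with a single number, but a voice built from such simple rules is exactly the kind that ends up sounding robotic.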
WaveNet sets itself apart because it is a neural network, a type of program that mimics how some parts of the human brain function. With that, it is able to learn the raw sound waves of a human voice, one sample at a time, which in turn enables it to mimic human speech. In the researchers' blind tests for U.S. English and Mandarin Chinese, WaveNet-generated speech outperformed Google's current best text-to-speech systems. And while it isn't as flawless as actual human speech, it's still better than other existing technologies: according to DeepMind, it closes the gap between those systems and human speech by more than 50%. However, the program still requires too much computational power, so I guess we'll have to wait for the researchers to solve that challenge.
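The "one sample at a time" idea can be sketched in a few lines of Python. This is a rough illustration, not DeepMind's code: the `predict_next` function below is a hypothetical stand-in that uses a fixed formula to continue a pure tone, whereas WaveNet uses a trained neural network. The function names, the 440 Hz tone, and the sample rate of 16,000 are all assumptions chosen for the example.

```python
import math


def predict_next(history, omega):
    """Stand-in for the trained network (an assumption for illustration).

    It guesses the next sample from the two most recent ones using a
    fixed recurrence that continues a pure sine wave. WaveNet replaces
    this with a learned neural network.
    """
    return 2 * math.cos(omega) * history[-1] - history[-2]


def generate(n_samples, freq_hz=440.0, sample_rate=16000):
    """Build audio one sample at a time, feeding each output back in."""
    omega = 2 * math.pi * freq_hz / sample_rate
    audio = [0.0, math.sin(omega)]  # two seed samples to start the loop
    while len(audio) < n_samples:
        # Each new sample depends on samples the loop already produced,
        # so the steps cannot be skipped or reordered.
        audio.append(predict_next(audio, omega))
    return audio


one_second = generate(16000)  # 16,000 predictions for one second of audio
```

Because every sample has to wait for the ones before it, one second of sound at this assumed rate takes 16,000 predictions in a row. Swap the one-line formula for a large neural network and it becomes easier to see why the approach is so hungry for computing power.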
Nonetheless, just imagine the super-intelligent computers this could create in the future! Will it lead to an intuitive and unique entity (an operating system) like Samantha in the romance film "Her"?