Amazon demos Alexa reading a bedtime story in the voice of a boy’s deceased grandma
Amazon’s intelligent, voice-enabled assistant Alexa has become an integral part of everyday experiences. Alexa gets more than 1 billion requests per week, Amazon said Wednesday, while customers have access to more than 100,000 Alexa skills.
Now, the technology giant is developing a new capability for Alexa, so she can help you remember loved ones who have passed away: the ability to speak in other people’s voices. On Wednesday at the re:MARS conference (Amazon’s event for Machine Learning, Automation, Robotics, and Space), Amazon’s Rohit Prasad briefly described the skill.
He showed a short video of a boy speaking to an Amazon Echo speaker. “Alexa,” the boy asks, “Can grandma finish reading me ‘The Wizard of Oz’?” A woman’s voice begins speaking, and Prasad confirmed the voice was supposed to be that of the boy’s deceased grandmother.
“One thing that surprised me the most about Alexa is the companionship relationship we have with it,” said Prasad, Alexa AI SVP and head scientist. “Human attributes of empathy and affect are key for building trust. They have become even more important in these times of the ongoing pandemic when so many of us have lost someone we love. While AI can’t eliminate that pain of loss, it can definitely make their memories last.”
Prasad didn’t say when the skill will be available, only that Amazon is “working on” it. An Amazon representative told ZDNet that the company has nothing further to share yet regarding the timing of its availability.
Plenty of questions have already arisen about the ethics of replicating a real person’s voice, but Amazon’s Nate Michel stressed to ZDNet that it’s “early days,” and this technology is “exploratory at this stage.”
Generating a voice like this presents a technical challenge, Prasad explained in his remarks, because it requires producing a high-quality voice from less than a minute of recorded audio, rather than from hours of studio recordings. Prasad’s team addressed the challenge by framing it as a voice conversion task rather than a speech generation task.
“We are unquestionably living in the golden era of AI, where our dreams and science fiction are becoming a reality,” Prasad said.
To make Alexa even more human-like, Prasad shared how Amazon is building generalizable intelligence into the tool. Generalizable intelligence comprises three key attributes: learning across many different tasks, continually adapting to user environments and learning new concepts through self-supervision.
Amazon is working on approaches like think-before-you-speak, in which Alexa effectively uses “implicit commonsense knowledge” (built with a large language model and a commonsense knowledge graph) to generate responses to a user.
For instance, if a customer on Valentine’s Day says, “Alexa, I want to buy flowers for my wife,” Alexa could leverage world knowledge and temporal context to respond with, “Perhaps you should get her red roses.”