Seeing the eventual limits of simply giving Alexa commands such as “Alexa, buy milk” or “Alexa, turn the AC down,” Amazon’s engineers believe the voice recognition AI needs to think for itself.
At Amazon’s re:MARS event, which wraps up Friday (June 24) in Las Vegas after four days of buzz around artificial intelligence (AI), machine learning and robotics, attendees are talking about the next steps for Alexa: imbuing the system with higher-level functions that run constantly in the background.
A Wednesday (June 22) Amazon blog post talked up the hot story coming out of re:MARS: the work of Rohit Prasad, Alexa AI senior vice president and head scientist, and his team, who are moving Alexa from rote skills to more interactive forms such as conversational AI.
While calling Alexa “one of the most complex applications of AI in the world,” Prasad said, “our customers demand even more from Alexa as their personal assistant, adviser and companion.”
“To continue to meet customer expectations, Alexa can’t just be a collection of special-purpose AI modules. Instead, it needs to be able to learn on its own and to generalize what it learns to new contexts. That’s why the ambient-intelligence path leads to generalizable intelligence,” he said.
Prasad noted that Alexa already exhibits some semi-intuitive behavior. Given a generic command to set a reminder to watch the Super Bowl, for example, it uses its own processing power to identify the correct event, date and time automatically. Another example is Alexa “Hunches,” which monitors users’ routines and reminds them if they’ve missed something.
Prasad said this is all moving toward more of a “common sense” core for Alexa’s AI, noting that “we are aspiring to take automated reasoning to a whole new level. Our first goal is the pervasive use of commonsense knowledge in conversational AI. As part of that effort, we have collected and publicly released the largest dataset for social common sense in an interactive setting.”
Speaking of Voice Breakthroughs
Market watchers have seen Alexa, a prime digital doorway to the connected economy, ramping up to more advanced conversational AI that will enable new experiences, and ways to pay for them, using voice to leapfrog touch biometrics and typing.
In September, PYMNTS’ Karen Webster wrote that Alexa’s “voice AI operating system has the potential to move consumers and businesses closer to an always-on connected commerce ecosystem, by leveraging that trust and embedding payment and identity credentials into a growing portfolio of connected devices powering new use cases that define the consumer’s daily routine.”
“It’s something that will become far more important as more and more devices get deployed through the retail and commercial physical space, penetrated by super-fast 5G,” she said.
That is consistent with the new ambient intelligence use cases Amazon discussed at re:MARS over the past week, as engineers work to make the system more intuitive, able to connect seemingly disparate topics and say something useful as a result.
Prasad said: “If I had to pick one thing among the suite of capabilities we showed at re:MARS, I’d say it is conversational explorations,” adding that “we are enabling conversational explorations on ambient devices, so you don’t have to pull out your phone or go to your laptop to explore information on the web.”
“Instead, Alexa guides you on your topic of interest, distilling a wide variety of information available on the web and shifting the heavy lifting of researching content from you to Alexa.”
To make this a reality faster, the Alexa team is focusing primarily on three areas, according to Prasad: dialogue flow prediction through deep learning in Alexa Conversations; web-scale neural information retrieval to match relevant information to customer queries; and automated summarization to distill information from one or multiple sources.