Amazon is racing to replace Alexa’s ‘brain’ with generative AI


Amazon is preparing to relaunch its voice-activated digital assistant, Alexa, as an artificial intelligence “agent” as its tech team scrambles to solve the challenges that have plagued the system’s AI overhaul.

The $2.4tn company has spent the past two years trying to redesign Alexa, which is embedded in 500mn consumer devices globally, so that the software’s “brain” is replaced by generative AI.

Rohit Prasad, who heads the Artificial General Intelligence (AGI) team at Amazon, told the Financial Times that the voice assistant still needs to clear a number of technical hurdles before its planned release.

This includes solving for “hallucinations” or fabricated answers, response speed or latency, and reliability. “Hallucinations need to be brought to zero,” said Prasad. “It’s still an open problem in the industry, but we’re working very hard on it.”

The vision of Amazon’s leaders is to turn Alexa, currently used for tasks as narrow as playing music and setting alarms, into an “agent” that acts like a personal concierge. This could include anything from suggesting restaurants to adjusting the bedroom lights based on a person’s sleep cycles.

Alexa’s redesign has been under way since the launch of OpenAI’s ChatGPT, backed by Microsoft, at the end of 2022. As Microsoft, Google, Meta, and others quickly incorporate generative AI into their computing platforms and enhance their software services, critics question whether Amazon can resolve its technical and organizational challenges in time to compete with its rivals.

According to several employees who have worked on Amazon’s voice assistant teams in recent years, the effort is fraught with challenges and follows years of AI research and development.

Several former employees said the long wait for the planned release was largely due to unforeseen difficulties in modifying and combining the simple, predefined algorithms Alexa was built on with more powerful but unpredictable large language models.

In response, Amazon said it is “working hard to enable more proactive and capable assistance” on its voice assistant. It added that the technical implementation at this scale, into a live service and set of devices used by customers around the world, was unprecedented and not as simple as overlaying an LLM on top of the Alexa service.

Prasad, the former chief architect of Alexa, said last month’s release of the company’s in-house Amazon Nova models – led by the AGI team – was driven in part by the specific need for optimized speed, cost and reliability, to help AI applications like Alexa “get to that last mile, which is really hard”.

To act as an agent, Alexa’s “brain” must be able to call hundreds of third-party applications and services, Prasad said.

“Sometimes people underestimate how many services are integrated into Alexa, and it’s a massive number. These apps get billions of requests per week, so when you’re trying to make sure things get done quickly . . . you have to be able to do it in a very cost-effective way,” he added.

The complexity stems from the fact that Alexa users expect fast responses as well as extremely high accuracy. Such qualities are at odds with the inherently probabilistic nature of today’s generative AI, statistical software that predicts words based on patterns in speech and language.

Some former employees also point to the struggle to preserve the assistant’s original qualities, including consistency and functionality, while infusing it with new generative features such as creativity and free-flowing dialogue.

Because of the more personal and conversational nature of LLMs, the company plans to hire experts to shape the AI’s personality, voice and vocabulary so that it remains familiar to Alexa users, a person familiar with the matter said.

A former senior member of the Alexa team said that while LLMs were more sophisticated, they came with risks such as giving “sometimes completely made-up” answers.

“At the scale that Amazon operates, this can happen multiple times a day,” he said, damaging the brand and its reputation.

In June, Mihail Eric, a former machine learning scientist at Alexa and a member of its “conversational modeling team”, said publicly that Amazon had “dropped the ball” on becoming the “undisputed market leader in conversational AI” with Alexa.

Eric pointed out that despite having strong scientific talent and “huge” financial resources, the company was “burdened with technical and bureaucratic problems”, its data “was not well annotated” and “the documentation was non-existent or outdated”.

According to two former employees who worked on Alexa-related AI, the legacy technology underpinning the voice assistant was inflexible and difficult to change quickly, with a messy and disorganized code base and an engineering team “spread too thin.”

The original Alexa software, built on technology acquired from British startup Evi in 2012, was a question-answering machine that searched a defined universe for the correct response, such as the day’s weather or a specific song in your music library.

The new Alexa uses a host of different AI models to recognize and interpret voice queries and generate responses, as well as detect policy violations. Building software to translate between the old systems and the new AI models has been a major hurdle in the Alexa-LLM integration.

The models include Amazon’s own software, including the new Nova models, as well as AI models from Anthropic, the startup in which Amazon has invested $8bn over the course of the last 18 months.

“The most challenging thing about AI agents is making sure they are safe, reliable and predictable,” Anthropic CEO Dario Amodei told the FT last year.

Getting agent-like software to the point “where . . . people can have faith in the system” was key, he said. “Once we get there, we’ll release those systems.”

A current employee said more steps were still needed, such as adding child safety filters and testing custom integrations with Alexa such as smart lights and the Ring doorbell.

“Reliability is the issue – making it work almost 100 per cent of the time,” the employee added. “That’s why you see us . . . or Apple or Google moving slowly and incrementally.”

Many third parties developing “skills” or features for Alexa said they were unsure when the new AI-enabled device would be released and how they would create new functions for it.

“We’re waiting for details and insight,” said Thomas Lindgren, founder of Swedish content developer Wanderword. “When we started working with them, they were more open . . . then they changed over time.”

Another partner said things went quiet after an initial period of “pressure” from Amazon on its developers to prepare for the next generation of Alexa.

An enduring challenge for Amazon’s Alexa team, which was hit by major layoffs in 2023, is how to make money. Figuring out how to make the assistant “cheap enough to run at scale” will be a big task, said Jared Roesch, co-founder of generative AI group OctoAI.

Options being discussed include creating a new Alexa subscription service, or taking a cut of sales of goods and services, according to a former Alexa employee.

Prasad said Amazon’s goal was to create different AI models that could serve as “building blocks” for different applications beyond Alexa.

“We’ve always focused on customers and applied AI; we’re not doing science for science’s sake,” Prasad said. “We are doing this . . . to deliver customer value and impact, which in this era of generative AI is more important than ever, because customers want to see a return on investment.”
