Can OpenAI Build Alexa Before Amazon Can Build ChatGPT?
I mean, the answer to the question in the title is: of course. But it's also a far more nuanced question than it may seem on the surface – or than could fit in a title. It's more along the lines of: can OpenAI execute an AI device strategy before Amazon and Apple and Google and Meta can?
And it's especially interesting with Amazon given the reports that they're going to invest perhaps $50B into OpenAI's new funding round. From the moment it was rumored, that seemed like a wild amount of money, even for a company the size of Amazon. And certainly for one that is also the largest shareholder in OpenAI's chief (startup) rival, Anthropic – with which it has a deep partnership on a few fronts, perhaps most notably to help power the new Alexa+ service. And it gets even more complicated with the reports that alongside this new funding, OpenAI may build custom models for Amazon's products – including, perhaps, Alexa.
At the same time, we know that OpenAI is hard at work on their first devices. And ever since word started to trickle out about what they might be working on with Jony Ive's team at LoveFrom, my guess was basically along the lines of a newfangled smart speaker. As I wrote last May:
The problem with a full-on wearable in this regard is that everyone focuses far too much on the whole wearable part. That is, the exterior of the device and how it will work on your body. And then: how can I get the technology to work on that? But I suspect that OpenAI/IO are focused on the opposite: what's the best device to use this technology? Why does it have to be wearable?
To be clear, I suspect that whatever the device is, it will look fantastic – this is an Ive/LoveFrom production, after all – but that's mainly because beautiful products bring a sense of delight to users and can spur usage. I suspect the key to the design here will be yes: how it works. And again, I suspect that will be largely based around voice, and perhaps augmented by a camera.
And:
Anyway, the reporting here makes the IO device sound a bit like a newfangled tape recorder of sorts. Okay, I'm dating myself – a voice recorder. You know, the thing some journalists use to record subjects for interviews. Well, when they're not using their phones for that purpose, as they undoubtedly are 99% of the time these days. But it sounds sort of like that only with, I suspect, some sort of camera. I doubt that's about recording as much as it's about the ability to have ChatGPT "look" at something and tell you about it. But these are just guesses.
Well, they seem less like guesses now, and more like pretty solid predictions. OpenAI's device is shaping up as a sort of Amazon Echo for our modern age of AI. To me, the launch of GPT-4o was the key in showcasing where OpenAI was headed. The first true "omni" model that could properly input and output visuals and voice pointed directly towards the science fiction future of Her – a reference Sam Altman explicitly made, which got him in quite a bit of trouble...
But this was always about moving beyond the computer and perhaps even the smartphone. Or, at least, that's what OpenAI (and Meta – and even Amazon) have to hope. I tend to think all of these new AI devices are just going to reinforce the smartphone as the key hub (at least until models that can run locally on device are good enough and small enough), and Altman and Ive made it clear in the formal announcement of their partnership that whatever they were building would not be an iPhone replacement.
At the same time, clearly Ive was hoping it could be a device that could slowly wean people off his other famous creation. A parasitic device, in a way.
The form factor of that parasite keeps coming more into focus. On Friday, Stephanie Palazzolo and Qianer Liu published the latest such report for The Information, noting that OpenAI now has more than 200 people working on various AI devices – including, interestingly, a smart lamp. But the key one is clearly:
The smart speaker—the first device OpenAI will release—is likely to be priced between $200 and $300, according to two people with knowledge of it. The speaker will have a camera, enabling it to take in information about its users and their surroundings, such as items on a nearby table or conversations people are having in the vicinity, according to one of the people. It will also allow people to buy things by identifying them with a facial recognition feature similar to Apple’s Face ID, the people said.
To me, this sounds less like an "iPhone-killer" and more like an "Alexa-killer". Fine, fine, technically an "Echo-killer", but everyone uses them interchangeably, of course. In fact, the only thing holding it back – aside from, you know, it actually working – would be the price. $200 to $300 is more in the Apple ballpark than the Amazon one. That said, under Panos Panay, Amazon has pivoted from their initial "Alexa everywhere" strategy of cheap devices anywhere and everywhere (which forced Apple to shift their initial HomePod strategy) to a more focused strategy around higher-quality devices.
Again, this feels like a collision course in the making. And it's especially odd given the talk of the deepening relationship between OpenAI and Amazon. But if OpenAI is able to make an AI device that's great for shopping... perhaps the hope is to partner with Amazon to be the retail side of that equation. And vice versa! Maybe Amazon no longer cares if you buy using Alexa devices, just so long as you buy. That's probably smart, especially given how well (read: not well) the first wave of voice-based shopping went with Alexa.
And the two of them may be better off working together to combat Apple and Google. Not only are they the two other players battling for the home (alongside Samsung, of course), but they're the two that control smartphone platforms (alongside Samsung, thanks to Google, of course). And Apple is seemingly about to step it up in a major way in the home with the 'HomePad' device. It's thought to be a smart speaker with a camera, powered by AI. Sound familiar?
Of course Apple's device, like at least half of Amazon's new Alexa+ lineup, will also feature a screen. The OpenAI device is believed not to have one. The lack of a screen has clearly hurt Amazon's shopping ambitions in the past, so we'll see how OpenAI fares... But the visual input, the camera, will clearly be the other key for the device:
During a presentation last summer, leaders from the device team told employees the device will be able to observe users through video and nudge them toward actions it believes will help them achieve their goals, said a person who attended the presentation. You could imagine the device observing its user staying up late the night before a big meeting and suggesting that they go to bed, for example.
Interesting. That's going to be a very tricky set of features to promote. OpenAI will probably get more leeway than, say, Meta, but certainly not as much as Apple here. Oh yes, have I mentioned that Apple is also working on a camera-focused AI wearable? One that's meant to be the "eyes and ears" of the iPhone?
This is all shaping up for a very interesting next 12 months. Apple's 'HomePad' should come in the Spring – hopefully! – with Siri finally powered by Gemini, to meet the latest Alexa+ devices from Amazon in market. Google's own first real Gemini smart speaker should arrive around the same time. Meta will continue to iterate on the Ray-Bans while Apple could meet them in market in late 2026 or early 2027. Then this OpenAI smart speaker should hit...
One more thing: As The Information report notes, Adam Cue is one of the key players on the software side of this new OpenAI device, having come over from the acquisition of the io team. Cue is, of course, the son of longtime Apple SVP Eddy Cue. Another fun wrinkle in the race!