The GPT-4o Asteroid

Alexa and Siri look towards the sky and see their doom...
There are two things from our announcement today I wanted to highlight...

There is a lot to say about GPT-4o. Definitely column-worthy – either later today or early tomorrow. For now, just watch the demos. My initial gut reaction is that this is actually another watershed moment, like DALL-E and then, of course, ChatGPT. I'm wary of overhyping this, but it feels like OpenAI has yet again figured out a way to cross a chasm. Alexa and Siri may have thought they already did that, but they are basically dinosaurs whereas ChatGPT is a human being. GPT-4o is the asteroid those dinosaurs are now staring up towards...

How's that for not overhyping it?

Anyway, Sam Altman's brief post on the release today is not only a succinct way to frame the new product, but also, it seems, the company as a whole going forward:

First, a key part of our mission is to put very capable AI tools in the hands of people for free (or at a great price). I am very proud that we’ve made the best model in the world available for free in ChatGPT, without ads or anything like that.

Our initial conception when we started OpenAI was that we’d create AI and use it to create all sorts of benefits for the world. Instead, it now looks like we’ll create AI and then other people will use it to create all sorts of amazing things that we all benefit from.

We are a business and will find plenty of things to charge for, and that will help us provide free, outstanding AI service to (hopefully) billions of people.

It sounds a bit flippant, and perhaps it is. But it's also a great way to frame a mission in plain English. As for GPT-4o:

Second, the new voice (and video) mode is the best computer interface I’ve ever used. It feels like AI from the movies; and it’s still a bit surprising to me that it’s real. Getting to human-level response times and expressiveness turns out to be a big change.

The original ChatGPT showed a hint of what was possible with language interfaces; this new thing feels viscerally different. It is fast, smart, fun, natural, and helpful.

Chatbots were a good entry point in that everyone knew how to use them. But they were always going to hit a wall, just as chatbots always do. Now OpenAI has figured out a way to break through that wall. To break out of the chatbox. There's a reason science fiction dating back decades had people talking to computers: it's the most natural way to interact with, well, anything. We didn't have the capabilities to do it before – we thought we did, but we didn't: it takes both the product and the technology, and too many teams focus on just one of those – but now we do.

Obviously, you're not always going to need to or want to talk to a computer. And that's fine, that's why we have other tools and methods of input. Keyboards, mice, etc. It was always going to be a concert of computing.¹ But just as Apple ushered in touch with the iPhone, voice needed a new product to usher it in.² Siri and Alexa had their shot, but they weren't nearly smart enough, which in many ways was worse than not working at all. This looks like it works. We'll see – and hear – soon enough...³

Update May 14, 2024: A few more thoughts...

1 I almost can't believe I wrote these posts seven years ago. Appropriate usage of Her imagery too...

2 Her is getting all the love here, and rightfully so – clearly the voice they gave GPT-4o in the demos was intentional... but let's not forget TARS from Interstellar! Cough, cough.

3 The rumors suggesting Apple could partner with OpenAI for iOS 18 are particularly interesting here. What if GPT-4o plays nice with Siri? Would Apple have the, um, courage to allow such a fundamental outsourcing?