M.G. Siegler • #ai • Jul 29, 2024

The Ground Shifts Under AI's Feet

A continued push towards products as the arms race evolves...

Alex Heath uses Mark Zuckerberg's Llama 3.1 announcement (which, as always seems to happen when one company announces something, led a bunch of other companies to peek their heads up to share some AI-related news) to take a step back and look at the current state of things:

Those competing in this race will always need the best models, and I’m not saying that frontier model development is slowing down by any means. But there seems to be a growing realization that a combination of the best AI products with the most distribution will win over the long run. Zuckerberg gets this: while announcing the new Llama model, he also predicted that Meta’s AI assistant will surpass ChatGPT’s usage before the end of the year.

Yes, I think this is also the notion that 'Apple Intelligence' was entirely built around: a shift from sheer performance to products. To be fair, that was always going to be a bit self-serving on Apple's part, as it's not clear they could fully compete in the models arms race right now if the race is just about benchmarks. Then again, per the point above, I'm not sure we haven't already moved beyond benchmarks and on to products. The most important benchmark going forward may simply be cost. (Both the cost for developers to use the models and the cost to train them!)

Meta, as well, has been trying to shift to products, with some middling success thus far. That's also somewhat self-serving, as their open source approach is less predicated on "winning" an arms race and more on subverting it. Meta is also trying to leverage the absolute advantage it does have: distribution. And that, at least, seems to be working, if Zuckerberg's prediction is to be trusted.

Enter OpenAI, the preeminent example of what early success in productizing AI models looks like. A refrain I’ve heard over the past couple of months from people close to OpenAI (and those with access to research like credit card purchasing data) is that ChatGPT subscription growth has stalled. At the same time, recent reporting shows that the vast majority of OpenAI’s revenue comes from ChatGPT subscriptions, not selling its models to developers. OpenAI is burning billions a year and needs to control its own destiny. For now, ChatGPT revenue growth is the key.

Before Zuckerberg went all in on open source and the release of GPT-4o in May, OpenAI pay-gated its best model in ChatGPT. Now, it seems focused on making its consumer subscription compelling with new features, not the newest model. On Thursday, it teased its long-rumored search engine, which will eventually be baked into ChatGPT once the kinks are ironed out. (Sadly, Microsoft tried to make a splash about its re-jiggered AI search experience the day before, and barely anyone noticed.) Sam Altman also said that the Her-like voice mode is coming to subscribers starting next week.

Good to hear the GPT-4o voice stuff is going to ship, even if just in alpha, this week. Yes, the Scarlett Johansson fiasco undoubtedly threw a wrench in some things, but it has taken entirely too long to get this out the door when they made it seem like it would ship pretty immediately back in May. Per Heath's points, for OpenAI to stay at the forefront here, they need to keep iterating on product. Fast.

As an aside, one has to wonder if Microsoft has a growing level of frustration here as they're increasingly being sidelined and overlooked in any AI product discussions...

“I think the industry is still early on its path towards product market,” Ahmad Al-Dahle, Meta’s VP of AI, told me earlier this week. “I think usage is going to change as the interface changes and as the models get more and more capable.” He demurred when I pressed for specifics, other than to say that how we think of a chatbot is going to evolve as we get “more precise control over multiple steps that the model is able to generate.”

The idea, at a high level, is that the chatbot interface becomes abstracted away as it does more on your behalf. Things get wilder when you introduce more multimodality and form factors beyond the phone, such as Meta’s early implementation of AI in its Ray-Ban smart glasses.

Yes. This is why the GPT-4o voice layer is critical. It's not the be-all and end-all solution, but it's a piece of the puzzle and starts us moving beyond the chatbot/text box interface. The text box will also remain a piece of this – a big one – but it can't be all that there is. Another next step beyond:

When you talk to people working at the top AI labs, you realize a lot of the product work happening this year is setting the stage for what everyone is trying to crack: agents. Google’s Gemini team, for example, is working on an agent that takes over Chrome on the desktop to fully automate a task like booking a plane ticket. The model generates mouse and keyboard clicks based on screenshots of webpages it’s fed after navigating to a website via Google Search. OpenAI has been working on its own version of this — a “Computer Using Agent,” or CUA — for quite some time.

We've also been hearing this non-stop for months. But for all the talk, there's not a lot to show for it yet. Hopefully that's the next wave of announcements in the fall. And perhaps 2026 for Apple 😜

If you agree that the real money and power in AI will accrue to the products people actually use and not the models behind them, what becomes of startups like Cohere or Mistral? Cohere’s business model is selling models to other companies. It laid off about 5 percent of its staff this week after raising half a billion dollars, the vast majority of which is likely going to compute costs.

And the day after the release of Llama 3.1, Mistral released a new model that seems to match or outperform Llama 3.1 on key benchmarks. But, like Cohere, Mistral has no direct touchpoint with consumers. It also requires a commercial license for business use cases, while Llama doesn’t require a special license unless you’re a company with over 700 million users.

Good questions. Mistral seems to have a lot of Europe pushing for, rooting for, and backing them for obvious reasons, but the EU doesn't seem to be doing them any favors – at least not yet... Cohere has a not entirely different vibe coming out of Canada. But while it may not seem like it on the surface, with mega funding rounds still happening regularly, there are signs that the music is slowing...

This is all good and natural. The chaos needs to subside a bit so the ground on which to build great products can stabilize.

One more thing: Heath links to this wonderful site diving into the backstory of the shirt with a Latin phrase that Zuckerberg wore while announcing (in very HD, too HD?) Llama to the world.

The phrase on Zuckerberg's T-shirt reads:

"Annos undeviginti natus exercitum privato consilio et privata impensa comparavi"

This translates to "At the age of nineteen, on my own initiative and at my own expense, I raised an army." This line is part of Augustus' "Res Gestae," a monumental inscription documenting his achievements. The choice of this quote is telling; it reflects Augustus' pivotal role in shaping Rome's destiny through personal initiative and vision.

One guess on how old Zuckerberg was when he started Facebook...
