Is a Ferret Burrowing Inside Siri?

Putting some pieces together for Siri's AI upgrade...
Apple Plans AI-Based Siri Overhaul to Control Individual App Functions

For all the talk about Apple and AI leading up to WWDC, there hasn't been much detail specifically on the changes to Siri beyond the notion that she would be "improved" (insert joke about there being nowhere to go but up). Here we go:

Apple Inc. is planning to overhaul its Siri virtual assistant with more advanced artificial intelligence, a move that will let users control individual app functions with their voice, according to people with knowledge of the matter.

The new system will allow Siri to take command of all the features within apps for the first time, said the people, who asked not to be identified because the initiative isn’t public. That change required a revamp of Siri’s underlying software using large language models — a core technology behind generative AI — and will be one of the highlights of Apple’s renewed push into AI, they said.

And:

Siri will be a key focus of the WWDC unveiling. The new system will allow the assistant to control and navigate an iPhone or iPad with more precision. That includes being able to open individual documents, moving a note to another folder, sending or deleting an email, opening a particular publication in Apple News, emailing a web link, or even asking the device for a summary of an article.

You know what this sounds like? This sounds like Ferret, the project that was first quietly revealed through a university partnership late last year and further detailed in a paper Apple published on the breakthrough. That technology allows AI to "understand" the UI of a smartphone, which, in turn, would allow it to interact with a smartphone without the need for specific APIs (presumably).

As I wrote a couple months ago:

Imagine an AI that could literally do anything a human could do on an iPhone. Just as you tap and click to interact with the UI on iOS, the AI could potentially do this, digitally. This could make it so that developers don't have to do anything to allow their apps to be fully compatible with something like Shortcuts – or yes, Siri.

It's complex enough that part of me is sort of surprised that Apple is going to announce it at WWDC, if this is in fact what it is. Then again, I also guessed the other day that this might be something Apple surprises us with, something that hadn't yet been scooped by Gurman. (Until he did scoop it, a few days later...) But per this report, it sounds like it will be limited to Apple's own apps to start. And:

At the start, the new Siri will handle one command at a time, but Apple has plans to allow users to chain commands together. For example, they could ask Siri to summarize a recorded meeting and then text it to a colleague in one request. Or an iPhone could theoretically be asked to crop a picture and then email it to a friend.

The feature is one of Apple’s more complex AI initiatives and isn’t planned for release until as soon as next year, when it will be part of a subsequent update to iOS 18, according to the people. The first version of the new operating system will launch in September, around the same time as the next iPhone models.

How much pressure does Apple feel to reveal their AI plans? They're going to unveil something at WWDC that they won't ship until next year. Features coming to iOS "later this year" have been nothing new in recent years, but we're obviously now pushing beyond such boundaries. Perhaps OpenAI or other partners will have to do more of the heavy lifting to elevate Siri until then...


Update: A few tangential thoughts...

Is Siri About to Get a ChatGPT Brain Transplant?
Or simply an augmentation? Or will Apple finally launch ‘Knowledge Navigator’?