'Sky' Speaks!
After a lot of breathless back-and-forth over the past few days about the voice of Sky/Samantha/Scarlett and OpenAI, Nitasha Tiku seems to have gotten the goods:
When OpenAI issued a casting call last year for a secret project to endow OpenAI’s popular ChatGPT with a human voice, the flier had several requests: The actors should be nonunion. They should sound between 25 and 45 years old. And their voices should be “warm, engaging [and] charismatic.”
One thing the artificial intelligence company didn’t request, according to interviews with multiple people involved in the process and documents shared by OpenAI in response to questions from The Washington Post: a clone of actress Scarlett Johansson.
That's obviously important given the question of intent here. What came first, the chicken or the egg? Did OpenAI set out to create a voice that sounded like Scarlett Johansson's 'Samantha' in Her? Or did the voice they created for 'Sky' end up sounding similar to 'Samantha' only incidentally, because the actress they hired happens to have a voice that is somewhat similar to Scarlett Johansson's? These documents would seem to indicate the latter.
Also key, the timeline:
But while many hear an eerie resemblance between “Sky” and Johansson’s “Her” character, an actress was hired to create the Sky voice months before Altman contacted Johansson, according to documents, recordings, casting directors and the actress’s agent.
The agent, who spoke on the condition of anonymity to assure the safety of her client, said the actress confirmed that neither Johansson nor the movie “Her” were ever mentioned by OpenAI. The actress’s natural voice sounds identical to the AI-generated Sky voice, based on brief recordings of her initial voice test reviewed by The Post. The agent said the name Sky was chosen to signal a cool, airy and pleasant sound.
That would seem to be a good set of facts for OpenAI, assuming this actress/agent would be willing to testify as much in any potential trial related to a lawsuit. Though the hope from OpenAI, clearly, is that this doesn't go that far. In part because despite the above, CEO Sam Altman did reach out to Johansson, twice. The company would have to argue this was completely unrelated to the Johansson-sounding voice of 'Sky'. Even if they said that the outreach came because they thought 'Sky' sounded like Johansson after hearing it, that would likely be a problem, I imagine.
And while the Bette Midler precedent clearly puts the egg before the chicken – they went with her backup singer after Midler declined – there's also the element of promotion in all of this. Using someone else's IP to tout your own product. And there, the Altman tweet (alongside tweets from a handful of other OpenAI employees), which made reference to Her during the launch of GPT-4o, might be problematic.
And a tangential question there is whether this signaled some sort of secret intent to recreate the voice of 'Samantha'. Nothing so far seems to point to that in the real-time creation of 'Sky', but you have to imagine OpenAI would be pretty worried about any discovery element of a trial that would scour emails and texts for any and every mention of all of this. If anyone at OpenAI said something along the lines of "yeah, we need the voices to work like they do in Her" – even just in passing, in a private conversation – that would potentially be game over.
More broadly, all the hoopla here is directly tied to the fears of the creative community that they're about to be crushed by technology, and specifically, AI. Lawsuit or not, that's a major PR problem that OpenAI just uncorked. Even 'Sky' herself is clearly regretting the decision to lend her voice to all of this:
In a statement from the Sky actress provided by her agent, she wrote that at times the backlash “feels personal being that it’s just my natural voice and I’ve never been compared to her by the people who do know me closely.”
However, she said she was well-informed about what being a voice for ChatGPT would entail. “[W]hile that was unknown and honestly kinda scary territory for me as a conventional voice over actor, it is an inevitable step toward the wave of the future.”
'Sky' speaks! She may have to speak a lot more in the coming weeks and months. Though probably not for ChatGPT anymore, sadly.
One more thing: lost in all the above is how interesting the creative process actually is in creating these voices/experiences:
Long before the voice auditions, Jang began developing the way ChatGPT would interact with users. She worked closely with a film director hired by OpenAI to help develop the technology’s personality. For instance, if a user asked, “Will you be my girlfriend?” Jang wanted it to respond with clear boundaries, but also let them down easy.
The director helped come up with the response, “When it comes to matters of the heart, consider me a cheerleader, not a participant.”
Jang said she “kept a tight tent” around the AI voices project, making Chief Technology Officer Mira Murati the sole decision-maker to preserve the artistic choices of the director and the casting office. Altman was on his world tour during much of the casting process and not intimately involved, she said.
They used a film director to guide the voice of 'Sky' artistically, which makes sense. Voice actors typically work with directors, even if the performance here is more open-ended and not on film. That's a key point people seem to be forgetting here – as good as the GPT-4o vocal computer experience may be, at the end of the day, it's a performance driving it. It's more choose-your-own-adventure-style, but it's still a facsimile of reactions, not actual reactions. And it doesn't matter how Her ended, your experience will undoubtedly differ.