Sora: OpenAI’s spectacular new video tool has taken the world by storm

OpenAI's Sora can take simple text prompts and generate original video, such as woolly mammoths walking through snow. Image Credit: AFP

Every new OpenAI announcement sparks some measure of awe and terror. Late on Thursday, the maker of ChatGPT dropped its latest gizmo, a text-to-video model called Sora that can create up to a minute of high-quality video. Cue a flood of remarkable AI clips going viral on social media, while stock video producers, filmmakers, actors and some startup founders likely fretted about their livelihoods.

AI video-generation has been around for more than a year, but Sora’s examples, including a high-definition clip of puppies shaking their fur in the snow and a sleepy woman being woken up by her cat, look more realistic than previous efforts. The glitches are harder to spot and the humans look more human.

As usual, OpenAI won’t talk about the all-important ingredients that went into this new tool, even as it releases it to an array of people to test before going public. Its approach should be the other way around. OpenAI needs to be more public about the data used to train Sora, and more secretive about the tool itself, given its capacity to disrupt industries and potentially elections.

Publicly available content

OpenAI Chief Executive Officer Sam Altman said that red-teaming of Sora started the day the tool was announced and shared with beta testers. Red-teaming is when specialists test an AI model’s security by pretending to be bad actors who want to hack or misuse it. The goal is to make sure the same can’t happen in the real world.

The company spent about six months testing GPT-4, its most recent language model, before releasing it last year. If it takes the same amount of time to check Sora, that means it could become available to the public in August, a good three months before the US election. OpenAI should seriously consider waiting to release it until after voters go to the polls.

The impact of deepfakes on elections has already been well documented, and cloned voices of politicians can be the most insidious and hard-to-track uses of this new crop of generative AI tools. But faked videos can sow confusion and chaos too. Imagine an adversary using this tool to show unimaginably long lines in bad weather to convince people it’s not worth heading out to vote that day.

OpenAI uses safety filters to keep its models from responding to any prompts for extreme violence, sexual content or hateful imagery, but it’s impossible to know how these systems will be misused and used for propaganda until they’re out in the wild. And Sora’s launch would make a bigger impact than other video-generation tools thanks to ChatGPT’s ubiquity, with more than 100 million people using it each week.

Just as OpenAI folded its image-generation tool DALL-E 3 into ChatGPT, it will likely do the same with Sora, instantly putting video generation into the hands of millions.

OpenAI is meanwhile quiet about the source of the information it used to create Sora. When asked about what datasets were used to train the model, a spokeswoman said the training data came “from content we’ve licensed, and publicly available content.”

When Meta Platforms Inc. released a text-to-video model in 2022, it used 10.7 million Shutterstock videos and 3.3 million YouTube videos to train it. Revealing this information is critical for helping outside researchers check for bias and for letting creators know whether their work is being exploited.

Some gaming and AI experts have suggested Sora was trained on the underlying physics engines of computer games — but no one knows for sure because OpenAI won’t disclose that information, just as it didn’t for its other powerful AI models.

Surpassing our own capabilities

The race that Altman set off when he released ChatGPT has fueled OpenAI’s secrecy as it seeks to stay ahead of leading AI competitors like Google and Meta. Altman’s formula of increasing the amount of computing power used to train OpenAI’s models is also working.

The company’s technical paper for Sora demonstrated that as it increased “training compute,” its AI videos became more and more realistic.

Little wonder that Altman has been gunning to make more AI chips and seeking funding in the trillions of dollars. His stated goal is nothing less than the creation of AI that surpasses our own capabilities, and he believes that to get there, the public needs to get to grips with ever more transformative tech by trying it out themselves. That’s the “open” part of OpenAI, but on that, he could be a little more closed.

The worrying part is that no one, not even Altman, knows what kind of impact such tools will have when they are fully unleashed. — Bloomberg

Parmy Olson is a columnist covering technology. She is also the author of “We Are Anonymous.”