
New software from OpenAI | Stunning AI-generated videos

(San Francisco) In April, New York startup Runway AI introduced software that lets users create videos — such as a cow celebrating its birthday or a dog talking on the phone — simply by typing a sentence into a computer.

Published at 1:04 a.m. Updated at 7:00 a.m.


Cade Metz, The New York Times

The four-second videos were blurry, choppy, distorted and disturbing. But they showed clearly that the day when artificial intelligence (AI) technologies can produce truly convincing videos is not far off.

Barely 10 months later, San Francisco-based OpenAI has unveiled a similar system that creates videos of Hollywood quality. The short clips, generated in just a few minutes, show woolly mammoths running across a snowy meadow, a monster gazing at a melting candle, and a Tokyo street scene that appears to have been filmed by a camera flying over the city.


According to OpenAI, Sora was given the following description to create this video: "An elegant woman walking down a street in Tokyo filled with bright neon lights and animated street signs. She is wearing a black leather jacket, a long red dress, black boots and a black handbag. She is wearing sunglasses and red lipstick. She walks confidently and casually. The road is wet and reflective, creating a mirror effect with the colored lights. Many pedestrians are walking."

OpenAI, the company behind the ChatGPT chatbot and the DALL-E still-image generator, is one of many working to improve this kind of instant video generator, along with the startup Runway and tech giants such as Google and Meta (the parent of Facebook and Instagram). The technology could speed up the work of experienced filmmakers while displacing younger digital artists entirely.

It could also become a quick and inexpensive way to create disinformation online, making it even harder to distinguish fact from fiction on the internet.

“The potential impact of something like this on a close election absolutely frightens me,” said Oren Etzioni, a professor of AI at the University of Washington and founder of True Media, a nonprofit dedicated to tracking online misinformation during elections.

"The idea of limitless creative potential"

OpenAI called its new system Sora ("sky," in Japanese). The team behind the software, led by researchers Tim Brooks and Bill Peebles, chose the name because it "evokes the idea of limitless creative potential."

According to Mr. Brooks and Mr. Peebles, Sora is not yet available to the public because OpenAI is still working to understand its dangers. The company is limiting access to a small group of academics and other outside researchers who are testing it to identify potential malicious uses.

Our goal is to provide a glimpse of the future so that users can see the possibilities of this technology and share their observations with us.

Tim Brooks, researcher at OpenAI

OpenAI already watermarks videos produced by Sora to identify them as generated by artificial intelligence. However, the company acknowledges that these watermarks can be removed and can also be difficult to spot. (The New York Times added "AI-generated" watermarks to the videos in this article.)

Sora is an example of generative AI, which can instantly create text, images and sounds. Like other generative AI systems, Sora learns by analyzing digital data, in this case videos and the captions that describe their content.

OpenAI declines to say how many videos the system was fed or where they came from, beyond saying the material included both publicly available and copyrighted videos. The company says little about the data it uses to train its technologies, both to maintain its edge over competitors and because it has been sued repeatedly over its use of copyrighted content.

(In December, The New York Times sued OpenAI and its partner Microsoft for copyright infringement of news content related to AI systems.)

Not always perfect

Sora generates videos in response to short descriptions, such as "a beautiful coral reef full of colorful fish and sea creatures." The videos can be impressive, but they are not always perfect, and some imagery can be strange and illogical. For example, Sora recently created a video of someone eating a cookie, but the cookie never got any smaller.

In just a few years, DALL-E, Midjourney and other still-image generators have improved to the point where they produce images almost indistinguishable from real photographs, making it harder to spot misinformation online. Many digital artists also complain that the technology is making it more difficult for them to earn a living.

"We all laughed when Midjourney produced its first images in 2022," said Reid Southen, a digital filmmaker from Michigan. "Today, Midjourney is putting people out of work."

This article was published in The New York Times.
