📰 This New AI From NVIDIA Generates Videos From Text - Examples

At an IEEE conference, NVIDIA presented its new technology: a video generator built on an open-source model from Stability AI that was trained to generate images from text. NVIDIA's researchers added an extra step that animates the generated image. The model learned how scenes change over time by analyzing thousands of videos from the internet.

The AI estimates what is likely to change in each area of an image. From there, it creates keyframes throughout the sequence, and a second frame generator fills in the “connections” between those keyframes. The in-between frames match the keyframes in quality and are inserted into the sequence, creating a dynamic result: a video.
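To make the keyframe idea concrete, here is a minimal Python sketch of the two-stage structure described above: start from a handful of keyframes, then fill the gaps between them. Everything in it is an assumption for illustration only. The names `interpolate_frames` and `assemble_video` are hypothetical, the frames are toy lists of numbers, and plain linear blending stands in for NVIDIA's learned frame generator, which the article does not describe in detail.

```python
from typing import List

Frame = List[float]  # a frame flattened to a list of pixel intensities


def interpolate_frames(start: Frame, end: Frame, n_between: int) -> List[Frame]:
    """Return n_between frames blended linearly between start and end.

    Assumption: linear blending is only a placeholder for a learned
    interpolation model; it is not NVIDIA's method.
    """
    frames = []
    for i in range(1, n_between + 1):
        t = i / (n_between + 1)  # blend weight, strictly between 0 and 1
        frames.append([(1 - t) * a + t * b for a, b in zip(start, end)])
    return frames


def assemble_video(keyframes: List[Frame], n_between: int) -> List[Frame]:
    """Insert n_between in-between frames after every keyframe except the last."""
    if not keyframes:
        return []
    video = []
    for current, following in zip(keyframes, keyframes[1:]):
        video.append(current)
        video.extend(interpolate_frames(current, following, n_between))
    video.append(keyframes[-1])
    return video


# Three toy "keyframes" of two pixels each (hypothetical data).
keyframes = [[0.0, 0.0], [1.0, 2.0], [0.0, 4.0]]
video = assemble_video(keyframes, n_between=3)
print(len(video))  # 3 keyframes + 2 gaps * 3 in-between frames = 9
```

The point of the split is that the expensive step only has to decide what happens at a few moments in the sequence, while a separate step is responsible for smooth motion between those moments.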

NVIDIA tested the system on low-quality dashcam footage, producing consistent, minute-long driving videos at 512 x 1024 pixels, a big step forward in content generation. The technology is still in its infancy: the generated footage is stunning and coherent, but it still lacks realism.

The system is designed to accept images as well as text queries. This means that we might be able to upload our own images, or even images generated by another AI, and turn them into videos. The NVIDIA team also used the technology to generate a variety of sample videos at 1280 x 2048 resolution from text queries alone.

In the near future, it may be possible to combine these AIs to create content in record time, much like the rise of art-generating AIs we are witnessing now.

Currently, NVIDIA views this system more as a research project than a consumer product.