Apple Is Experimenting With A New Technology For Image And Video Generation

Tim Cook has been repeating it for several months: Apple takes generative artificial intelligence seriously. Fine, but what is the company actually doing with it? We are starting to find out with the release of a technical report from the company’s AI researchers.

Images created by Apple researchers using Matryoshka diffusion models.

In this research report, Apple specialists present a new family of models for generating high-resolution images and videos. These models differ from others in that they do not need to be trained with upscaling modules to generate high-resolution content.

The principle of the technology is reflected in the name “matryoshka diffusion models”: at each stage of image creation, the model “nests” the work carried out in the lower resolution into the higher resolution, like Russian dolls fitting into each other. According to Apple researchers, this method of sharing representations across different resolutions results in faster training with high-quality results.
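
To make the nesting idea concrete, here is a minimal sketch in plain Python. It is not Apple's code and does not implement the model itself: it only illustrates, under our own assumptions, how one image can be represented at several nested resolutions and noised at each of them, which is the kind of multi-resolution input a model of this type would work on jointly. The function names (`downsample_2x`, `build_pyramid`, `noisy_pyramid`), the 2x2 averaging and the single `noise_scale` parameter are all hypothetical choices made for the illustration.

```python
import random


def downsample_2x(image):
    """Halve width and height by averaging each 2x2 block of pixels.

    Assumes the image is a list of rows with even width and height.
    """
    h, w = len(image), len(image[0])
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def build_pyramid(image, levels):
    """Return [full-res, half-res, quarter-res, ...], like dolls of decreasing size."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample_2x(pyramid[-1]))
    return pyramid


def add_noise(image, noise_scale, rng):
    """Add Gaussian noise to every pixel (a simplified forward diffusion step)."""
    return [[p + noise_scale * rng.gauss(0.0, 1.0) for p in row] for row in image]


def noisy_pyramid(image, levels, noise_scale, seed=0):
    """Noise every resolution of the pyramid (illustrative assumption only)."""
    rng = random.Random(seed)
    return [add_noise(level, noise_scale, rng)
            for level in build_pyramid(image, levels)]


# Toy 8x8 grayscale gradient standing in for a real image.
image = [[float(x + y) for x in range(8)] for y in range(8)]
pyramid = build_pyramid(image, levels=3)
sizes = [(len(level), len(level[0])) for level in pyramid]  # 8x8, 4x4, 2x2
noisy = noisy_pyramid(image, levels=3, noise_scale=0.5)
```

In a real system, a neural network would be trained to denoise all of these resolutions together, with the low-resolution part of the work feeding the high-resolution part. The sketch stops at preparing the nested inputs.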

Diagram of how matryoshka diffusion models work. Image: Apple.

These models can be used to increase the resolution of a small image or to generate content from a text prompt, opening up many possible uses. In their research report, the researchers give no information about the computing power these operations require, an obviously crucial point for any integration into Apple’s operating systems and applications.

Bloomberg recently reported that Apple wants to deploy AI almost everywhere in its ecosystem (Siri, Xcode, iWork, Apple Music, etc.), but image generation was not mentioned in this initial plan.
