Sora, The New OpenAI Tool That Is Scaring The Whole World

Although nothing could be further from Vladimir Lenin than the heyday of artificial intelligence (AI), one of his sentences sums up what happened a few days ago with the announcement of Sora, the AI tool from OpenAI (the developer of ChatGPT and DALL-E), which can create videos with such a level of detail that someone outside the audiovisual world might not notice they are artificial. “There are decades in which nothing happens, and there are weeks in which decades pass,” said the Soviet leader, who died in 1924 at the age of 53.

The only relationship Lenin could have with AI is an attempt to resurrect the Bolshevik leader in the present through an image or video of him generated from a human command. Arbitrary uses of this kind already exist, for example when the face of the reggaeton singer Bad Bunny was superimposed on the body of the Spanish singer Rosalía, in a video in which she can be heard speaking like him.

To be honest, Lenin's phrase has become commonplace in the world of technology. And rightly so. From time to time, companies in the industry announce the development or improvement of products and services that, ideally, should provide assistance, relieve people of mechanical stress in their daily tasks, improve their health and extend their life expectancy so that they have more time with their loved ones. And to a certain extent, they deliver. What they should not do is cause fear and anxiety.

Although it is not yet available to the public, the progress that OpenAI has made with Sora is surprising and at the same time raises a certain level of concern. Using the same kind of AI technology, hundreds of people have been deceived, manipulated and defrauded by having their voices cloned. The privacy of social media users has also been violated, and children, teenagers and women have been sexualized when a seemingly harmless photo they posted was turned into a video or image with sexual connotations. Not to mention the copyright issues AI has caused, or the frightening answers ChatGPT has given when asked how to kill the most people with just a dollar, and to similar questions it has received.

What is known about Sora?

Sora is OpenAI's new generative AI model, which can create realistic video scenes of up to 60 seconds from text instructions, with detailed output, complex camera movements, and multiple characters with emotions.

According to the company, Sora can “generate complex scenes with multiple characters, specific movements and precise details, with visual quality and taking into account user requirements.” It can also create a video from a still image, animating the content precisely and without loss of detail, and extend existing videos or fill in missing frames.

Essentially, OpenAI has built on its text AI (ChatGPT) and image AI (DALL-E) to train a model that understands and simulates the physical world in motion. And from what we've seen, it has succeeded.


In some videos shared by the company to showcase its latest invention, you can see two dogs climbing through the snow on a mountain, a Land Rover Defender driving down a steep, wooded road, an elderly woman cooking something that looks like a cake, and even animated or implausible scenes, such as sea creatures riding bicycles in the middle of the sea.
To create these videos, the users who have interacted with the tool so far have simply given Sora a series of instructions detailing what the scene must contain, such as the characters and the actions they will perform, the environment, the weather and the camera movements to be recreated.
When the new video AI was announced, Sam Altman, CEO of OpenAI, invited people to submit video ideas via his account on X (formerly Twitter). On his profile, you can see how he responds to his followers' suggestions with short videos: two golden retrievers podcasting on a mountain, or a drone race on Mars with the sunset in the background.

For example, these are the instructions Sora received to produce one of the videos OpenAI uses to promote the tool: “Close-up of the blinking eye of a 24-year-old woman standing at sunset in Marrakech, shot on 70 mm film, depth of field, bright colors, cinematic.” The more precise the instructions, the more realistic the result.

The company explained that the model can produce such precise scenes because it understands not only what the user is asking for in a text prompt, but also how those things exist in the physical world, including emotions.

As for how it works: Sora generates a video by starting from one that, according to OpenAI, looks like “static noise.” The model then transforms it gradually, removing the noise over many steps until realistic images appear.
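As a rough illustration of that step-by-step noise removal, here is a toy Python sketch. It is not OpenAI's code: the function names are hypothetical, a short list of numbers stands in for a video frame, and an "oracle" that already knows the clean frame stands in for the trained neural network a real model would use to estimate the noise.

```python
import random


def oracle_noise_predictor(frame, clean_frame):
    """Hypothetical stand-in for a trained network: returns the exact noise.

    A real model has no access to the clean frame and must estimate this.
    """
    return [p - c for p, c in zip(frame, clean_frame)]


def denoise_step(frame, predicted_noise, steps_left):
    """Remove a 1/steps_left share of the predicted noise from every pixel."""
    return [p - n / steps_left for p, n in zip(frame, predicted_noise)]


def generate(clean_frame, num_steps=50, seed=0):
    """Start from pure noise and remove it gradually over num_steps steps."""
    rng = random.Random(seed)
    # The starting point is random "static", unrelated to the target.
    frame = [rng.gauss(0.0, 1.0) for _ in clean_frame]
    history = []  # largest remaining per-pixel error after each step
    for steps_left in range(num_steps, 0, -1):
        noise = oracle_noise_predictor(frame, clean_frame)
        frame = denoise_step(frame, noise, steps_left)
        history.append(max(abs(p - c) for p, c in zip(frame, clean_frame)))
    return frame, history


if __name__ == "__main__":
    final_frame, errors = generate([0.2, 0.8, 0.5, 0.1], num_steps=10)
    for step, err in enumerate(errors, start=1):
        print(f"step {step:2d}: remaining error {err:.4f}")
```

Each pass removes only a fraction of the remaining noise, so the picture sharpens little by little rather than all at once, which is the behavior the company describes.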

Additionally, like the GPT models, it uses a “transformer architecture,” which the company says enables superior scaling performance. Specifically, videos and images are represented as “collections of smaller data units” called patches. Each patch thus plays a role analogous to that of a token in GPT.
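To make the idea of patches more concrete, the following Python sketch cuts a tiny video into small blocks and flattens each block into a list, much as text is cut into tokens. It is only an illustration under stated assumptions: the function name `to_patches`, the patch sizes and the use of raw pixel values are hypothetical choices, not details given in the article.

```python
def to_patches(video, pt=1, ph=2, pw=2):
    """Split a video into flattened patches.

    video: list of frames; each frame is a list of rows of pixel values.
    pt, ph, pw: patch size in frames, rows and columns (assumed values).
    """
    num_frames, height, width = len(video), len(video[0]), len(video[0][0])
    if num_frames % pt or height % ph or width % pw:
        raise ValueError("video dimensions must be divisible by the patch size")
    patches = []
    for t0 in range(0, num_frames, pt):
        for y0 in range(0, height, ph):
            for x0 in range(0, width, pw):
                # Collect every pixel inside this block, in a fixed order.
                patch = [
                    video[t][y][x]
                    for t in range(t0, t0 + pt)
                    for y in range(y0, y0 + ph)
                    for x in range(x0, x0 + pw)
                ]
                patches.append(patch)
    return patches


if __name__ == "__main__":
    # A 2-frame, 4x4 "video" whose pixel values encode their own position.
    video = [[[t * 16 + y * 4 + x for x in range(4)] for y in range(4)]
             for t in range(2)]
    patches = to_patches(video)
    print(len(patches), "patches of", len(patches[0]), "values each")
```

The resulting sequence of patches is what a transformer would then process, in the same way a GPT model processes a sequence of tokens.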

Flaws

As mentioned, Sora is not yet available to the general public, not even to those willing to pay for it. For now it can be used only by members of OpenAI's red team, the group dedicated to probing the service in order to test it and identify its flaws and potential risks.

In addition, a group of visual artists, designers and filmmakers is testing it to suggest improvements and make it as useful as possible for creatives, the company explains.

The current model also has flaws, such as difficulty depicting some spaces, errors in camera shots, confusion between left and right, and an inability to maintain cause and effect consistently throughout the video. “For example, a person may eat a cookie, but then the bite cannot be seen in the cookie,” OpenAI explained.

Then there is the issue of security. The company explained that this issue is crucial and sensitive, and that it is focused on ensuring that the model it releases to the public does not end up serving fraudsters or criminals. In this testing phase, which involves technicians and experts in security, misinformation, and discriminatory and hateful content, testers will be asked to try to cause errors or create inappropriate content in order to better define the limits of the platform.

“We will engage policymakers, educators and artists around the world to understand their concerns and identify positive use cases for this new technology,” OpenAI promised.

The company is also developing tools to detect misleading content. These are features that can identify videos generated by Sora and distinguish them from other AI-generated videos or from real footage. One of them is the inclusion of C2PA metadata, a standard for verifying the provenance of content and related information.
In addition, the safety methods that OpenAI already applies in its products based on DALL-E 3 are being reused, since the company says they are also applicable to Sora.

These methods review and reject text prompts that violate the usage guidelines, such as requests involving extreme violence, sexual content, hateful imagery, or images of specific people. They also include image classifiers that check the frames of every generated video to ensure it complies with company guidelines before it is shown to the user.

Despite these efforts, which the evidence shows are necessary, the episode that OpenAI went through a year after the launch of ChatGPT raises doubts. In November 2023, OpenAI's board fired Sam Altman, the company's co-founder, for alleged “untruthful communication,” “interference with the normal exercise of its responsibilities,” and the dizzying pace set by Altman and his team. Viewed from the outside, these arguments suggest that there was an atmosphere of distrust towards the person primarily responsible for the advances of a technology that is revolutionizing and challenging the world. Artificial general intelligence, the company's stated goal, refers to highly autonomous systems that would outperform humans at most economically valuable work.

Altman returned to OpenAI a few days later, thanks to pressure from hundreds of employees who threatened to leave as well. As conditions for his return, Altman demanded to be reinstated as CEO and to be given the power to appoint a new board of directors that had his full confidence.


The board that fired Altman was not part of OpenAI's commercial operation and served ad honorem, as a body independent of the company's commercial interests and tasked with “ensuring that artificial general intelligence benefits all of humanity.” It is understood that the board decided to remove Altman from his duties on this premise.

OpenAI is a for-profit company that is part of a non-profit foundation, a structure that makes altruistic and responsible technological progress more complex, since OpenAI, like any company, strives to make profits and compete in the market with the aim of outperforming its rivals.

So far, Sora is the tool that shows the most progress and realism in generating videos with generative artificial intelligence. However, Meta, Google and Runway AI, which are working on similar applications known as text-to-video (which convert a written idea into video), are also in this race and have presented examples of their progress.

SUNDAY EDITION
TIME

With information from Europa Press and AFP
