OpenAI announced Monday that it will add voice and images to its artificial intelligence (AI) program ChatGPT to make it more “intuitive.”
The interface that popularized generative AI (capable of producing text, images and other content upon a simple request in everyday language) will soon be able to process requests that include images and to converse aloud with its users.
For example, you can take a photo of a monument and “have a conversation with ChatGPT about the building’s history,” or even show the software what’s in your fridge so it can suggest a recipe, OpenAI suggests in a press release.
These new tools will be rolled out over the next two weeks to subscribers of ChatGPT Plus, the paid version of the chatbot, and to organizations that are customers of the service.
The company announced the impending addition of such features last March when it unveiled GPT-4, the latest version of its language model, the technology underlying ChatGPT.
GPT-4 is multimodal in the sense that it can process data other than text or computer code.
ChatGPT’s success since late 2022 has led to a major generative AI race between tech giants, with Google and Microsoft at the forefront.
But the rapid introduction of these programs, which are still very poorly regulated, is also a cause for great concern, especially since they tend to “hallucinate,” that is, to invent answers from scratch.
“Models with vision present new challenges, from hallucinations to requiring people to rely on the program’s interpretation of images in important areas,” OpenAI acknowledged in its statement on Monday.
The startup claims to have “tested” the model on topics such as extremism and scientific knowledge and is relying on real-world applications and user feedback to improve it.
It has also limited ChatGPT’s ability to “analyze people” because the interface “is not always accurate and these systems must respect the privacy of individuals.”
The streaming platform Spotify also announced a partnership with OpenAI on Monday to translate podcasts directly with AI.
Programs recorded in English will now be available in other languages, “while retaining the speaker’s distinctive vocal characteristics,” the service said in a statement.
The Swedish company says that OpenAI’s new speech generation technology “reproduces the style of the original speaker, enabling a more authentic, personal and natural listening experience than traditional dubbing.”