Three subjects listened to a New York Times podcast and monologues from a popular English-language radio show while their brains were scanned. With a decoder they built, American scientists managed to turn the readouts of those brain scans not only into complete sentences, but into texts that reproduced what the subjects had heard with remarkable accuracy. According to their results, published today in the journal Nature Neuroscience, this so-called ‘semantic’ decoder was also able to put into words what they were thinking and, most strikingly, what went through their minds while watching silent films.
Since the beginning of the century, and especially over the last decade, great strides have been made in the design of brain-computer interfaces (BCIs). Most aim to let people who cannot speak, or even move their muscles, communicate. But most of these systems require opening the skull and placing electrodes directly in the brain. Another, less invasive approach relies on functional magnetic resonance imaging (fMRI). Here the person lies inside a scanner that does not record neural activity directly, but rather the changes in blood oxygen levels that this activity causes. That brings resolution problems: the signal is captured from outside the skull, and these blood-oxygen changes unfold over intervals of up to 10 seconds, a window in which many words can be said.
To solve these problems, a group of researchers at the University of Texas (USA) turned to an artificial intelligence system that will sound familiar to many: GPT, the same model family the ChatGPT bot is built on. Developed by the artificial intelligence laboratory OpenAI, this language model uses deep learning to generate text. In this study, the researchers trained their system with fMRI images of the brains of three people as they listened to 16 hours of audio from a New York Times podcast and The Moth Radio Hour program. That way, they could match what the subjects heard with its representation in their brains. The idea is that, when the subjects later hear a different text, the system can anticipate it from the patterns it has already learned.
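To picture what “building a model that predicts brain responses to stories” can look like, here is a minimal sketch in Python. It is not the study’s code: the get_gpt_features helper is a hypothetical stand-in for a GPT-style feature extractor, the fMRI data are random placeholders, and the model is simply a regularized regression from word features to voxel responses.

```python
# Illustrative sketch (not the study's code): fit an "encoding model" that
# predicts a subject's fMRI responses from the semantic features of the
# words they were listening to.
import numpy as np
from sklearn.linear_model import Ridge

def get_gpt_features(words):
    """Hypothetical helper: return one semantic feature vector per word,
    e.g. hidden states from a GPT-style language model (placeholder here)."""
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(words), 768))

# Training data: ~16 hours of stories (here just a toy excerpt) paired with
# the fMRI volumes recorded while the subject heard them (random stand-ins).
heard_words = "i left the house before dawn and walked to the station".split()
X = get_gpt_features(heard_words)             # (n_words, n_features)
n_voxels = 5000                               # voxels in hearing/language areas
Y = np.random.default_rng(1).normal(size=(len(heard_words), n_voxels))

# Regularized linear regression from semantic features to voxel responses.
encoding_model = Ridge(alpha=10.0)
encoding_model.fit(X, Y)

# Once fitted, the model can predict what the brain response to any new
# sequence of words *should* look like -- the key ingredient for decoding.
predicted_response = encoding_model.predict(get_gpt_features(["hello", "again"]))
print(predicted_response.shape)               # (2, 5000)
```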
“This is the original GPT, not like the new one [ChatGPT is based on the latest version of GPT, GPT-4]. We collected a lot of data and then built this model that predicts brain responses to stories,” Alexander Huth, a neuroscientist at the University of Texas, said in a webcast last week. In this process, the decoder proposes word sequences, “and for each of those words that we think might come next, we can measure how good that new sequence sounds and, in the end, see whether it fits the brain activity we observe,” he explains.
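What Huth describes resembles a beam search: a language model proposes possible next words, the encoding model predicts the brain activity each candidate sequence should produce, and only the candidates whose predictions best match the recorded scan are kept. The sketch below illustrates that loop under the same assumptions as the previous snippet (it reuses the encoding_model and get_gpt_features stand-ins and adds a hypothetical propose_next_words helper); it is a simplification, not the published algorithm.

```python
# Illustrative decoding loop (a simplification, not the published algorithm):
# keep the word sequences whose *predicted* brain activity best matches the
# activity actually observed in the scanner.
# Assumes `encoding_model` and `get_gpt_features` from the previous sketch.
import numpy as np

def propose_next_words(sequence, k=5):
    """Hypothetical helper: a GPT-style model would return its k most likely
    continuations of `sequence`; here we just return placeholders."""
    return [f"word{i}" for i in range(k)]

def match_score(words, observed_activity):
    """Correlation between the encoding model's predicted response to `words`
    and the fMRI activity that was actually recorded."""
    predicted = encoding_model.predict(get_gpt_features(words)).mean(axis=0)
    return np.corrcoef(predicted, observed_activity)[0, 1]

def decode(observed_activity, n_steps=10, beam_width=3):
    beams = [[]]                                    # start with an empty guess
    for _ in range(n_steps):
        candidates = [beam + [w] for beam in beams for w in propose_next_words(beam)]
        # Rank every candidate continuation by how well it explains the scan.
        candidates.sort(key=lambda c: match_score(c, observed_activity), reverse=True)
        beams = candidates[:beam_width]             # keep only the best guesses
    return " ".join(beams[0])

observed = np.random.default_rng(2).normal(size=5000)  # one toy fMRI pattern
print(decode(observed))
```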
Not for nothing has the decoder been called semantic. Previous interfaces recorded brain activity in the motor areas that control the mechanical basis of speech, such as movements of the mouth, larynx, or tongue. “What they can decode is how the person is trying to move their mouth to say something. Our system works on a completely different level. Instead of looking at the low-level motor domain, it works at the level of ideas, of semantics, of meaning. That is why it does not capture the exact words someone heard or said, but rather their meaning,” Huth explains. The scans recorded activity across different brain areas, but concentrated on those involved in hearing and language.
Jerry Tang prepares one of the subjects for the experiments. The sheer size of the scanner, its cost, and the need for the subject to remain still and focused all complicate the idea of mind reading. Nolan Zunk/University of Texas at Austin
Once the model was trained, the scientists tested it with half a dozen people, who had to listen to texts different from those used to train the system. The machine decoded the fMRI images into a close approximation of what the stories told. To confirm that the device worked at the semantic level and not the motor level, they repeated the experiments, this time asking the participants to make up a story themselves and then write it down. They found a close match between what the machine decoded and what the humans wrote. In a third, even harder round, the subjects had to watch scenes from silent films. Although here the semantic decoder tended to miss the specific words, it still captured the meaning of the scenes.
Neuroscientist Christian Herff leads research on brain-computer interfaces at Maastricht University (Netherlands) and, almost a decade ago, created a BCI that could convert brain waves into text letter by letter. Herff, who was not involved in this new device, highlights the integration of the GPT language predictor. “This is really cool, because the inputs to GPT contain the semantics of language, not the articulatory or acoustic properties as in previous BCIs,” he says. He adds: “They show that the model trained on heard speech can also decode the semantics of silent films and of imagined language.”
“Their results cannot be applied today; you need an MRI machine that takes up a hospital room. But what they have achieved, no one had achieved before.”
Arnau Espinosa, neurotechnologist at the Wyss Center Foundation in Switzerland
Arnau Espinosa, a neurotechnologist at the Wyss Center Foundation (Switzerland), published a paper last year on a BCI with a completely different approach that allowed a patient with ALS to communicate. Of the current work, he notes that “their results cannot be applied to a patient today; you need an MRI machine worth millions that takes up a hospital room, but what they have achieved, no one had achieved before.” The interface Espinosa worked on was different. “We wanted a signal with lower spatial resolution but high temporal resolution. We could know which neurons were firing every microsecond, and from there we could get to phonemes and to how a word is formed,” he adds. For Espinosa, it will eventually be necessary to combine several systems and draw on different signals. “Theoretically it would be possible; much more complicated, but possible.”
Rafael Yuste, a Spanish neurobiologist at Columbia University in New York (USA), has been warning of the dangers of advances in his own discipline. “This research and the Facebook study demonstrate the possibility of decoding speech using non-invasive neurotechnology. It’s not science fiction anymore,” he said in an email. “These methods will have enormous scientific, clinical and commercial applications, but at the same time they herald the possibility of deciphering mental processes, since inner language is often used to think. This is another argument for the urgent need to protect intellectual privacy as a basic human right,” he adds.
With these concerns in mind, the experiment’s authors wanted to see whether their system could be used to read the minds of other subjects. Fortunately, they found that a model trained on one person could not decipher what another person heard or saw. To be sure, they ran a final series of tests in which they asked participants to count to seven, think of and name animals, or make up a story in their heads while listening to the stories. Here the GPT-based interface, with all the technology packed into an MRI machine and all the data processed by the AI, failed time and again. For the authors, this shows that mind reading requires the cooperation of the mind’s owner. But they also warn that their research rested on a sample of half a dozen people; with data from tens or hundreds, they concede, the danger could be real.