Researchers at the University of Texas at Austin have developed a semantic decoder that converts brain activity into text. This AI system, which is non-invasive and does not require surgical implants, could provide a new means of communication for individuals who are unable to physically speak. The decoder is trained by having the participant listen to hours of podcasts while in an fMRI scanner, and it can then generate text from brain activity alone.
A new AI system called a semantic decoder can translate a person’s brain activity — recorded while the person listens to a story or silently imagines telling one — into a continuous stream of text. The system, developed by researchers at the University of Texas at Austin, may help people who are mentally aware but physically unable to speak, such as those who have suffered strokes, to communicate intelligibly again.
The study, published today (May 1) in the journal Nature Neuroscience, was led by Jerry Tang, a doctoral student in computer science, and Alex Huth, an associate professor of neuroscience and computer science at the University of Texas at Austin. The work relies in part on a transformer model, similar to the ones that power ChatGPT from OpenAI and Bard from Google.
Unlike other language decoding systems under development, this system does not require people to have surgical implants, making the process non-invasive. Participants also are not restricted to using only words from a prescribed list. Brain activity is measured with an fMRI scanner after extensive training of the decoder, during which the individual listens to hours of podcasts in the scanner. Later, provided the participant is willing to have their thoughts decoded, listening to a new story or imagining telling a story allows the machine to generate corresponding text from brain activity alone.
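The decoding scheme described above can be sketched very roughly in code. This is an illustrative toy only, not the authors' implementation: `propose_continuations` stands in for the generative language model, and `predict_brain_activity` stands in for the per-participant encoding model trained on hours of scanner data; both names and their logic are hypothetical.

```python
def propose_continuations(text):
    # Hypothetical stand-in for a generative language model
    # proposing plausible next words for the transcript so far.
    return [text + w for w in (" the", " a", " story")]

def predict_brain_activity(text):
    # Hypothetical stand-in for a per-participant encoding model
    # that predicts the fMRI response a word sequence would evoke;
    # here just a toy three-number feature vector.
    return [len(text) % 7, text.count("story"), len(text.split())]

def similarity(predicted, measured):
    # Negative squared distance: higher means a closer match.
    return -sum((p - m) ** 2 for p, m in zip(predicted, measured))

def decode_step(current_text, measured_activity):
    # Keep the candidate whose predicted brain activity best
    # matches what the scanner actually measured.
    candidates = propose_continuations(current_text)
    return max(candidates,
               key=lambda c: similarity(predict_brain_activity(c),
                                        measured_activity))
```

In this toy, feeding `decode_step` the "measured" activity of one candidate sentence makes the decoder pick that candidate; the real system repeats such a step over long stretches of fMRI data.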
“For a non-invasive method, this is a real leap forward compared to what’s been done before, which is typically single words or short sentences,” Huth said. “We’re getting the model to decode continuous language for extended periods of time with complicated ideas.”
The result is not a word-for-word transcript. Instead, the researchers designed it to capture the gist of what is being said or thought, albeit imperfectly. About half the time, when the decoder has been trained to monitor a participant’s brain activity, the machine produces text that closely (and sometimes precisely) matches the intended meanings of the original words.
For example, in the experiments, a participant listening to a speaker say, “I don’t have my driver’s license yet” had their thoughts translated as, “She has not even started to learn to drive yet.” Listening to the words, “I didn’t know whether to scream, cry, or run away. Instead, I said, ‘Leave me alone!’” was decoded as, “Started to scream and cry, and then she just said, ‘I told you to leave me alone.’”
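The by-meaning (rather than word-for-word) matching seen in the examples above can be crudely illustrated with a word-overlap score. The study itself used learned semantic similarity measures, so this Jaccard-overlap function is only a rough stand-in:

```python
def gist_similarity(reference, decoded):
    """Jaccard overlap of lowercased words: 1.0 for identical word
    sets, 0.0 for no shared words. A crude stand-in for the learned
    semantic similarity measures used in the actual study."""
    ref = set(reference.lower().split())
    dec = set(decoded.lower().split())
    return len(ref & dec) / len(ref | dec)

# An exact match scores 1.0; a close paraphrase scores lower but
# still shares much of its meaning-bearing wording.
exact = gist_similarity("she started screaming and crying",
                        "she started screaming and crying")
close = gist_similarity("she started screaming and crying",
                        "she was screaming and crying loudly")
```

A real evaluation would compare embeddings rather than raw words, so that paraphrases with no shared vocabulary still score highly; simple overlap is used here only to make the idea of scoring meaning rather than transcription concrete.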
Beginning with an earlier version of the paper that appeared online as a preprint, the researchers addressed questions about potential misuse of the technology. The paper describes how decoding worked only with cooperative participants who had willingly taken part in training the decoder. Results for individuals on whom the decoder had not been trained were unintelligible, and if participants on whom the decoder had been trained later put up resistance (for example, by thinking other thoughts), the results were similarly unusable.
“We take very seriously concerns that it could be used for bad purposes, and have worked to avoid that,” Tang said. “We want to make sure people only use these types of technologies when they want to, and that it helps them.”
In addition to having participants listen to or think about stories, the researchers asked them to watch four short, silent video clips while in the scanner. The semantic decoder was able to use their brain activity to accurately describe specific events from the videos.
The system is currently impractical for use outside the laboratory because of its reliance on an fMRI machine. But the researchers believe this work could transfer to other, more portable brain-imaging systems, such as functional near-infrared spectroscopy (fNIRS).
“fNIRS measures where there’s more or less blood flow in the brain at different points in time, which turns out to be the same kind of signal that fMRI measures,” Huth said. “So our exact kind of approach should translate to fNIRS,” though he noted that the resolution with fNIRS would be lower.
This work was supported by the Whitehall Foundation, the Alfred P. Sloan Foundation, and the Burroughs Wellcome Fund.
The study’s other co-authors are Amanda LeBel, a former research assistant in the Huth lab, and Shailee Jain, a graduate student in computer science at the University of Texas at Austin.
Alexander Huth and Jerry Tang have filed a PCT patent application for this work.
Frequently Asked Questions
Could this technology be used on someone without their knowledge, for example by an authoritarian regime interrogating political prisoners or an employer spying on employees?
No. The system must be extensively trained on a willing subject in a facility with large, expensive equipment. “A person needs to spend up to 15 hours lying in an MRI scanner, being perfectly still, and paying close attention to the stories they’re listening to before this really works well on them,” Huth said.
Is it possible to skip training altogether?
No. The researchers tested the system on subjects it had not been trained on and found that the results were unintelligible.
Are there ways a person can defend against decoding their thoughts?
Yes. The researchers tested whether a person who had previously participated in training could resist subsequent attempts at decoding. Tactics such as thinking of animals or quietly imagining telling their own story let participants easily and completely prevent the system from recovering the speech they were hearing.
What if technology and related research evolved to one day overcome these obstacles or defences?
“I think right now, while the technology is in such an early state, it’s important to be proactive by enacting policies that protect people and their privacy,” Tang said. “Regulating what these devices can be used for is also very important.”
Reference: “Semantic Reconstruction of Continuous Language from Non-Invasive Brain Recordings,” May 1, 2023, Nature Neuroscience. Available here.