Technology is getting closer than ever to decoding our thoughts. Neuroscientists at the University of Texas have for the first time decoded data from non-invasive brain scans and used them to reconstruct language and meaning from stories people hear, see, or even imagine.
In a new study published in Nature Neuroscience, Alexander Huth and colleagues successfully recovered the gist of language, and sometimes exact phrases, from functional magnetic resonance imaging (fMRI) recordings of three participants' brains.
Technology that can create language from brain signals could be extremely useful for people who cannot speak due to conditions such as motor neuron disease. At the same time, it raises concerns about the future privacy of our thoughts.
Decoding language
Language decoders, also called “speech decoders,” aim to use recordings of a person’s brain activity to detect the words they hear, imagine, or say.
Until now, speech decoders have only been used with data from devices surgically implanted in the brain, which limits their usefulness. Other decoders that used non-invasive recordings of brain activity were able to decode single words or short phrases, but not continuous language.
The new research used the blood-oxygen-level-dependent (BOLD) signal from fMRI scans, which reflects changes in blood flow and oxygenation across different parts of the brain. By focusing on patterns of activity in the brain regions and networks that process language, the researchers found that their decoder could be trained to reconstruct continuous language (including some specific words and the general meaning of sentences).
Specifically, the decoder took three participants’ brain responses as they listened to stories, and generated strings of words that were likely to have produced those brain responses. These word sequences captured the general meaning of the stories well, and in some cases included exact words and phrases.
The researchers also had the participants watch silent movies and imagine stories as they were scanned. Either way, the decoder was often able to predict the gist of the stories.
For example, a participant thought “I don’t have my driver’s license yet”, and the decoder predicted “she hasn’t even started learning to drive yet”.
Furthermore, when participants actively listened to one story while simultaneously ignoring another, the decoder could identify the meaning of the story they were attending to.
How does it work?
The researchers started by having each participant lie inside an fMRI scanner and listen to 16 hours of narrated stories while the scanner recorded their brain responses.
These brain responses were then used to train an encoder, a computational model that attempts to predict how the brain will respond to the words a person hears. After training, the encoder could accurately predict how each participant’s brain would respond to hearing a particular string of words.
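As a rough illustration of that encoder step, one can picture a regression from word-derived semantic features to voxel responses. The sketch below uses simulated data and plain ridge regression; the dimensions, variable names, and the regression itself are illustrative stand-ins, not the study’s actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: 500 fMRI time points, 10 semantic features, 50 voxels.
n_time, n_feat, n_vox = 500, 10, 50
X = rng.standard_normal((n_time, n_feat))      # semantic features of heard words
true_W = rng.standard_normal((n_feat, n_vox))  # unknown feature-to-voxel mapping
Y = X @ true_W + 0.1 * rng.standard_normal((n_time, n_vox))  # recorded responses

# Fit a ridge-regression encoder: predict voxel responses from word features.
lam = 1.0
W = np.linalg.solve(X.T @ X + lam * np.eye(n_feat), X.T @ Y)

# The trained encoder can now predict brain responses to new word sequences.
X_new = rng.standard_normal((5, n_feat))
predicted_bold = X_new @ W
```

Once such a forward model exists, predicting responses to unseen word sequences is cheap; the hard part, as the article notes, is inverting it.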
However, going in the opposite direction – from the brain’s recorded responses to words – is much more difficult.
The encoding model is designed to relate brain responses to “semantic features,” or the broad meanings of words and sentences. To generate candidate word sequences, the system uses the original GPT language model, an early precursor of the current GPT-4 model. The decoder then generates sequences of words that could have produced the observed brain responses.
Each ‘guess’ is then validated by using it to predict previously recorded brain activity, with the prediction then compared to the actual recorded activity.
During this resource-intensive process, many guesses are generated simultaneously, and they are ranked in order of accuracy. Bad guesses are discarded and good guesses are kept. The process continues by guessing the next word in the sequence, and so on, until the most accurate sequence is determined.
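The guess-score-prune loop described above resembles a beam search. The toy sketch below fakes both the language model and the encoder with fixed random word vectors; every name and number is hypothetical, and a real decoder would score continuations with GPT and a fitted fMRI encoding model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy vocabulary; each word gets a fixed random "brain response" vector.
VOCAB = ["the", "driver", "license", "she", "started", "learning"]
WORD_VECS = {w: rng.standard_normal(8) for w in VOCAB}

def propose_next_words(prefix):
    """Stand-in for a language model proposing continuations of a prefix."""
    return VOCAB  # a real system would rank continuations by LM probability

def predict_bold(words):
    """Stand-in encoder: predicted brain response for a word sequence."""
    # Simplification: responses just add up, so word order is ignored here.
    return sum((WORD_VECS[w] for w in words), np.zeros(8))

def score(words, recorded):
    """Higher is better: predicted response closer to the recorded one."""
    return -np.linalg.norm(predict_bold(words) - recorded)

def beam_search(recorded, length=4, beam_width=3):
    beams = [[]]
    for _ in range(length):
        # Extend every surviving guess by one word, then keep the best few.
        candidates = [b + [w] for b in beams for w in propose_next_words(b)]
        candidates.sort(key=lambda b: score(b, recorded), reverse=True)
        beams = candidates[:beam_width]
    return beams[0]

# Simulate a recording produced by a "true" sequence, then try to decode it.
truth = ["she", "started", "learning", "driver"]
recorded = predict_bold(truth)
decoded = beam_search(recorded)
```

Keeping several guesses alive at each step, rather than committing to a single best word, is what lets the search recover from early mistakes, which is why the real process is so computationally expensive.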
Words and meanings
The study found that data from several specific brain regions, including the speech network, the parietal-temporal-occipital association region, and the prefrontal cortex, was required to obtain the most accurate predictions.
One of the main differences between this work and previous efforts is the kind of data being decoded. Most decoding systems relate brain data to motor features, that is, activity recorded from brain regions involved in the final step of speech production: the movement of the mouth and tongue. This decoder works instead at the level of ideas and meanings.
One limitation of using fMRI data is its low ‘temporal resolution’. The blood-oxygen-level-dependent signal rises and falls over a period of about 10 seconds, during which a person may have heard 20 or more words. As a result, this technique cannot detect individual words, only the probable meanings of word sequences.
Don’t panic about privacy (yet)
The idea of technology capable of “reading minds” raises concerns about mental privacy. The researchers conducted additional experiments to address some of these concerns.
These experiments showed that we do not yet need to worry about our thoughts being decoded while we walk down the street, or indeed being decoded at all without our extensive cooperation.
A decoder trained on one person’s thoughts performed poorly at predicting semantic details from another participant’s data. Furthermore, participants could disrupt decoding by shifting their attention to a different task such as naming animals or telling a different story.
Movement in the scanner can also disrupt the decoder because fMRI is very sensitive to motion, so participant cooperation is essential. Given these requirements, and the need for high-powered computational resources, it is highly unlikely that someone’s thoughts could be decoded against their will at this point.
Finally, the decoder does not currently operate on data other than fMRI, which is an expensive and often impractical procedure. The group plans to test their approach on other, non-invasive brain data in the future.
This article is republished from The Conversation under a Creative Commons license. Read the original article.