Language modeling uses statistical and probabilistic techniques to predict the sequence of words that will appear in a sentence. These models are widely used in natural language processing applications that generate text as output. A notable example is the next-word prediction model, which is trained to predict the next word in a string of text from the words that precede it. This technology helps search engines and texting apps suggest the next word before the user types it. Its use is not limited to prediction: it has also proved helpful in answering questions, summarizing documents, and completing stories.
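To make the idea of next-word prediction concrete, here is a minimal sketch of a bigram model, which estimates the next word from the single preceding word by counting word pairs. It is a deliberately simple stand-in, not the kind of model used in the study; the function names `train_bigram` and `predict_next` and the toy corpus are assumptions made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count how often each word follows each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return (most frequent follower, estimated probability), or None if the word was never seen with a follower."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    best, n = followers.most_common(1)[0]
    return best, n / sum(followers.values())

# Toy corpus, invented for this example.
corpus = "the cat sat on the mat the cat ate the fish"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" in 2 of 4 cases
```

Real text-prediction systems condition on far more than one preceding word, but the principle of estimating which word is most likely to come next is the same.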
Although these models were designed only to predict the next word in a text, a new study by MIT neuroscientists shows that their functioning resembles that of the language processing centers in the human brain. Computer models that perform other kinds of language tasks do not show this similarity. This provides evidence that the human brain may use next-word prediction to drive language processing.
These recently developed models belong to a class called deep neural networks, a category of machine learning inspired by the organization and activity of the human brain. The networks contain computational nodes that form connections of varying strength, and layers that pass information to one another in prescribed ways.
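The structure described above (nodes, weighted connections, and layers that pass information forward) can be sketched in a few lines. This is an illustrative toy, not any model from the study: the weights are hand-picked hypothetical values, and the `tanh` activation is an assumption chosen for simplicity.

```python
import math

def layer(inputs, weights, biases):
    """One layer: each node sums its weighted inputs, adds a bias, then applies tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def forward(inputs, network):
    """Pass information through the layers in order, keeping every layer's activity."""
    activations = [inputs]
    for weights, biases in network:
        activations.append(layer(activations[-1], weights, biases))
    return activations

# Hypothetical hand-picked weights: 2 inputs -> 3 hidden nodes -> 1 output node.
# Each inner list holds the connection strengths feeding one node.
network = [
    ([[0.5, -0.2], [0.1, 0.8], [-0.7, 0.3]], [0.0, 0.1, -0.1]),
    ([[1.0, -1.0, 0.5]], [0.0]),
]

acts = forward([1.0, 2.0], network)
for depth, values in enumerate(acts):
    print(f"layer {depth}: {values}")
```

Keeping the activity of every layer, rather than only the final output, is what makes it possible to inspect what the nodes are doing for a given input.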
Over the past decade, scientists have developed models that perform object recognition as well as the primate brain does. MIT researchers have also shown that the way visual object recognition models work parallels the organization of the primate visual cortex.
The researchers compared 43 different language models with the language processing centers in the human brain. One such next-word prediction model is the Generative Pre-trained Transformer 3 (GPT-3), which, given a prompt, generates text similar to what a human would produce.
The researchers presented a series of words to each model and measured the activity of the nodes that make up its deep neural network. They then compared these patterns with activity in the human brain, measured while people performed three language tasks:
- Listening to stories.
- Reading sentences one at a time.
- Reading sentences in which one word is revealed at a time.
The human data sets included functional magnetic resonance imaging (fMRI) data and intracranial electrocorticographic measurements taken from people undergoing brain surgery for epilepsy. Behavioral measures, such as how quickly people read a particular text, were also used in the comparison. The best-performing next-word prediction models showed activity patterns similar to those observed in the human brain.
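The article does not spell out how model activity was scored against brain data, so the following is only a generic illustration of the idea: compute a similarity score (here, Pearson correlation) between each model's activity and a recorded response across the same set of sentences, then rank the models. All numbers and model names below are made up for the example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Made-up values, one per sentence: a recorded brain response and the
# activity of one node in each of two hypothetical models.
brain_response = [0.2, 0.5, 0.1, 0.9, 0.4]
model_activity = {
    "model_a": [0.25, 0.45, 0.15, 0.8, 0.5],
    "model_b": [0.9, 0.1, 0.6, 0.2, 0.7],
}

scores = {name: pearson(act, brain_response) for name, act in model_activity.items()}
best = max(scores, key=scores.get)
print(scores, "best match:", best)
```

In this toy data, the hypothetical `model_a` tracks the response closely while `model_b` moves in the opposite direction, so `model_a` is ranked as the better match.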
A key computational feature of predictive models such as GPT-3 is an element called a forward one-way predictive transformer. This kind of transformer makes predictions about what comes next based on a very long preceding context, not just the previous word.
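One common way to make a transformer "one-way" is causal masking: when processing a given word, the model may weigh every earlier word but none of the later ones. The sketch below assumes that this is what the one-way description refers to, and it shows only the masking step, not a full transformer. The function name `causal_attention` and the raw scores are hypothetical.

```python
import math

def causal_attention(scores):
    """Softmax each row over positions at or before it; later positions get weight 0."""
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]              # only the current and earlier words
        top = max(visible)                  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in visible]
        total = sum(exps)
        weights.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return weights

# Hypothetical raw attention scores for a 4-word sequence
# (row = word being processed, column = word being attended to).
scores = [
    [1.0, 2.0, 0.5, 0.1],
    [0.3, 0.7, 1.5, 0.2],
    [0.9, 0.4, 0.6, 2.0],
    [0.2, 0.8, 0.1, 0.5],
]

weights = causal_attention(scores)
for row in weights:
    print([round(w, 3) for w in row])
```

The first word can attend only to itself, while the last word spreads its attention over the whole preceding sequence, which is how such a model draws on long context rather than just the previous word.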
An important finding from this study is that language processing is a highly constrained problem, and that one of its major difficulties is the real-time aspect. The AI networks were not designed to mimic how the human brain works, yet they ended up being brain-like. This suggests that a kind of convergent evolution has taken place between AI and nature.
The researchers plan to build variants of these models and assess how small changes in design affect their performance and their ability to fit human neural data. The idea is to use them to understand how the human brain works. The next step is to combine these powerful language models with previously developed computer models, enabling them to perform complex tasks such as constructing perceptual representations of the physical world.
The aim is to move toward better AI models that also explain how other parts of the brain work and how intelligence emerges, improving on the models available in the past.