
Using artificial intelligence, scientists translate brain signals into speech

With a “virtual vocal tract,” researchers might someday be able to help people who have lost the ability to speak.

By Katherine J. Wu, NOVA Next

By placing an array of electrodes on the brain, researchers can translate electrical activity into intelligible speech. Image Credit: University of California, San Francisco

With the development of a decoder that translates brain signals into synthetic speech, a team of neuroscientists has now taken crucial steps toward someday restoring the voices of people who have lost the ability to speak.

The new technology, described in a study published today in the journal Nature, has yet to leave the lab. If it proves successful, however, a similar brain-machine interface could be a major boon for those who have no means of producing language, such as patients who have suffered strokes, throat cancer, amyotrophic lateral sclerosis (ALS), or Parkinson’s disease.

In its current iteration, the decoder relies on monitoring brain regions that are active during conscious vocalizations—meaning it won’t pick up on unspoken thoughts. It also hasn’t yet been tested in patients who don’t have the ability to speak.

Regardless, because it seems to successfully eavesdrop on how the brain dictates speech, the research represents a “huge advance,” Kate Watkins, a cognitive neuroscientist at the University of Oxford who was not involved in the study, told Hannah Devlin at The Guardian.


This is far from the first computer-based effort to recapitulate speech. But previous efforts, some of which relied on reading facial movements or on words painstakingly typed out letter by letter, maxed out at a rate of about eight words per minute. Natural speech, on the other hand, averages 150 words per minute.

To bridge this gap, a team led by Edward Chang, a neurosurgeon at the University of California, San Francisco, decided to tackle the process further upstream—by listening in on the brain itself.

“The brain is the most efficient machine that has evolved over millennia, and speech is one of the hallmarks of behavior of humans that sets us apart,” study author Gopala Anumanchipalli, a neuroscientist at the University of California, San Francisco, told Michael Greshko and Maya Wei-Haas at National Geographic.

When a person speaks, the vocal tract acts as an ambassador between brain and speech, translating electrical activity from the motor cortex into the movements of the lips, tongue, and jaw. For many patients who have lost the ability to vocalize, the brain’s speech centers are still intact, but the machinery that makes sounds is no longer functional. So the researchers designed a computer program to mimic the decrypting abilities of the vocal tract.

To tune into the brain’s inner electrical monologue, the researchers hooked electrodes directly up to the brains of five people while they read sentences aloud. They then fed the data to a computer program that matched patterns in the brain’s signals with the vocal movements they would produce, like pressing the lips together. Finally, the algorithm turned the movements into fast-paced synthetic speech.
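For readers who think in code, the two-step idea (brain signals to vocal movements, then movements to sound) can be sketched as a toy program. This is only an illustration under loud assumptions: the weights, the four "electrode readings," the three movements, and the two sound features are all made up, and simple linear maps stand in for whatever models the researchers actually trained.

```python
# Toy sketch of a two-stage decoder:
#   brain signals -> vocal-tract movements -> sound features.
# Every number and name here is hypothetical and chosen for illustration;
# this is not the study's model.

from typing import List

Vector = List[float]
Matrix = List[List[float]]


def apply_linear(weights: Matrix, inputs: Vector) -> Vector:
    """Multiply a weight matrix by an input vector (one output per row)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


# Stage 1 (assumed): 4 electrode readings -> 3 vocal movements.
NEURAL_TO_MOVEMENT: Matrix = [
    [0.5, 0.0, 0.5, 0.0],  # lip closure
    [0.0, 1.0, 0.0, 0.0],  # tongue height
    [0.0, 0.0, 0.0, 2.0],  # jaw opening
]

# Stage 2 (assumed): 3 vocal movements -> 2 sound features.
MOVEMENT_TO_SOUND: Matrix = [
    [1.0, 1.0, 0.0],  # a pitch-like feature
    [0.0, 0.0, 0.5],  # a loudness-like feature
]


def decode(electrode_readings: Vector) -> Vector:
    """Run both stages in order and return the sound features."""
    movements = apply_linear(NEURAL_TO_MOVEMENT, electrode_readings)
    return apply_linear(MOVEMENT_TO_SOUND, movements)


if __name__ == "__main__":
    print(decode([2.0, 1.0, 0.0, 1.0]))
```

The point of the intermediate step is that the program never maps brain activity straight to sound; it first estimates what the lips, tongue, and jaw would be doing, mirroring the "virtual vocal tract" described above.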

The sounds produced by the program were passable enough to be roughly transcribed by hundreds of listeners, who could understand, on average, 70 percent of the words spoken. But the results varied widely: Depending on the complexity of the syllables and words in each sentence, some phrases came through perfectly each time, while others were almost universally unintelligible.
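One simple way to put a number on "words understood" is to line up a listener's transcription against the sentence that was actually spoken and count the words that match. The sketch below does that with Python's standard `difflib` module. It is an assumed scoring scheme for illustration, not necessarily how the study scored its listeners, and the example sentences are invented.

```python
# Minimal sketch of a word-level intelligibility score.
# Assumption: a word counts as "understood" if it survives an in-order
# alignment between the reference sentence and the listener's transcript.
# The function name and example sentences are hypothetical.

from difflib import SequenceMatcher


def word_accuracy(reference: str, transcript: str) -> float:
    """Return the fraction of reference words the listener got right."""
    ref_words = reference.lower().split()
    heard_words = transcript.lower().split()
    if not ref_words:
        return 0.0
    matcher = SequenceMatcher(None, ref_words, heard_words)
    # Each matching block is a run of identical words in the same order.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)


if __name__ == "__main__":
    spoken = "the quick brown fox jumps over the lazy dog today"
    heard = "the quick brown box jumps over the hazy dog"
    print(word_accuracy(spoken, heard))
```

In this made-up example the listener catches seven of the ten spoken words, so the score is 0.7. Averaging such scores over many sentences and listeners would also expose the spread the article describes, with some sentences near 1.0 and others near zero.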

Clearly, the synthetic speech is still a far cry from what humans can produce. The technique also works only when the brain’s speech centers are intact, and those centers can be damaged by traumatic brain injuries. Additionally, the method still relies on opening up the skull to implant electrodes in the brain, something that won’t be practical or safe on any large scale.

The study’s results are also based on a program that was trained only on people who retained the capability to speak—which means there’s no guarantee that it will work for patients who have lost their voice, or the ability to move their face and tongue.

But Chang and his team remain hopeful. When the researchers tested their program on a participant who silently mouthed sentences, it was still able to generate synthetic speech. However, intelligibility suffered.

All in all, there’s a lot to address, Marc Slutzky, a neurologist at Northwestern University, told Giorgia Guglielmi at Nature News. The study still represents “a really important step,” he said, “but there’s still a long way to go before synthesized speech is easily intelligible.”


Funding for NOVA Next is provided by the Eleanor and Howard Morgan Family Foundation.

National corporate funding for NOVA is provided by Draper. Major funding for NOVA is provided by the David H. Koch Fund for Science, the Corporation for Public Broadcasting, and PBS viewers. Additional funding is provided by the NOVA Science Trust.