There’s Hope for fMRI Despite Major Software Flaws

Over the last 25 years, fMRI has developed into a powerful technique to look at the brain. But a recent analysis of fMRI software found a bug that jeopardizes 40,000 studies which use the technique.

Functional magnetic resonance imaging, or fMRI, is used to highlight which portions of the brain are active during certain tasks. The brain scans it produces are so large and complex that they require statistical software to find patterns within the data. But the three most commonly used types of software frequently produce false positives, according to a recent paper in PNAS. Led by Dr. Anders Eklund, an associate professor of medical informatics at Linköping University in Sweden, the study showed that the software reported brain activity where there was none up to 70% of the time.

An fMRI scan showing different active areas of the brain.

Ivo Dinov, an associate professor at the University of Michigan and director of the Statistics Online Computational Resource, called the study a “Herculean task,” and he was quick to point out that “the report certainly does not invalidate the studies. I want to make it crystal-clear that it does not discredit, specifically, any of these studies.”

The study only addresses large-scale fMRI studies, those looking to find patterns within a particular group of people. For example, scientists might compare the brain scans of people diagnosed with schizophrenia with those of the general population to see which regions of the brain respond to visual stimuli. Individual brains have a lot of natural variation, but using fMRI images from many individuals helps scientists find patterns, which can indicate the neural causes of diseases like schizophrenia.

When an fMRI machine scans a person’s brain, it looks for the extra blood that flows to a firing cluster of neurons and marks it on a map of the brain. Researchers use statistical software to analyze the resulting map and determine which areas of the brain are active. It’s in this software, not the fMRI itself, that the problem lies.

Eklund and his team collected 499 fMRI scans of brains in a resting state, which happens when a person isn’t performing a particular task. Collectively, the scans should look random, with each person’s brain lighting up in a different location. But the three most common types of fMRI software, those used in up to 90% of fMRI studies, showed false patterns up to 70% of the time. Acceptable error is usually limited to 5%.

More specifically, the team of researchers randomly split the fMRI scans into small groups of 20. Using one group as a control and another as experimental data, the researchers compared the two groups. They repeated this process 3 million times. Because the scans were randomly drawn, differences between the groups should appear only rarely, far less often than Eklund and his team observed.
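The logic of that test can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not the researchers' actual pipeline: it replaces each brain scan with a single random number, uses an ordinary two-sample t-test instead of the cluster-based methods the paper examined, and runs 2,000 comparisons rather than 3 million. The function names and the simulated data are invented for this example. The point is that a well-calibrated test applied to random groups should flag a "difference" in roughly 5% of comparisons.

```python
import math
import random
import statistics

# Two-sided 5% critical value of Student's t with 38 degrees of freedom
# (two groups of 20 subjects each).
T_CRIT_DF38 = 2.024


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic for groups a and b."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb)
    )


def false_positive_rate(pool, group_size=20, n_comparisons=2000, seed=0):
    """Repeatedly draw two random groups from the pool and count how often
    the t-test declares a significant difference between them."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_comparisons):
        drawn = rng.sample(pool, 2 * group_size)
        control, experimental = drawn[:group_size], drawn[group_size:]
        if abs(t_statistic(control, experimental)) > T_CRIT_DF38:
            hits += 1
    return hits / n_comparisons


# Hypothetical stand-in for 499 resting-state scans: one random value each.
data_rng = random.Random(42)
pool = [data_rng.gauss(0.0, 1.0) for _ in range(499)]

rate = false_positive_rate(pool)
print(f"empirical false-positive rate: {rate:.3f}")
```

In this toy setup the printed rate should land near the 5% target, because the t-test's assumptions match the simulated data. The study's finding was that, on real resting-state scans, the software's assumptions did not hold and the rate climbed far higher.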

Roughly 40,000 fMRI studies have used these flawed statistical methods, according to the recent analysis. This does not indicate the findings of these studies are wrong. But “it suggests, potentially, that some of these might not be reproducible,” Dinov said. An fMRI study, repeated ten times, might only yield the same results three or four times. This, Dinov said, “suggests caution by investigators in interpreting their results.”

Researchers aren’t certain what causes the software to produce false positives, but it’s probably the result of incorrect assumptions the software is making. For example, the software makes educated guesses about the size and shape of neuron clusters that light up together. It might assume they form a perfect sphere when the actual shape is lumpy and irregular. It’s a balancing act: without any assumptions, the software wouldn’t be able to do its job. But with too many assumptions, it produces errors.

Despite these findings, fMRI remains an important imaging technique, and statistical software is needed to analyze it, Dinov said. “People should continue using it,” he said. But, he added, they should be aware that “if you use state-of-the-art techniques blindly, then you have a high rate of identifying effects when there aren’t any.”