
To predict the next infectious disease outbreak, ask a computer

Mathematical modeling and AI can pick out patterns preceding epidemics that human brains can’t readily discern.

By Katherine J. Wu, NOVA Next

Lyle's flying fox (Pteropus lylei), one of several bat species known to carry Nipah virus, which can cause serious illness in people. Animal health and human health go hand in hand, researchers say. Image Credit: asawinimages, iStock

On December 26, 2013, a two-year-old boy named Emile Ouamouno fell ill in the village of Meliandou in Guinea, West Africa. For two days, his tiny body was wracked with fever as he vomited and passed black stool. By December 28, he was dead.

Within weeks, his sister, mother, and grandmother were, too—the first casualties of what would eventually become thousands. The largest Ebola outbreak in history had begun.

It would be many months, however, before a team of researchers would pinpoint the probable source of the epidemic: a colony of Angolan free-tailed bats (Mops condylurus) that had roosted in a hollow cola tree less than 200 feet from Emile’s home. The locals called them lolibelo, or flying mice, for their distinctive smell and long tails. They were common targets for children at play, who would rouse them from sleep with sticks and roast them as snacks.

By the time ecologists and veterinarians arrived in Meliandou in April 2014, the tree in question had been burned and the bats were long gone. But the winged, mouse-sized mammals are still considered some of the likeliest candidates for Ebola’s so-called animal reservoirs, maintaining the virus in the wild before it makes each of its fateful hops into humans.

The disease’s devastating trajectory is a familiar one. Ebola, like SARS, Lyme disease, HIV, and most of the other infections known to plague people, got its start in another species. It’s in these creatures that pathogens can also hide between epidemics, biding time before they re-emerge.

By the time most reservoir species are identified, the pathogens in question have already spilled over into people. In the wake of an outbreak, that leaves just one course of action: mitigation—the frantic attempt to halt the collapse of a line of dominos after they’ve already begun to fall.

For Barbara Han, a disease ecologist at the Cary Institute, this reactive approach isn’t enough. “The fact is, you’ve already waited until people got sick,” she says. “You don’t want to hand out umbrellas after the rain falls. You want to forecast the rain before it starts.”

To get ahead of the curve, researchers need better tools that can predict these outbreaks before they happen, says David Redding, a disease ecologist at University College London. That means searching for the ecological and epidemiological patterns that precede spillovers—the harbingers of outbreaks that flicker to life in the wild, then trickle into humans.

Most of these warning signs aren’t readily discernible by human brains alone. So scientists like Han and Redding have turned to computational models that can scour gobs of ecological and demographic data in their stead, hunting for clues to where the next infectious leak might spring.


The border between Guinea and Liberia, during the 2014 Ebola outbreak. Frequent boat travel, as well as other modes of transport, can help Ebola virus hop from one country to another. Image Credit: CDC Global, flickr

Into the wild, and back out again

Out in the wild, infection is a given—a reality Barbara Han, who got her scientific start in animal ecology, became intimately familiar with while tracking fungal pathogens in amphibians.

But the bugs that lurk in wildlife don’t always stay there.

Of the new and emerging infectious diseases documented by the Centers for Disease Control and Prevention (CDC) over the past couple of decades, 75 percent are zoonotic, or capable of spreading from animals to humans. And while scientists have amassed a good deal of data on animal reservoirs over the years, they’ve long struggled to uncover the crucial commonalities among them: the traits that make a species ideally suited to pass a pathogen to people.

That’s why Han has turned to a tool that could accomplish what human researchers can’t on their own. Several years ago, she and her team trained a computer model to pick out new rodent species with high disease-carrying potential, based on the traits they shared with 217 previously identified carriers of disease. Han compares the approach to Pandora’s strategy for recommending songs: An algorithm learns the trends that dictate musical taste or vulnerability to infection, then offers up a comparable band or animal that hasn’t been considered before. Using this tactic, Han’s model scanned through the 2,277 rodent species that exist worldwide and homed in on 58 not previously designated as reservoirs.
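The recommendation analogy can be made concrete with a toy example. The sketch below is not the model Han's team built, and every species name and trait value in it is invented for illustration. It assumes just three traits (lifespan, age at maturity, litter size), averages the traits of known carriers into a single profile, and ranks unlabeled species by how closely they resemble that profile.

```python
from math import sqrt

# Hypothetical toy data, invented for illustration (not from the study).
# Traits: (max lifespan in years, age at sexual maturity in months, litter size)
KNOWN_RESERVOIRS = {
    "species_a": (1.5, 2.0, 6.0),
    "species_b": (2.0, 1.5, 7.0),
    "species_c": (1.0, 2.5, 5.0),
}
CANDIDATES = {
    "species_x": (1.8, 2.0, 6.5),    # short-lived, fast-breeding
    "species_y": (12.0, 24.0, 1.0),  # long-lived, slow-breeding
    "species_z": (3.0, 4.0, 4.0),    # somewhere in between
}


def column_stats(rows):
    """Return per-trait means and standard deviations across all rows."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 if a trait never varies, to avoid dividing by zero.
    stds = [
        sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return means, stds


def standardize(row, means, stds):
    """Put traits on a common scale so no single unit dominates the distance."""
    return tuple((v - m) / s for v, m, s in zip(row, means, stds))


def rank_candidates(known, candidates):
    """Order candidates from most to least similar to the known-carrier profile."""
    means, stds = column_stats(list(known.values()) + list(candidates.values()))
    known_z = [standardize(r, means, stds) for r in known.values()]
    profile = tuple(sum(c) / len(c) for c in zip(*known_z))
    distance = {}
    for name, row in candidates.items():
        z = standardize(row, means, stds)
        distance[name] = sqrt(sum((a - b) ** 2 for a, b in zip(z, profile)))
    return sorted(distance, key=distance.get)


ranking = rank_candidates(KNOWN_RESERVOIRS, CANDIDATES)
print(ranking)
```

With these made-up numbers, the short-lived, fast-breeding candidate lands at the top of the list and the long-lived one at the bottom. A real analysis would draw on many more traits and a far more flexible learning algorithm than a simple distance to an average.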

The list was diverse, spanning much of the rodent family tree. But its members did seem to have a couple things in common, like brief lifespans, early sexual maturity, and large numbers of offspring. “These rodents basically have a ‘live fast, die young’ approach to life,” Han says. It’s possible they prioritize reproduction over other resource-heavy pursuits like, say, an ironclad immune system, she says. But unlike rats and mice, long-lived, slow-maturing humans have more to lose by ignoring infections.


A northern grasshopper mouse (Onychomys leucogaster), one of several species pinpointed by a machine learning model trained to detect rodents at risk of carrying infectious diseases. Image Credit: Weber, iStock

Predictions aren’t guarantees. And it’s likely that many of these disease-carrying candidates will never harbor a problematic pathogen at all. But when done well, “modeling studies are great for hypothesis generation—they demonstrate what could happen,” says Inger Damon, Director of the CDC’s Division of High-Consequence Pathogens and Pathology.

And in certain instances, they seem spot on, Han says. While her team’s paper was being prepped for publication, two of the voles on their computer-generated list were confirmed to harbor parasites.

A similar story has played out in other animal groups, too. In 2016, Han and her colleagues published a list of bat species that could play host to filoviruses, the group that includes Ebola. Less than a year later, a team of virus hunters uncovered filoviruses lurking in China’s fruit bats, including a couple species from Han’s paper.

Around the same time, a colleague at Columbia University phoned Han, bursting with excitement: He’d discovered a new ebolavirus in Sierra Leone. It wasn’t yet clear if the virus could cause disease in humans, but it had been detected in two types of bats. One of them was the Angolan free-tailed bat—another species high on Han’s list of potential reservoirs. The same species suspected of infecting Emile Ouamouno years before.

Bridging the divide

Reservoirs aren’t reservoirs until they’re tapped. For a disease to jump into a human population, it needs access—a region where infected animals and people overlap.

At University College London, David Redding and ecologist Kate Jones have taken their own computational approach to uncover the dynamics of infection at these ports of entry. Their newest model, described in a paper published today in the journal Nature Communications, is what Redding calls a hybrid approach, borrowing from both ecology and epidemiology to predict areas at high risk of Ebola spillover and subsequent outbreak in Africa.

“We know where animal hosts are,” he says. (In Ebola’s case, that almost certainly means bats, and possibly great apes and duikers—a type of antelope—as well.) “And we also know where people are. Where you have both, you have likely contact, and risk of disease.”


Workers in Guinea, West Africa, shortly after Ebola was confirmed to have hit the region. Image Credit: EU Civil Protection and Humanitarian Aid, flickr

That might sound simple enough. But a bevy of other variables make the dynamics of a spillover far more complex, Redding says. Land use, for instance, can have a big impact on a reservoir’s range and on how much the members of different populations mix. On the human side, the size of an ensuing outbreak depends on connectivity (how easy it is for people to physically get around and mingle with others) and regional wealth, which often dictates the amount of money allocated to health care.

From the virus’ perspective, “the ideal situation would probably include a human population situated in a forested area with animal hosts, near a big transportation hub, near a big city,” Redding says. “That’s where you would expect large outbreaks to occur.”
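The logic Redding describes, where hosts and people must overlap before a spillover can happen and travel links then help an outbreak grow, can be sketched in a few lines. This is a deliberately crude illustration, not the published model: the place names, the 0-to-1 scores, and the choice to simply multiply the factors together are all assumptions, and regional wealth is left out entirely.

```python
# Hypothetical map cells; every score is invented for illustration.
# host_suitability: 0..1, how likely reservoir animals are to be present
# human_presence:   0..1, how densely people live there
# connectivity:     0..1, how easily people travel to other populated places
CELLS = {
    "remote_forest": {
        "host_suitability": 0.9, "human_presence": 0.05, "connectivity": 0.1,
    },
    "forest_edge_town": {
        "host_suitability": 0.7, "human_presence": 0.6, "connectivity": 0.8,
    },
    "dry_city": {
        "host_suitability": 0.05, "human_presence": 1.0, "connectivity": 0.9,
    },
}


def spillover_risk(cell):
    """Assumed rule: risk needs both hosts and people, so multiply the two."""
    return cell["host_suitability"] * cell["human_presence"]


def outbreak_potential(cell):
    """Assumed rule: a spillover grows more easily where travel links are good."""
    return spillover_risk(cell) * cell["connectivity"]


# Rank cells from highest to lowest outbreak potential.
ranked = sorted(CELLS, key=lambda name: outbreak_potential(CELLS[name]), reverse=True)
for name in ranked:
    print(name, round(outbreak_potential(CELLS[name]), 4))
```

In this toy map, the well-connected town at the forest's edge outranks both the bat-rich but nearly empty forest and the crowded city with few hosts, which mirrors the "ideal situation" Redding lays out.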


While these are many of the critical variables that affect zoonotic disease, Damon says, there are always more variables to consider. Only some spillovers turn into outbreaks. And the likelihood of that transition can hinge on aspects of human behavior that Redding’s model didn’t capture, like the prevalence of funerary practices that may increase contact with infected bodies, she says.

By definition, computational modeling will always be a bit reductionist, says Sadie Ryan, a disease ecologist at the University of Florida. Programs have to accurately and efficiently capture the complexities of the real world with a limited set of data. That’s a huge challenge—and a high stakes one, she says. “If you’re doing massive spatial computational simulations without real information, you’re just making video games.”

But models like these, which take animals, humans, and their environments into account, effectively capture the “biological realism of these spillover events,” Ryan says.


The largest Ebola outbreak in history may have begun when an Angolan free-tailed bat (Mops condylurus) passed the virus to a toddler in Meliandou, Guinea, West Africa in 2013. Children in the village often roused bats from tree hollows to play with, cook, and eat them. Image Credit: Jakob Fahr, iNaturalist

In its current iteration, Redding’s model has proven powerful. With the data it was fed, it correctly identified several areas that had already experienced Ebola outbreaks, such as the Democratic Republic of Congo (DRC), Gabon, and regions in West Africa hit by the epidemic that began in Meliandou.

When the simulation originally ran in 2018, it also flagged several other regions—including Nigeria, Ghana, Rwanda, and Kenya—that, at the time, had been mostly untouched by the virus. In the months since, two of its outbreak predictions in the DRC have come true.

Cloudy with a chance of infection

West Africa’s Ebola epidemic ended in June of 2016. In the two and a half years after Emile Ouamouno fell ill in Meliandou, at least 28,646 people had been infected and at least 11,323 had died—more than all previous Ebola outbreaks combined.

The virus has since re-emerged. And with so many available hiding places in the wild, it’s likely to do so again, Redding says.

This is where outbreak predictions can be powerful, Han says. They can inform where health care resources are diverted next, or how ecologists and conservationists can protect and monitor (rather than villainize) reservoir species in their natural habitats, she says.

Acting on the numbers churned out by these models, however, is another issue entirely, Damon says. Predicting spillover isn’t the same as preventing it—a process that requires increased surveillance, or an infusion of resources that can quash outbreaks before they have a chance to grow.


An infection control supervisor (left) demonstrates proper hand washing techniques during a field supervision visit to a small clinic in N'Zérékoré, Guinea. Image Credit: Lindsey Horton, CDC Global, flickr

These interventions will become increasingly complicated to execute in a rapidly changing world, Ryan says. As temperatures rise and habitats disappear, reservoir species—among many others—will be forced to uproot and adopt new behaviors, rejiggering their potential to transmit disease. “Climate change impacts literally everything,” says Han, who’s now collaborating with researchers at NASA to incorporate climate data into her team’s predictions.

In the case of Ebola, one trend may already be clear: The worse climate change gets, the more outbreaks we’ll have, Redding says. Projecting into the year 2070, his team’s simulations show that warmer, wetter conditions will raise the risk of spillovers across the African continent. Some of these effects can be mitigated by reducing carbon emissions and increasing sustainable development, Redding says—but only if the world takes action soon.

“This is about getting the concept of intervention out there ahead of time,” Ryan says. “If this is how the future is unfolding, let’s be there before it happens. And let’s be ready.”


Funding for NOVA Next is provided by the Eleanor and Howard Morgan Family Foundation.

National corporate funding for NOVA is provided by Draper. Major funding for NOVA is provided by the David H. Koch Fund for Science, the Corporation for Public Broadcasting, and PBS viewers. Additional funding is provided by the NOVA Science Trust.