
Data is key to fighting the coronavirus. Here’s why it’s so hard to find

If you wanted to know how many tests for novel coronavirus were administered across the United States yesterday, there would be no single public government source you could check for that information.

In places like South Korea and Taiwan, the government compiles and updates a daily data dashboard that the general population can easily access. But in the U.S., the federal government has increasingly placed the burden on states to test possible cases and keep tabs on potential outbreaks, with little overarching guidance. Without that, public health officials throughout the country have engineered dozens of different ways to count cases, making it harder to know when a community is at risk.

In the aggregate, these differences in data can amount to apples-and-oranges comparisons. They can misrepresent our overall picture of the virus, and interfere with reopening plans and appropriate health interventions, if cases are missed or misrepresented.

Chart by Megan McGrew/PBS NewsHour

And when states take on this data collection role, they aren’t always using best practices. In Florida, Gov. Ron DeSantis wanted to alter the way hospitals tallied intensive care unit beds used to treat COVID-19 patients, according to a June 22 report from the News Service of Florida, a change that other news outlets have noted could give the impression that fewer beds were needed. In May, after Georgia became one of the first states to reopen, officials attracted criticism when they jumbled the dates for cases so that it appeared as if infection rates were improving. In Mississippi, state officials simply did not release any new case reports for five days. And several states are failing to report much of the demographic data, notably race and ethnicity, for people who have confirmed cases.

Battling a pandemic with data gaps is like trying to put together a puzzle when the pieces don’t quite fit or are missing altogether. As states reopen and Americans are still struggling to figure out how they can safely return to work, school and other parts of their lives, being able to interpret and understand the numbers that do exist has become even more important — and even more complicated.

Here’s what you need to know about finding that data, and what it means.

Where is the federal data?

The Centers for Disease Control and Prevention has largely relinquished its role as a central clearinghouse for virus-tracking data. When cases were first reported in the U.S., it held regular press briefings to update the public every time a new case of COVID-19 was diagnosed. The CDC does release case counts that compile state and local public health department data, but those numbers are widely criticized as an uneven and lagging indicator. The agency has acknowledged gaps in racial and other demographic data, and had race and ethnicity data for fewer than half of confirmed COVID-19 cases in its first congressionally mandated report about the nation’s efforts to mitigate the virus, The Hill reported. And while it publishes a weekly surveillance summary called COVIDView, case and test data change daily, if not hourly.

When it publishes county-level coronavirus cases and deaths, the CDC relies on outside help from states, as well as USAFacts.org, a nonpartisan organization founded by former Microsoft CEO Steve Ballmer. (The PBS NewsHour collaborates with USAFacts.org to produce some graphics for livestream interviews.)

Several groups have stepped in to fill the void left behind by the federal government. The same week the virus was confirmed to have arrived in the United States, Johns Hopkins University’s Center for Systems Science and Engineering launched a global map to track the spread of novel coronavirus. Two staff writers for The Atlantic, Alexis Madrigal and Robinson Meyer, created The COVID Tracking Project, which collects, cross-checks and publishes data from 56 states and territories about testing, patient outcomes and available race and ethnicity information. And STAT News partnered with developers to produce the COVID-19 Tracker, which mines public and private datasets, including those at Johns Hopkins, the COVID Tracking Project and USAFacts.

While these efforts to pull disparate data into easy-to-read dashboards provide a public service, Dr. Ashish Jha, who directs the Harvard Global Health Initiative, said that job belongs to public health officials, not journalists.

“We used to have a really effective public health agency,” Jha said. “It was called the CDC.”

Fundamentally, the biggest U.S. failure in mitigating the virus has been at the federal level where officials have “ultimately thrown in the towel,” Jha said. The nation has muddled its response and wasted months of preparation, only taking half-hearted steps to regain control, he said.

Years of policy choices have depleted public health officials’ ability to do their jobs, CDC Director Robert Redfield said Thursday when asked about the country’s plan to deploy more contact tracers or other resources to chase COVID-19. Those choices have had a number of consequences, from limits on the data, data analytics and predictive analysis needed to plot where the disease has spread or where it might go, to a lack of funding for public health laboratories to process tests and for the workforce needed to operate them.

“For decades, this nation has underinvested in the core capabilities of public health,” he said.

Which data point is most important right now?

Since the pandemic began, 29 million Americans have been tested for COVID-19 out of a total population of more than 328 million people, according to The COVID Tracking Project. Of those who have been tested, 2.4 million have been diagnosed with the virus. More than 118,000 people have died.

Chart by Megan McGrew/PBS NewsHour

However, CDC officials said on Thursday that they believe as many as 20 million Americans have actually contracted the coronavirus, about 10 times more than the number of documented cases. During a telebriefing with reporters, CDC Director Robert Redfield said millions of those people never realized they had it at the time.

Those numbers suggest the virus moved far faster than did testing efforts, which public health experts have long speculated because the United States has taken so long to ramp up testing. For months, people who thought they may have become infected with the virus could not get tested because their travel history or risk based on known exposure did not qualify them to receive tests that were essentially rationed. Those delays rendered basic test and case counts inadequate, since there are likely so many more people who are or have been sick than have been tested.
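To make the scale of that gap concrete, here is a minimal sketch that works through the rounded figures quoted above. The variable and function names are illustrative only, and the 20 million figure is the CDC's upper estimate ("as many as"), so the results are rough.

```python
# Rounded figures quoted in this article (COVID Tracking Project and CDC).
POPULATION = 328_000_000
PEOPLE_TESTED = 29_000_000
DOCUMENTED_CASES = 2_400_000
ESTIMATED_INFECTIONS = 20_000_000  # CDC upper estimate ("as many as")


def share(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# What share of the population has been tested at all?
tested_pct = share(PEOPLE_TESTED, POPULATION)

# How many estimated infections are there per documented case?
undercount_ratio = ESTIMATED_INFECTIONS / DOCUMENTED_CASES

# What share of estimated infections were ever documented?
documented_pct = share(DOCUMENTED_CASES, ESTIMATED_INFECTIONS)

print(f"Share of population tested: {tested_pct:.1f}%")
print(f"Estimated infections per documented case: {undercount_ratio:.1f}")
print(f"Share of estimated infections documented: {documented_pct:.1f}%")
```

With these rounded inputs, fewer than one in 10 Americans had been tested, and the ratio of estimated infections to documented cases comes out a little above eight, in the same range as the "about 10 times" cited by the CDC.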

But the most important number that officials need to chart COVID-19’s movement is the percentage of tests that are positive, according to Dr. Amesh Adalja, an emergency physician and senior scholar at the Johns Hopkins University Center for Health Security.

During the early days of the pandemic, 30 percent of those who were tested in the U.S. were diagnosed with the virus, Adalja said. That’s because tests were restricted unless you had traveled to China or knew you had been exposed to someone with a diagnosed infection. Now, he said, you have to look harder to find a positive case, in part because more people are getting tested.

Today, 10 percent of more than 27 million Americans tested for COVID-19 have been positive since the virus reached the U.S., according to the latest national figures from the CDC. On May 12, the World Health Organization said it is safe for communities to reopen if the share of positive tests has stayed below 5 percent for a 14-day period.
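The positivity measure is simple arithmetic, and a short sketch may help show how it could be checked against the 5 percent, 14-day benchmark. This assumes one particular reading of that benchmark (daily positivity below 5 percent on each of the last 14 days); the function names are hypothetical and the daily counts in the usage example are invented for illustration.

```python
from typing import Sequence

THRESHOLD_PCT = 5.0  # benchmark cited in the article
WINDOW_DAYS = 14     # benchmark period cited in the article


def positivity_pct(positives: int, tests: int) -> float:
    """Percentage of tests that came back positive."""
    if tests <= 0:
        raise ValueError("tests must be positive")
    return 100.0 * positives / tests


def meets_reopening_measure(daily_positives: Sequence[int],
                            daily_tests: Sequence[int]) -> bool:
    """True if positivity was below the threshold on each of the last 14 days.

    Assumption: the benchmark is applied day by day; other readings
    (for example, a 14-day average) are possible.
    """
    if len(daily_positives) != len(daily_tests):
        raise ValueError("series must be the same length")
    if len(daily_tests) < WINDOW_DAYS:
        return False  # not enough history to judge
    recent = zip(daily_positives[-WINDOW_DAYS:], daily_tests[-WINDOW_DAYS:])
    return all(positivity_pct(p, t) < THRESHOLD_PCT for p, t in recent)


# National figure from the article: 10 percent of roughly 27 million tested.
national = positivity_pct(2_700_000, 27_000_000)

# Invented example: 1,000 tests a day, 40 positives a day (4 percent).
steady = meets_reopening_measure([40] * 14, [1000] * 14)

# Same community, but one day in the window reaches 6 percent.
one_bad_day = meets_reopening_measure([40] * 13 + [60], [1000] * 14)
```

Under this reading, the invented community at a steady 4 percent would meet the measure, while a single day at 6 percent within the window would not. The national figure of 10 percent is double the benchmark.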

According to analysis from Johns Hopkins University, 22 U.S. states failed to meet that measure, including Texas, which was one of the first states to reopen. Facing an all-time single-day record for new coronavirus cases this week, Gov. Greg Abbott announced on June 24 that Texans should stay home unless they had to go out, though he said scaling back reopening plans to shut down the state again would be a last resort.

In addition to what percentage of tests are positive, Jha said it is important to keep in mind:

  • the number of cases and how they’re changing (Are they going up or tapering off?)
  • the number of tests and how they’re changing (Are more tests available?)
  • the numbers of hospitalizations and deaths

Put together, these data points offer a composite glimpse of how well the virus is being contained and how vulnerable the country’s health care infrastructure remains to the virus.

The national datascape can present lots of noise, said Dr. Carl Bergstrom, a biologist at the University of Washington who has studied data collection and the ways best practices have (and haven’t) been applied during the COVID-19 pandemic.

If you’re trying to interpret case counts nationwide this week, you’re looking at big decreases in formerly hard-hit New York and increases in more rural states: apples and oranges. That’s why it is important to pay attention to whether cases are rising or falling based on state and local data, which can shed a lot of confounding variables.

If you look at trends in Washington state versus Florida, Bergstrom said, you may immediately notice big differences between the two states. Washington was one of the first states with a confirmed case, saw the virus spread quickly before quarantine measures were put into effect and made big strides in expanding testing access to monitor additional outbreaks, including offering free tests in Seattle. Florida, on the other hand, was among the first states to reopen and allowed people back onto beaches while many states remained in lockdown.

At this point, you don’t expect to see sudden leaps in Washington’s number of cases, especially if you know they haven’t made radical changes to ramp up testing recently, so if leaps exist in those cases, “upward trajectory is telling you something important,” Bergstrom said. In other words, if testing hasn’t wildly increased but cases are rising, an outbreak may be growing. Right now, he said, he is concerned about a pattern of increases in the number of new cases he has seen pop up across the Southeastern U.S. “It’s unlikely that they’ve all simultaneously ramped up testing,” he said. A ramp-up in testing is one cause of a sudden spike, so the increase is more likely an indication that the virus is spreading.

So those increases, along with reported spikes in new cases in Texas and Arizona, are likely “the entirely predictable consequence of reopening too quickly with insufficient testing and insufficient distancing measures in place,” Bergstrom said.
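Bergstrom’s reasoning, that rising cases without a matching rise in testing point to real spread, can be sketched as a simple comparison of growth rates. This is only an illustration: the function names, the 10-point tolerance and the weekly counts below are assumptions, not a method used by Bergstrom or any health agency.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    if old <= 0:
        raise ValueError("old must be positive")
    return 100.0 * (new - old) / old


def likely_growing_outbreak(cases_prev: int, cases_now: int,
                            tests_prev: int, tests_now: int,
                            tolerance_pct: float = 10.0) -> bool:
    """Flag periods where cases grow noticeably faster than testing.

    tolerance_pct is a hypothetical cushion (in percentage points) so that
    small differences between the two growth rates are not flagged.
    """
    case_growth = pct_change(cases_prev, cases_now)
    test_growth = pct_change(tests_prev, tests_now)
    return case_growth > 0 and (case_growth - test_growth) > tolerance_pct


# Invented weekly counts: cases up 50 percent, testing up only 5 percent.
flat_testing = likely_growing_outbreak(1000, 1500, 20_000, 21_000)

# Cases up 50 percent, but testing also up 50 percent.
more_testing = likely_growing_outbreak(1000, 1500, 20_000, 30_000)

# Cases falling while testing holds steady.
falling_cases = likely_growing_outbreak(1000, 800, 20_000, 20_000)
```

In the first invented scenario the jump in cases cannot be explained by more testing, so it is flagged. In the second, cases and tests rose together, which is the pattern one would expect if the increase mostly reflected looking harder.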

Why lags exist in data collection for COVID-19

Patients may learn their COVID-19 test results in a matter of hours or days, but public health departments typically wait days or weeks before receiving those same results. When results finally arrive, public health departments must sift through and make sense of (often incomplete) test result data recorded on paper, rather than electronically delivered, according to Dr. Janet Hamilton, who directs the Council of State and Territorial Epidemiologists. Some departments spend critical time digitizing paper results when they should be recorded and delivered that way in the first place, she added.

A person’s demographic data, phone number or home address may be left blank, and when hundreds or thousands of test logs are missing data, the work of tracking down those blanks adds up. The absence of these data points can slow down contact tracing and determine how well epidemiologists map out cases and detect potential outbreaks in their community. While public health departments are skilled detectives, Hamilton said the U.S. has “allowed the public health process to become a piecemeal process.”

“This is an illness that moves with speed and intensity,” Hamilton said. “It’s how we will be able to make progress. Using data and information that’s a week and two-weeks-old to make decisions makes a difference. It means we won’t make the best decisions.” During a pandemic, these kinds of data points should instantly arrive for public health departments to analyze, she said.

As Baltimore’s former public health commissioner, Dr. Leana Wen said she believes local health officials know their communities best. But Wen, an emergency physician at George Washington University Hospital, said local officials need the federal government to provide guidance on best practices so local officials aren’t forced to wade through hundreds of studies to develop strategies for how to protect their communities.

“We need to empower locals to do the job only they can do,” she said.

Does more testing make more cases?

During a June 20 rally in Tulsa, Oklahoma, President Donald Trump suggested that testing efforts amid the COVID-19 pandemic should actually be stifled. “When you do testing to that extent, you’re going to find more people, you’re going to find more cases. So, I said to my people, ‘Slow the testing down, please.’” It wasn’t the first time he had suggested that if the U.S. scaled back testing, then the country’s caseloads would drop. On June 15, Trump told reporters that “if we stop testing right now, we’d have very few cases, if any.”

That’s not how it works. In a House hearing Tuesday, Dr. Anthony Fauci, one of the nation’s top infectious disease experts, told lawmakers, “It’s the opposite. We’re going to be doing more testing, not less.”

More tests will alert you to the presence of more COVID-19 cases, but more testing doesn’t correlate with the virus’ spread, said Bergstrom, co-author of a forthcoming book, “Calling Bullshit,” that explores how data can be spun to produce misinformation or even outright lies.

“The monster is there even before you turn on the light,” he said.

In the book, Bergstrom and his co-author, Dr. Jevin West, who directs the Center for an Informed Public, explain how data can be corrupted and numbers manipulated to serve an agenda.

“To tell an honest story, it is not enough for numbers to be correct,” the authors write. “They need to be placed in an appropriate context so that a reader or listener can properly interpret them.”

Chart by Megan McGrew/PBS NewsHour

That is not happening as easily during a pandemic when the public and government need these data most and the virus has already overstretched state resources. With so many different ways to count cases, and pressure to reopen state economies, it can be hard for states and the public to know how to move forward.

Since the way data is collected varies greatly across the country, all of the variables — how much testing is taking place, who is getting tested, how those testing results are reported and what lags emerge — “make it very, very hard to be able to draw reasonable inferences from the data we have about what’s happening,” Bergstrom said.

“We know we’re not catching all the cases,” he said. “We know there are a lot more cases than are being tested positive, so what’s more useful is looking at how those numbers are changing.”