Dealing with the Health Data Deluge

There’s a revolution afoot in medicine. It has been simmering below the surface for the last decade or so, but like many revolutions, you won’t really take notice until it’s all but over. And it’s all thanks to your phone.

These tiny computers are providing patients, doctors, and medical researchers with unprecedented amounts of data. Equipped with a bevy of sensors, a person’s phone can record data even while it sits unused in a pocket, then automatically beam that data back to secure servers for later analysis. By unobtrusively collecting copious amounts of information, our smartphones are enabling researchers to study disease in ways never before possible. For scientists and clinicians, it’s not quite a free lunch, but it may be as close as they’ll ever get.

Merging traditional health records like X-rays with new sources is a significant challenge for the healthcare industry.

Take the mPower app. Unlike typical medical studies, which struggle to recruit participants and frequently require testing or observation in a laboratory, the app, which is hosted on Apple’s newly launched ResearchKit platform, uses the iPhone’s sensors to monitor the symptoms of Parkinson’s disease during a person’s waking moments. All participants have to do is say “ah” into the phone for ten seconds, tap their fingers on the screen, and walk with the phone in their pockets. Researchers at the other end of the app can use this information to assess symptoms like voice changes, hand tremors, slowed movements, and difficulties with balance.

Another ResearchKit app, Breast Cancer: Share the Journey, tracks patient recovery after breast cancer treatment using the iPhone’s motion sensors to monitor movement, and it uses surveys and journal entries to collect patient responses about symptoms and outlook. ResearchKit—Apple’s software framework that allows scientists to build these apps—also hosts apps that collect data on diabetes, cardiovascular health, and asthma. The company recently opened up its code to other developers.

While the profusion of data promises to be useful to medical research, it won’t mean much for the rest of us if it can’t be linked to our other health information. Today, for example, if you suffer from heart palpitations, the heart rate data recorded by your Apple Watch can’t be easily linked to the electronic health records (EHRs) kept by your doctor’s office. Your Fitbit may be able to sync with your calorie counter, but if you want to know how your activity tracks with the lab tests your doctor ordered, you’re probably out of luck. The real challenge is getting all of that health information to communicate.
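
To make the linkage problem concrete, here is a minimal sketch in Python of what joining wearable data to lab results might involve. Everything in it is an assumption for illustration: the record shapes, the ID formats, the sample values, and above all the `id_map` that says which wearable account belongs to which clinic patient. Real devices and EHR systems use their own formats, and that mapping is exactly the piece that usually does not exist.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


# Hypothetical record shapes; real wearable exports and EHR systems
# use their own formats and identifiers.
@dataclass(frozen=True)
class HeartRateSample:
    device_user: str  # ID assigned by the wearable vendor
    day: date
    bpm: int


@dataclass(frozen=True)
class LabResult:
    patient_id: str  # ID assigned by the doctor's office
    day: date
    test: str
    value: float


def link_records(samples, labs, id_map):
    """Pair each lab result with the mean heart rate recorded the same day.

    id_map translates wearable user IDs into clinic patient IDs. Without
    such a mapping, the two data sets cannot be joined at all.
    """
    by_patient_day = {}
    for s in samples:
        patient = id_map.get(s.device_user)
        if patient is None:
            continue  # no known link between the two silos
        by_patient_day.setdefault((patient, s.day), []).append(s.bpm)

    linked = []
    for lab in labs:
        bpms = by_patient_day.get((lab.patient_id, lab.day))
        linked.append((lab, mean(bpms) if bpms else None))
    return linked


# Invented sample data for illustration only.
samples = [
    HeartRateSample("watch-123", date(2015, 5, 1), 72),
    HeartRateSample("watch-123", date(2015, 5, 1), 88),
    HeartRateSample("watch-999", date(2015, 5, 1), 60),
]
labs = [
    LabResult("clinic-A7", date(2015, 5, 1), "potassium", 4.1),
    LabResult("clinic-A7", date(2015, 5, 2), "potassium", 4.3),
]
id_map = {"watch-123": "clinic-A7"}

for lab, avg_bpm in link_records(samples, labs, id_map):
    print(lab.day, lab.test, lab.value, avg_bpm)
```

In this toy case the first lab result picks up a same-day heart rate average, while the second comes back with nothing attached because the watch recorded no data that day. The join itself is a few lines; the hard part in practice is agreeing on shared identifiers and formats across vendors and clinics.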

“The data is trapped in silos, and even if you open the silos, it’s kind of a mess,” says Ida Sim, co-founder of a mobile health non-profit and co-director of biomedical informatics at the University of California, San Francisco. “It’s really, really hard to put the data together.”

The 1,000-Pound Gorilla

Unfortunately, this is nothing new. In 2004, President George W. Bush appointed David Brailer as the country’s first health information czar to lead the charge from paper to electronic health records. Initially, the program had very little money, but after Brailer’s tenure, this changed with the signing of the Health Information Technology for Economic and Clinical Health Act in 2009. Since then, the federal government has spent $28 billion trying to coax health care organizations into adopting EHRs.

It has been a partial success: as of 2013, 59% of hospitals and 48% of physicians were using “basic EHRs.” But even these modest numbers should be interpreted with caution: they come from a report delivered to Congress in October 2014 by the Office of the National Coordinator for Health Information Technology, the same agency that oversees the government’s EHR efforts.

President Barack Obama wants to push EHRs even further. He announced the launch of the Precision Medicine Initiative in this year’s State of the Union address and requested $215 million from Congress to fund (among other things) the creation of a database containing the genetic, health, and lifestyle data of one million people. If successful, it would help integrate EHRs with today’s abundance of sensor and genetic data.

But former health information czar David Brailer, now the CEO of the investment firm Health Evolution Partners, says, “We have some things to undo if we chose to move in this direction.”

EHRs, even in their most basic form, remain flawed. Different EHR software platforms at different hospitals still don’t play well with one another. The dysfunction may be built into the system: “Allegations continue to surface that some health care providers and health IT developers are interfering with the exchange or use of electronic health information,” says another report from the Office of the National Coordinator for Health IT, which noted 60 such complaints in 2014.

There are exceptions to the rule, and the Million Veteran Program (MVP) is one of them. It launched in 2011 to mine medical information collected from veterans to improve their healthcare. It has already enrolled 345,000 veterans, who have agreed to provide blood, fill out surveys, and allow researchers access to their health records. Some 200,000 veterans have already been genotyped. Andrew Trister, senior physician at Sage Bionetworks, called MVP a “tour de force in big data.”

For the rest of the country to follow MVP’s example, Brailer says, “The major priority that the U.S. is returning to now is making that data all connected.” Converting paper records to digital ones is a valiant first step, but when those digital records are trapped wherever they were generated (hospital, pharmacy, or lab), “it simply is not very useful in terms of us understanding the whole picture of care for a person or for a population,” Brailer says. The Precision Medicine Initiative may not be enough to fix issues that span vast swaths of the health information technology system: the proposal allocates $5 million toward revamping health IT infrastructure and making sure all EHR systems work together. Brailer estimates that that’s “about $10 billion short.”

“The electronic health record is the 1,000 pound gorilla,” Sim says. “I think we’re going to have a metaphorical deadweight for a long time.”

Trapped in Silos

One way to lift this metaphorical deadweight would be to give patients control of their own records, which they could keep for themselves or submit to the research programs of their choice.

Some patients are already collecting this information on their own. Steven Keating, a 26-year-old graduate student in the Media Lab at the Massachusetts Institute of Technology, compiled over 70 gigabytes of his medical information and managed to catch the warning signs of his own brain tumor in time to have it removed, an operation he filmed.

But requiring patients to recover records from every single lab, doctor’s office, urgent care department, and hospital they’ve ever visited isn’t practical. A few medical centers and systems, including Beth Israel Deaconess Medical Center in Boston and Kaiser Permanente on the West Coast, are trying to make this process easier by adopting a system called OpenNotes, which allows patients to read what clinicians write about them.

The government has also been working on a similar tool, a so-called “Blue Button” that would give patients electronic access to their health information. The Blue Button is partly a functional tool that allows patients to download details about medical treatment, lab results, and insurance claims, and partly an alert: the circular blue button is supposed to remind patients that they have the right to access and correct their own medical information. At the moment, however, its use is limited to patients with government insurance or with private insurers such as UnitedHealthcare and Aetna.

The absence of universal patient access to health information has already slowed efforts to mine big data for insights. Sage Bionetworks, which helped develop two of the apps currently hosted on Apple’s ResearchKit platform, had originally hoped to incorporate patient health records. “In 2013, we thought that the Blue Button might lead to a really interesting way for people to provide the types of information that were already being collected at big hospitals,” Trister says. However, he says that Blue Button implementation has been “prolonged,” so Sage shifted its focus, asking “what other kinds of information could people have access to that’s easy in the event that it turns out that the Blue Button doesn’t come around?”

So far, it hasn’t, which led Trister and his colleagues at Sage to the sensor and survey data that can be easily collected from study participants’ mobile devices. As of last week, over 2,400 women had signed up for the breast cancer app, and over 14,000 men and women were enrolled in the Parkinson’s disease study, according to Trister. Trister says in an email that they are in the process of updating the apps to deliver research insights directly to participants via a newsfeed. And he holds out hope that they will one day link the information collected by sensors and surveys to clinical data. “In the next iteration, I’d be very excited to see integration with things like the electronic medical records,” Trister says. “Having a deeper clinical understanding of what is going on could be helpful.”

The Problems of Putting the Data Together

Apple’s ResearchKit launched in March, and just last month IBM announced that it will task its artificial intelligence system, Watson, with making sense of patient information connected to IBM’s health cloud.

ResearchKit could become another, larger silo of patient information, but Sim thinks that because ResearchKit is open-source—meaning anyone can access and adapt the underlying code—it may avoid the same fate as the EHRs locked into proprietary systems that can’t easily talk with each other. “It’s open source. It thought very carefully about informed consent and data sharing. I think it’s very respectful to patients and to the participants in this study,” Sim says. “All those things start to create a culture where we’re in it together. We’re asking questions together, we’re exploring together, we’re sharing together so that we can learn together.”

Sharing health information, though, raises a number of potential issues because these kinds of electronic data are primed for privacy invasion. Trister cautions that the data collected off of iPhone sensors could be misapplied by employers, for example, or insurance companies. “That’s where I’m very worried about going out and saying these are perfect surrogates for health. I want them to be, and I hope they are, but at the same time they are so easy to collect, particularly in the space of big data, that there could be—I don’t want to say nefarious reasons—but certainly people could start to collect information off of the phones that lead to all sorts of inferences about health in a way that really could impact somebody in a way it shouldn’t,” Trister says.

Even anonymizing data may not be enough. Genetic information, for example, can be re-associated with a person using publicly available information, according to work published two years ago in the journal Science by Columbia professor Yaniv Erlich. “The reason for our study was to highlight that there are gaps in genetic privacy, and we need to advance this discussion about how we can fix and bypass these gaps for the sustainability of genetic research,” Erlich says.
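
A toy example shows why removing names alone may not protect anyone. The Python sketch below is not the method used in the Science study, which worked with genetic markers. It swaps in a simpler and widely discussed technique: matching leftover details such as ZIP code, birth year, and sex against a public list that includes names. All records, names, and field choices here are invented for illustration.

```python
from collections import defaultdict

# Invented "anonymized" study records: names removed, but ZIP code,
# birth year, and sex (so-called quasi-identifiers) remain.
study_records = [
    {"zip": "02139", "birth_year": 1988, "sex": "M", "diagnosis": "condition A"},
    {"zip": "02139", "birth_year": 1975, "sex": "F", "diagnosis": "condition B"},
    {"zip": "94143", "birth_year": 1975, "sex": "F", "diagnosis": "condition C"},
]

# Invented public roster (think of a voter list) that includes names.
public_roster = [
    {"name": "Person One", "zip": "02139", "birth_year": 1988, "sex": "M"},
    {"name": "Person Two", "zip": "94143", "birth_year": 1975, "sex": "F"},
    {"name": "Person Three", "zip": "94143", "birth_year": 1975, "sex": "F"},
]

# Assumed set of fields shared by both data sets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def reidentify(records, roster):
    """Return (name, record) pairs where exactly one roster entry matches."""
    names_by_key = defaultdict(list)
    for person in roster:
        key = tuple(person[field] for field in QUASI_IDENTIFIERS)
        names_by_key[key].append(person["name"])

    matches = []
    for record in records:
        key = tuple(record[field] for field in QUASI_IDENTIFIERS)
        candidates = names_by_key.get(key, [])
        if len(candidates) == 1:  # a unique match re-attaches a name
            matches.append((candidates[0], record))
    return matches


for name, record in reidentify(study_records, public_roster):
    print(name, "->", record["diagnosis"])
```

Only one of the three records is re-identified here, because the other two either match nobody or match two people at once. The more detailed the data, the more often a match is unique.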

The Payoff

With tech giants like Apple and IBM developing new tools, healthcare may finally be moving toward treating the genetic or environmental cause of an illness, and not just its symptoms. That is the definition of “precision medicine,” according to geneticist David Altshuler, executive vice president of global research and chief scientific officer of Vertex Pharmaceuticals.

“In most cases of medical care, we don’t yet know the underlying cause, and the treatments we have to offer are symptomatic,” Altshuler writes in an email. Obama’s initiative takes aim at that first challenge, using large data sets to discover and understand a disease’s underlying cause. But it would be a mistake, Sim says, to anticipate a single, short path to success.

“We’re not going to be coming up with the miracle drug to cure cancer,” Sim says. “I think that the more proper analogy is the way software works, which is agile development,” where incremental changes accumulate over time. It may not seem like much has changed since last year, but changes over the past five years are more significant.

There are exceptions, of course. ResearchKit, Sim says, is a big increment. And putting IBM’s artificial intelligence system Watson to work looking for patterns in huge amounts of patient information may be a big step, too.

But Altshuler adds that collecting health data into one place is just the first step: even after completing the long process of identifying the underlying source of a disease, time and energy will then need to go into creating a cure. Still, many experts seem to think that the investment, if done right, will be worth it. “I cannot guarantee for every person that it will be transformative for his or her health,” Erlich says. “But in general we like to plant seeds, so if we cannot get the fruit, at least our kids or our grandchildren can get the fruit. This is part of it, that we want a better world.”