This story was originally published by STAT News on Dec. 21, 2020.
Stanford found itself in hot water last week after deploying a faulty Covid-19 vaccine distribution algorithm. But the fiasco offers a cautionary tale that extends far beyond Stanford’s own doors — and holds crucial lessons as the country prepares to confront complex decisions about who gets the vaccine, when, and why.
At the center of the debacle was a rules-based formula designed to determine the order in which the thousands of medical workers at Stanford should be vaccinated. The tool took into account employee-based variables like age, job-based variables, and public health guidance, according to MIT Technology Review. But flaws in that calculation meant hospital administrators and other employees working from home were toward the front of the line, while only seven of Stanford’s 1,300 medical residents made the list.
Experts told STAT that what went wrong appears to be a story of unintended consequences, which often arise at the intersection of human intuition and artificial intelligence. Here are a few key points to consider about the incident and the broader issues it reflects.
Blame the humans, not the algorithm
In its earliest attempts at explaining the problem, Stanford’s administrators laid blame with the algorithm. Despite best intentions, they explained, the algorithm had made a mistake that the humans had to answer for.
This is a bit like blaming the hammer for missing the nail.
Experts told STAT that this was a human problem from start to finish. Asserting otherwise compounds the problem by implicating all algorithms without understanding how the use of this one went awry.
“To me this appears to be a case of well-meaning humans wanting to be guided by data and making an honest mistake,” said Nigam Shah, a professor of bioinformatics at Stanford. “We should use this as a learning opportunity rather than to stoke more outrage.”
Critically, Stanford’s algorithm was not powered by machine learning, in which the computer learns from the data without explicit programming by humans. Rather, it was rule-based, as explained by MIT Technology Review, which means that humans wrote out a set of instructions that the tool simply acted upon.
The inescapable conclusion is that something went wrong with those instructions. But what was it? And why weren’t those problems caught and corrected before the tool was put into use? Those are fundamental questions for the people involved, not the tool they used.
Julie Greicius, Stanford’s senior director of external relations, did not respond to questions from STAT including what went wrong with the algorithm, but said the university quickly revised its vaccine distribution plan to prioritize health workers including residents and fellows. Stanford also created a new committee that would consider the interests of all of its stakeholders, she said.
“We are optimistic that all our frontline healthcare workers will be offered the vaccine within the next two weeks,” Greicius added.
Beware of structural bias in the data
In building an algorithm to decide which staff to protect first, Stanford would have had to decide which was more important: preventing deaths from Covid-19 or stopping infections from the virus. Depending on which outcome they wanted to prevent, the algorithm would take a variety of important considerations into account (including age, job title, and theoretical risk of exposure to Covid-19) but might weigh them differently.
The algorithm seems to have been seeking, overall, to avoid death rather than infection. For that reason, it would give extra weight to factors like age and less weight to factors like theoretical exposure.
Complicating matters further, the tool appears not to have accounted for workers’ actual exposure to the virus and changes to hospital rules and protocol during the pandemic, several experts and one Stanford fellow reasoned.
“I think it was designed with the best intentions,” said Jeffrey Bien, a Stanford oncology fellow. “But there are hard decisions to make. If you’re designing the algorithm from the standpoint of: prevent as many deaths as possible, that would be different than trying to prevent as many infections as possible.”
Take, for example, a 68-year-old chief clinician who normally takes care of patients in the hospital, but is seeing patients remotely during the pandemic. Their age and normal job requirements would theoretically put the clinician at high risk of the virus. But given the circumstances, the clinician would have virtually no physical interaction with potential Covid-19 patients and little resulting exposure to the virus.
On the flip side, medical residents, fellows, and trainees would be largely considered at lower risk because of their age and job requirements during non-pandemic times.
But accounting for the fact that these younger residents are now interacting with dozens of Covid-19 patients each day renders that theoretical risk useless. Far more important is their actual risk — the real likelihood, based on these interactions, that they will become infected with Covid-19.
Still, if Stanford’s algorithm was indeed programmed to avoid deaths, many frontline staffers — despite their disproportionately high risk of exposure to Covid-19 — would find themselves at the back of the line when it came time to distribute the vaccine because of their age.
“There’s a difference between your theoretical population and the population you actually run the (algorithm) on,” said Andrew Beam, an artificial intelligence expert and professor of epidemiology at the Harvard T.H. Chan School of Public Health. “You’re right to think older people are at risk, but if those older people aren’t actually taking care of Covid patients, you have to account for that, and that seems to be the fundamental mismatch here.”
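To make the trade-off concrete, here is a minimal sketch of how a rules-based score can reorder the same two people depending on which goal it is tuned for. It is not Stanford's formula, which has not been published in this article. The weights, the scoring function, and the resident's age of 29 are assumptions chosen only for illustration; the 68-year-old clinician working remotely comes from the example above.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    age: int
    sees_covid_patients: bool  # actual, current exposure (hypothetical field)


def priority_score(worker, age_weight, exposure_weight):
    """Higher score means earlier in line. Weights are illustrative only."""
    age_points = worker.age / 100
    exposure_points = 1.0 if worker.sees_covid_patients else 0.0
    return age_weight * age_points + exposure_weight * exposure_points


def rank(workers, age_weight, exposure_weight):
    """Return worker names ordered from first in line to last."""
    ordered = sorted(
        workers,
        key=lambda w: priority_score(w, age_weight, exposure_weight),
        reverse=True,
    )
    return [w.name for w in ordered]


workers = [
    Worker("remote_clinician", 68, False),
    Worker("frontline_resident", 29, True),  # age is an assumption
]

# Tuned to avoid deaths: age dominates, exposure barely counts.
death_focused = rank(workers, age_weight=1.0, exposure_weight=0.1)

# Tuned to avoid infections: actual exposure dominates.
infection_focused = rank(workers, age_weight=0.1, exposure_weight=1.0)

print(death_focused)
print(infection_focused)
```

With the first set of weights the remote clinician ranks ahead of the resident; with the second, the order flips. The rules are the same in both cases. Only the human choice of weights changed.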
Validate algorithms before deploying them
Because this was a straightforward rules-based algorithm, Beam said, Stanford’s developers may have assumed it would produce the result they intended. After all, they understood all the factors the algorithm was considering, so of course it would fairly prioritize people for vaccination.
But the way to know for sure is to test the algorithm before it is deployed.
“They could have sent out an email saying, ‘Here is our vaccine allocation tool, would you mind putting in your job, age and level of training?’ Then they could very quickly see what the allocation would look like,” Beam said. “They would have said, ‘Oh my God, we will have vaccinated five of our 1,300 residents.’”
Such auditing is a crucial step in the development of AI, especially in medicine, where unfairness can undermine a person’s health as well as their trust in the system of delivering care. The insidious thing about bias is that it is so difficult for people to see, or police, within themselves. But AI has a way of making it plain for all to see.
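The kind of dry run Beam describes might look something like the sketch below: rank a roster with the proposed rule, take as many people as there are doses, and count how many from each role made the cut. Everything here is hypothetical, including the roster, the field names, the age-only rule, and the number of doses. None of it reflects Stanford's actual data or tool.

```python
from collections import Counter


def audit_allocation(roster, score, doses):
    """Dry run: rank everyone, take the top `doses`, and report
    (selected, total) for each role in the roster."""
    ranked = sorted(roster, key=score, reverse=True)
    selected = Counter(person["role"] for person in ranked[:doses])
    totals = Counter(person["role"] for person in roster)
    return {role: (selected[role], totals[role]) for role in totals}


def age_only_score(person):
    """A deliberately naive rule that ignores actual exposure."""
    return person["age"]


# Hypothetical roster: 20 younger residents on Covid wards,
# 10 older administrators working away from patients.
roster = (
    [{"role": "resident", "age": 28 + i % 5, "on_covid_ward": True}
     for i in range(20)]
    + [{"role": "administrator", "age": 50 + i % 10, "on_covid_ward": False}
       for i in range(10)]
)

report = audit_allocation(roster, age_only_score, doses=10)
for role, (chosen, total) in report.items():
    print(f"{role}: {chosen} of {total} selected")
```

On this made-up roster, the age-only rule gives all 10 doses to administrators and none to the 20 residents. A summary like this, produced before any vaccine is distributed, puts the mismatch in front of the people who wrote the rules while there is still time to change them.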
“The problem with computers,” Beam said, “is that they do exactly what you tell them to do.”