
Can big data save these children?

Humming away in a brick building near the banks of Pittsburgh’s Monongahela River, two servers filled with personal data hold the potential to improve the lives of the state’s most vulnerable children.

Harnessing what’s on these servers would represent an ambitious use of big data, one that could possibly safeguard thousands of kids from abuse and neglect and transform a foster care system in need of help. But tapping into that data could come at a cost.

On March 9, 1994, police found a toddler in a hotel bed in suburban Pittsburgh. She had been dead for more than 24 hours. The 2-year-old, Shawntee Ford, had 52 injuries, including a bruise so deep and massive that a pathologist could only compare it to injuries suffered by fatal car crash victims. Her organs were lacerated and had hemorrhaged. Her tiny left wrist had snapped.

Her father, Maurice Booker Sr., had beaten his daughter to death because she cried, the Pittsburgh Post-Gazette later reported. At his trial, Booker begged the state to kill him.

Just one month earlier, on Feb. 2, a family court judge in Pittsburgh had awarded Booker custody of his daughter, who was previously in foster care. But there was a critical oversight. County child welfare workers had never completed Booker’s background check nor had they provided records of his violent criminal history.

The child’s death had cascading effects on Allegheny County’s child protective services department. An investigation concluded that the county’s actions were partly responsible for the child’s murder. Meanwhile, Mary E. Freeland, Allegheny County Children and Youth Services director, resigned and moved to Florida. In 1996, a national search committee hired her replacement: Marc Cherna.

It was the mid-1990s, and the average caseworker struggled to manage more than 30 foster care cases, exceeding state-mandated safety limits. It was common for foster children to age out of the system on their 18th birthday without reuniting with their biological families. These children were disproportionately likely to end up homeless, in jail or addicted to drugs. The public had little trust in the agency.

“When I came here, we were known as a national disgrace,” said Marc Cherna, who had just arrived from New Jersey’s Department of Human Services.

Children in foster care face a grim set of statistics. Just half of foster children graduate from high school by age 18, and fewer than 10 percent earn a Bachelor’s degree. One national study found that about one in five foster youth will become homeless after age 18, and one in four will spend time behind bars less than two years after aging out of foster care.

When Cherna first asked Allegheny County residents to share their thoughts on their foster care system, people packed into standing-room-only meetings to participate. They hurled insults at Cherna and his staff, calling them “child-snatchers,” he said. “You took my kids,” someone told him. “Now I can’t get them back.”

Restoring systems and trust

Cherna had a theory. He believed society’s most complex problems could be tackled with the least forgiving of tools: data.

Before 1996, Allegheny County used computerized systems to keep track of bill payments and basic client information. But only paper records existed for the more than 3,000 kids in foster and group homes. It was an outdated system.

When Cherna arrived, he embarked on a series of reforms. His team whittled away at the county’s five-year backlog of adoption cases, with the help of a law firm that worked on the project pro bono.

They built a giant server that ultimately united 29 individual systems onto a single platform, among them SNAP food stamps, public safety, housing and unemployment benefits. A data warehouse, they called it. It meant caseworkers could see whether their clients lived in public housing, received unemployment benefits and food stamps, or had grown up in foster care, all without spending time running down records from other offices.

Meanwhile, Cherna attended churches and town hall meetings and visited the local NAACP chapter. When local talk radio hosts put him on air, phones rang off the hook. He set up a phone line and staffed it with experienced caseworkers. He distributed a parental book of rights to families whose children were in foster care. And slowly, he started to earn the public’s trust.

At the same time, in a cost-cutting move, the county’s Office of Children and Youth Services, which oversaw foster care, became part of a brand new department, the Department of Human Services. Cherna was promoted to direct it.

Cherna brought to Allegheny County a culture of transparency and openness, said David Sanders, who oversees systems improvement for Casey Family Programs, one of the nation’s largest foster care organizations. He embraced data and encouraged staff to do the same, he added, laying the groundwork for more evidence-based decision-making.

By 2013, Allegheny County Department of Human Services served more than 200,000 residents on an $800 million budget.

From 2004 to 2014, the number of children in foster care nationwide dropped by about 20 percent, according to the most recently available federal data. Kim Stevens, project director for Advocates for Families First, which supports adoptive and foster parents and kids, attributes improved numbers to better government data that track who is in foster care and more agencies that share and implement best practices.

In Allegheny County, since Cherna arrived, the total number of children in foster care has seen a much sharper decline – 61 percent. The average caseworker now manages far fewer families – about a dozen each.

Building a better crystal ball

Still, Cherna says he’s only scratched the surface when it comes to improving the foster care system. The next critical step, he believes, relies on a concept known as predictive analytics: using data to anticipate trouble before it happens.

To keep kids out of foster care, to intervene and shore up families before they collapse, requires the ability to predict a child’s future. A crystal ball. Cherna found his solution in the form of a data model created by an economist half a world away. New Zealand economist Rhema Vaithianathan built two computer models that can predict a child’s likelihood of entering foster care.

One predicts at birth the chances that a child will be abused or neglected and ultimately end up in foster care. The model does this by weighing a number of factors: Among them, does the child live with one or both parents? Has the family undergone prior child welfare investigations? Does the family receive public benefits? The second model uses similar information but activates only when a call is placed about a child’s safety. But – and this is critical – that call often doesn’t come until after the abuse has occurred.

About two decades ago, Marc Cherna launched reforms that changed the course of the child welfare agency based in Pittsburgh, Pennsylvania. Today, the number of children in foster care has dropped by 61 percent, and roughly two-thirds of children in the system reunify with their families. Moving forward, his team weighs how to launch a system that would use data and predictive analytics to support families and prevent children from needing to enter the system. Photo by Mike Fritz

The first model was deemed too much too fast. It’s a version of this second, more incremental model that Cherna is hoping to unveil this spring.

Here’s how it would work. Each year, Allegheny County child welfare dispatchers receive more than 10,000 phone calls from people who suspect child abuse. These calls typically come from teachers, neighbors, doctors and family members. Under the new plan, as soon as a call came in, the program would crunch a number of data points, weighing different indicators, to produce a risk score. Indicators in an early version of the model include parents’ age, race, criminal history and marital and welfare status.

The call would then be evaluated by a team of screeners. During the evaluation process, the team would consider the nature of the call itself, but also the degree of risk. A high risk score could single out a call that might otherwise be screened out. If the case is flagged as concerning, workers could launch an investigation into the child’s living situation. The fundamental question: is the child safe in his or her home?
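The screening flow described above can be sketched roughly in Python. Everything in the sketch is an assumption made for illustration: the three indicators are borrowed from the questions the at-birth model is said to weigh, while the weights, the intercept, the 0.5 threshold and the names `CallRecord`, `risk_score` and `screen` are hypothetical and do not come from the county’s model.

```python
import math
from dataclasses import dataclass


@dataclass
class CallRecord:
    """Hypothetical data points pulled for one incoming call."""
    prior_investigations: int
    receives_public_benefits: bool
    single_parent_household: bool


# Invented weights for illustration only; a real model would be fit to data.
INTERCEPT = -3.0
W_PRIOR = 0.8
W_BENEFITS = 0.5
W_SINGLE_PARENT = 0.4


def risk_score(record: CallRecord) -> float:
    """Combine weighted indicators into a score between 0 and 1 (logistic)."""
    z = (
        INTERCEPT
        + W_PRIOR * record.prior_investigations
        + W_BENEFITS * record.receives_public_benefits
        + W_SINGLE_PARENT * record.single_parent_household
    )
    return 1.0 / (1.0 + math.exp(-z))


def screen(record: CallRecord, screener_flags_call: bool,
           threshold: float = 0.5) -> str:
    """Investigate if the screeners flag the call OR the score is high."""
    if screener_flags_call or risk_score(record) >= threshold:
        return "investigate"
    return "screen out"


low = CallRecord(0, False, False)
high = CallRecord(5, True, True)
print(screen(low, screener_flags_call=False))
print(screen(high, screener_flags_call=False))
```

In this sketch a high score can only pull a call into investigation that would otherwise be screened out; it never overrides screeners who already consider the call concerning.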

When Cherna’s team applied an early version of the model to historic data, what they found was startling. Among children with the highest risk score, 40 percent were removed from their homes less than a year later. Among those with the lowest risk score, the likelihood of entering foster care was dramatically lower, at 0.3 percent. Cherna’s team had found their crystal ball.
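A back-test of this kind amounts to grouping historical records by risk score and counting how often a child in each group was later removed from the home. The Python sketch below shows that arithmetic on toy data. The records are synthetic and sized only so the rates echo the 40 percent and 0.3 percent figures reported above, and the band labels and the `removal_rate_by_band` name are hypothetical.

```python
from collections import defaultdict


def removal_rate_by_band(records):
    """Return {score_band: share of children removed within a year}.

    `records` is an iterable of (score_band, was_removed) pairs.
    """
    totals = defaultdict(int)
    removed = defaultdict(int)
    for band, was_removed in records:
        totals[band] += 1
        removed[band] += int(was_removed)
    return {band: removed[band] / totals[band] for band in totals}


# Synthetic history: 10 highest-band children (4 removed) and
# 1,000 lowest-band children (3 removed).
history = (
    [("highest", True)] * 4 + [("highest", False)] * 6
    + [("lowest", True)] * 3 + [("lowest", False)] * 997
)

rates = removal_rate_by_band(history)
for band, rate in rates.items():
    print(f"{band}: {rate:.1%}")
```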

Not a done deal

But implementing the plan is not a done deal. It still must be finalized, then peer reviewed. Plus, if what happened in New Zealand is any indicator, Cherna’s team could face some real pushback.

In New Zealand, work on applying the tool came to a standstill after questions were raised. Incorporating factors like race and income into child care decisions could stigmatize children and families, some said. Others opposed the use of a control group in the testing process that would result in some children receiving assistance based on their risk scores but not others. (“Not on my watch, these are children not lab rats,” read handwritten notes by New Zealand’s Social Development Minister Anne Tolley, according to The Press newspaper in July.)

Vaithianathan acknowledged the concerns. “It’s quite controversial,” she said.

A study built on New Zealand’s at-birth model was published in May 2015 in the American Journal of Preventive Medicine. Researchers raised concerns that the use of predictive analytics to assess child abuse risk could over-represent minority groups and families on welfare to a degree “that is disproportionate to their true share of maltreatment.” And that “might have a ratchet effect that feeds a cycle of bias in surveillance.”

Predictive risk models “appear promising based on prototype research, but carry ethical risks and warrant careful feasibility study and trialing,” reads the study, which analyzed health, welfare and family criminal history data for about 94 percent of children in New Zealand from 2000 to 2012 and sought to determine whether those considered high risk had a higher likelihood of abuse by age 2.

Jay Stanley, a senior policy analyst for speech, privacy and technology at the American Civil Liberties Union, cautions that while government agencies that offer social services are mission-bound to help people, the data must be applied wisely.

“The worst case scenario is that the score is just reflecting the prejudices or beliefs of whoever scored the algorithm,” he said. “It’s very important that these things be transparent so they can be scrutinized by the public and experts.”

In Allegheny County, attorney Scott Hollander has for 16 years represented children in the county’s juvenile courts through his advocacy group, Kids Voice. And he has questions, too: Could a risk score label a child for life? Would certain groups be affected disproportionately?


But supporters say the concerns pale in comparison to the potential. Catherine Volponi, a lawyer and director of a parental rights advocacy group in Allegheny County, said the model is, by its very nature, profiling people. But if it’s used to provide services to families that need them, it would benefit the at-risk kids, she said: “You’re trying to build capacity within the family so that they never come into the child welfare system.”

Hollander said he’s often introduced to families when parents are just on the brink of losing their parental rights. If intervention could occur at an earlier stage, he said, “perhaps there would be a better outcome for that family.”

Emily Putnam-Hornstein is one of the directors for the Children’s Data Network at the University of Southern California and has worked with Vaithianathan to develop Allegheny County’s predictive analytics model.

“We have 6 million children reported for abuse or neglect, and how you make triaging decisions early on absolutely impacts outcomes for that child and family,” she said. The use of predictive analytics in child welfare, she said, could “change the flow of children into the system.”

Sanders draws parallels between child welfare and the airline industry. In the 1990s, the airline industry played a vicious game of whack-a-mole, Sanders said. An airplane crashed, and people were killed or injured. A retrospective review investigated what happened, and a policy or practice was considered or implemented to address what went wrong. That process repeated itself hundreds of times.

Then, officials realized data could help them predict when problems might arise and change decisions or support for crews or equipment, resulting in fewer crashes, “and that’s what’s happened,” Sanders said.

“They are able to use data to anticipate where a problem might occur versus waiting for a problem to occur and then reviewing it. That’s the kind of work that can happen in child protection.”

Already, a few jurisdictions nationwide use data to help child welfare workers before a crisis unfolds, Sanders said. For example, in 2013, child welfare workers in Hillsborough County, Florida, responded to the deaths of nine children who already had open child welfare cases by developing the Rapid Safety Feedback system. Like Allegheny County’s plan, the system uses risk scores to prioritize calls about child abuse and make decisions on how best to support the child and family.

Since the system launched, no child has died after being referred to child welfare workers, said Bryan Lindert, who oversees the system for Eckerd, the non-profit group that uses data to help manage the county’s child protection cases.

A recent report from the Commission to Eliminate Child Abuse and Neglect Fatalities praised the system, saying it demonstrated how “the intricate dance between data and practice can keep an important sector of children safe.” And there’s interest in replicating these efforts in Alaska, Connecticut, Illinois, Oklahoma and Maine.

The Allegheny County plan has at least one major difference. It would take the Florida system a step further, pulling from law enforcement, health and education data, in addition to child welfare case history, Sanders said.

The new frontier

Two decades after Shawntee Ford’s murder, incidents of unthinkable abuse still occur in Allegheny County.

In 2013, a committee reviewed nine instances where four children died and five were severely injured due to abuse and neglect.

All of the children were under five. In one instance, twin brothers died in a fire. In another, a boy died after bleach scorched his body. One boy died as a result of blunt force trauma.

And of the nine children whose cases were investigated, four had not been involved with the county’s child welfare office in the prior 16 months.

Last fall, while working with old data from 2007 to 2014, Cherna’s team examined 41 cases where children died or nearly died as a result of abuse and neglect, including those nine children from the 2013 report. Had the at-birth model been applied, his team concluded, 30 of the children would have been flagged high risk.

Could that designation have led to inspection by child welfare workers? And could that inspection have saved these children’s lives? It’s hard to say, but it’s certainly possible.

The potential, Cherna said, “is just enormous. This is the new frontier.”


Graphics by Vanessa Dennis