As “Climate Change” Fades from Government Sites, a Struggle to Archive Data
When the Environmental Protection Agency’s website underwent an overhaul of climate change information on a Friday in late April, Toly Rinberg and Andrew Bergman, both Harvard Ph.D. students in applied physics, set off to figure out what was gone.
Sitting in their shared Washington, D.C. apartment, they started a spreadsheet to track the changes. Suddenly missing, they noticed, were scores of pages dedicated to helping state and local governments deal with climate change. The EPA site where those pages lived, titled “Climate and Energy Resources for State, Local, and Tribal Governments,” would disappear for three months, only to come back in July without the word “climate” in its title. The new website now focuses only on energy policy and resources, and is down to 175 pages from 380.
Also affected by the April changes was a website on the Obama-era Clean Power Plan, which had previously included fact sheets on carbon pollution from power plants and the impact of those emissions on different groups across the country. In its place was a new site featuring a picture of President Donald Trump signing an executive order aimed at dismantling the Clean Power Plan.
The roommates had taken a leave of absence to help lead the Environmental Data & Governance Initiative (EDGI), a mostly volunteer network formed after President Trump’s election that monitors about 25,000 federal web pages each week for changes. They found references to climate change removed from seven federal agencies’ websites, with some of the biggest changes happening on the EPA site, where terms like “greenhouse gases,” “carbon” and “climate change” have been reduced and replaced with terms like “sustainability,” “emissions” and “air pollution.” And this week, they reported that in September, the agency stripped links from its “Greening EPA” website, which tracks the agency’s work to reduce the environmental impact of its own facilities and operations. Links to the agency’s climate change adaptation plans were removed, according to the report, as was language around a more specific EPA goal to purchase renewable energy to cover all of its nationwide electricity use.
“Things are meant to look like they’re less about climate change and more about environment or sustainability,” said Bergman, who called the removal of information “substantial.” “When the government is doing something systematically to some extent, the public should know about it and take them seriously and respond as they see fit.”
Nearly one year into the Trump presidency, groups like EDGI are grappling with how to preserve massive amounts of federal climate information that researchers fear is at risk. While information is not being outright destroyed, researchers and environmental groups say that changes to federal websites have made information about climate change less accessible to the public, reflecting what critics deride as an effort by the Trump administration to censor dialogue about climate change, downplay its risks and push back against scientific consensus.
Since taking office, President Trump has made undoing Obama’s environmental legacy a central part of his agenda. In June, the president announced that he would withdraw the United States from the Paris climate accord, and to date, his EPA has moved to delay or roll back more than two dozen environmental regulations, including the Clean Power Plan, which sought to limit emissions from coal-fired power plants. Critics have denounced changes to federal sites as just one part of this broader effort.
Administration officials have downplayed the changes, describing them as part of the routine process that occurs when the federal government transitions from one administration to another. The EPA did not respond to multiple requests for comment, but in an April 28 statement, the agency said that changes to its site were designed to remove outdated language and “reflect the agency’s new direction under President Donald Trump and Administrator Scott Pruitt.”
The EPA has discretion in what it displays online, according to environmental law experts. Still, Elena Saxonhouse, a senior attorney for the Sierra Club, argued that removing references to climate change violates the agency’s public duty to be objective. In January, the Sierra Club submitted a Freedom of Information Act (FOIA) request seeking a series of EPA records, many pertaining to climate change, triggering a requirement for the agency to preserve them while the FOIA request is pending.
“They haven’t pointed to any facts or research that has changed,” said Saxonhouse. “They’re just removing information from the website, and that’s what makes it seem so politically driven.”
EDGI has also submitted FOIA requests asking the EPA for protocols guiding changes to its website and is currently working with the nonprofit Climate Central to build a “risk map” to identify information that might be most vulnerable to budget cuts. Since forming in November, EDGI has taken a leading role in the effort of a makeshift coalition of researchers, scientists and concerned citizens to not just monitor federal websites, but archive data from them for future generations.
The effort has seen thousands of volunteers come together in the months since President Trump’s inauguration. At “data rescue” events across the country, they uploaded webpages and federal datasets onto different archiving sites like Data Refuge. Major cities like Chicago and Boston have also created their own websites to host basic climate change information.
But the amount of information that could be archived is overwhelming. For example, according to its website, the National Oceanic and Atmospheric Administration (NOAA) holds more than 20 petabytes of data. One petabyte is the equivalent of about 20 million four-drawer filing cabinets filled with text.
“There was such grassroots enthusiasm to save everything related to climate change because it all might be endangered,” said Jefferson Bailey, the director of web archiving for the Internet Archive, a digital library that has partnered with the data groups. “Saving everything forever is not feasible and also not something you want to attempt to do.”
Coordination has been another obstacle. When organizers over the summer tried to track where back-up copies lived, they discovered a problem. Volunteers had often downloaded the same material, and there had been no system to verify that each version hadn’t been altered — even accidentally — from the original.
“It was one of those things that was in the back of our heads, but we were like, ‘first let’s get the data and we can worry about all the ways it can be served and referenced later,’” said Jeff Liu, a Ph.D. candidate in civil engineering and computation at MIT who helped run a data rescue event in Boston in February. “We didn’t realize how much of a problem it would be.”
Volunteers, including members of EDGI, have partnered with the technology firm Protocol Labs and qri.io, an open-data start-up, to work on a solution called Data Together, a project aimed at creating a model for communities to coordinate data storage. By using a distributed storage system where files receive a unique signature based on their content, institutions could verify that they hold authentic copies of information. Universities and archives would be able to volunteer server space and share files on the system.
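The idea of a signature derived from a file's content can be illustrated with a short sketch. It assumes SHA-256 as the hash; the article does not say which algorithm Data Together uses. The function names and the sample data are hypothetical and exist only for illustration.

```python
import hashlib


def content_signature(data: bytes) -> str:
    """Return a hex SHA-256 digest that depends only on the bytes given.

    SHA-256 is an assumption made for this sketch, not a detail from the project.
    """
    return hashlib.sha256(data).hexdigest()


def is_authentic_copy(copy: bytes, published_signature: str) -> bool:
    """Check a held copy against the signature recorded for the original."""
    return content_signature(copy) == published_signature


# Hypothetical dataset contents, invented for this example.
original = b"station,year,temp_c\nA1,2016,14.8\n"
signature = content_signature(original)

# A byte-for-byte copy held by another institution.
faithful_copy = bytes(original)

# A copy with a single value changed, whether by accident or not.
altered_copy = original.replace(b"14.8", b"14.3")

print(is_authentic_copy(faithful_copy, signature))
print(is_authentic_copy(altered_copy, signature))
```

Because the signature is computed from the bytes alone, two institutions holding the same file compute the same value without coordinating. Any alteration, even an accidental one, produces a different value, and identical signatures reveal when volunteers have downloaded the same material twice.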
“We’re trying to think about data as something we all hold together,” said Brendan O’Brien, who works on Data Together through his start-up. “As a community, we can decide what is and is not important for archival.”
Some, however, worry about the limits volunteers face without the expertise of federal researchers behind the data.
“At the end of the day, we have to rely on the federal scientists to publish the data using good standards and practices,” said Maxwell Ogden, the director of Code for Science and Society, a nonprofit involved in government data archiving.
Sayeed Choudhury, the associate dean for research data management at Johns Hopkins University, explained that even if data rescue volunteers could “magically gather all of the data,” it would not be enough.
“It’s understanding the context, understanding the methods that are used to create it and that may be used to interpret it,” he said. “Those may or may not be sitting alongside the data.”
Some saw the initial alarm to catalog federal data as an overreaction. H. Sterling Burnett, a senior fellow on environmental policy at the Heartland Institute, a libertarian think tank that rejects the scientific consensus on climate change, called the effort “all smoke and no fire.” Burnett said that the EPA has no obligation to discuss climate change on its website.
“Just because it’s not on the website doesn’t mean it’s going to disappear,” he said. “There’s no evidence he [Trump] was going to scrub the computers.”
Yet some in Congress are calling for the administration to explain its actions. Last month, seven Democratic senators wrote a letter to Pruitt, asking that he reinstate the original version of the Climate and Energy Resources website. The senators said the changes appeared “designed to censor dialogue about climate change in the United States,” and asked the agency to explain each revision to the site by next week.
Volunteers also point out that changes involving information about climate change have touched multiple agencies. In January, a page titled “Climate Action Report” with links to reports on U.S. progress on international climate goals was removed from the State Department’s website. On the Department of Transportation’s Federal Highway Administration’s website, references to “climate change” and “greenhouse gases” were replaced with terms like “sustainability” and “emissions.”
“Those big data sets held at agencies usually are well curated but if the agency website goes away, it’s not clear what would happen to that data,” said Ruth Duerr, a former data manager at the National Snow and Ice Data Center, a government-backed research agency in Colorado. “The data may exist, it may be preserved under curatorial care, but if there’s no way for external users to access it, the public won’t know it exists.”