Back in early 2008, as I headed off to a conference at Georgia Tech, I wrote a post for Idealab headlined "Computation + Technology = ?"
Two recent developments suggest that we're starting to find answers to that question -- and more importantly, that there's a growing number of people trying to find these answers. Duke University has released an interesting report, and a group of journalists and technologists has begun meeting in Silicon Valley to address challenges that journalists and technologists might tackle together.
The February 2008 conference at Georgia Tech, entitled "Journalism 3G: The Future of Technology in the Field," introduced many of its 200+ attendees to the idea of computational journalism -- applying computer programming to the challenges facing journalism, journalists and a society that needs original reporting to provide information for citizens in a democracy. Two of the other attendees were the first Knight News Challenge "programmer-journalist" scholarship winners: computer programmers enrolled in the master's program at the Medill School of Journalism at Northwestern University.
When the John S. and James L. Knight Foundation awarded the scholarship grant to Medill in 2007, the idea of teaching journalism to technology professionals seemed odd to many people -- both journalists and technologists. But now there seem to be a lot of initiatives aimed at addressing the same set of issues.
Duke University, through its DeWitt Wallace Center for Media and Democracy, built on the ideas generated by the Georgia Tech conference in a couple of ways. First, the center created -- and has now filled -- a faculty position specializing in the field. The new Knight professor of the practice of journalism and public policy is an old friend, Sarah Cohen, previously database editor for The Washington Post, where she contributed to countless enterprise reporting projects, including a Pulitzer-winning investigation of child welfare agencies in the District of Columbia. Besides teaching courses, Cohen is expected to lead the development of open-source reporting tools designed to make it easier for journalists to discover and research stories.
Earlier this month, Duke released "Accountability Through Algorithm: Developing the Field of Computational Journalism," a report based on a workshop held in July. The report is full of interesting ideas for applying technology to journalists' challenges. Here are a few of them.
Information Extraction, Integration and Visualization
A new set of tools would help reporters find patterns in otherwise unstructured or unsearchable information. For instance, the Obama administration posted letters from dozens of interest groups providing advice on issues, but the letters were not searchable. A text-extraction tool would allow reporters to feed PDF documents into a Web service and return a version that could be indexed and searched. The software might also make it easy to tag documents with metadata such as people's names, places and dates. Another idea is to improve automatic transcription software for audio and video files, often available (but not transcribed) for government meetings and many court hearings.
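To make the tagging and indexing idea concrete, here is a minimal Python sketch. It assumes the text has already been extracted from the PDFs by some other tool. The function names (`tag_dates`, `build_index`) and the sample file names are hypothetical and do not come from the Duke report, and the date pattern only handles the "Month D, YYYY" style.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
# Matches dates written like "January 15, 2009" (an assumed, simplified format).
DATE_RE = re.compile(rf"\b(?:{MONTHS})\s+\d{{1,2}},\s+\d{{4}}\b")


def tag_dates(text):
    """Return the distinct dates found in text, in order of first appearance."""
    return list(dict.fromkeys(DATE_RE.findall(text)))


def build_index(docs):
    """Map each lowercase word to the set of document names that contain it."""
    index = {}
    for name, text in docs.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index.setdefault(word, set()).add(name)
    return index


# Hypothetical documents standing in for extracted PDF text.
docs = {
    "letter_a.pdf": "Our energy proposal was sent on January 15, 2009.",
    "letter_b.pdf": "We met on December 3, 2008 and again on January 15, 2009.",
}
index = build_index(docs)
dates_b = tag_dates(docs["letter_b.pdf"])
```

A real tool would also need name and place recognition, which is considerably harder than matching dates with a pattern.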
The report also suggests developing "lightweight" templates that enable journalists to create data visualizations based on XML or spreadsheet files, and tools that help them organize their findings in a timeline. As the report points out, reporters working on in-depth projects often create chronologies in lengthy spreadsheets or text documents. A better tool would let journalists "zoom in, tag events for publication, turn on and off players or events and otherwise use them effectively," the report says.
The Journalist's Dashboard
Here the Duke report suggests that journalists need "a tool with which to spot what's new and what's important in the flow of daily information." A dashboard could include:
- A news alert system similar to Google News that scanned only the sources specified by a beat reporter, identifying the originating publisher and the number of other sites that linked to the item;
- A tool helping journalists keep track of their sources, including news items about each person and citations from the reporter's own archived stories mentioning him or her;
- A "trends and outliers" tool that might generate an alert any time a data source reveals a significant change in a piece of data -- say, a surge in monthly expenditures by a government agency, or a flurry of crime reports in a short period of time;
- A timeline generator that would display incidents related to a particular story as well as coverage on blogs and news sites;
- An annotator that would allow a reporter to see past stories, images and contextual information while writing -- for instance, by displaying background information about the person being written about. (This idea bears some similarity to the EasyWriter tool developed this spring by students in a Northwestern University journalism/technology class.)
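The "trends and outliers" item in the list above lends itself to a small illustration. The Python sketch below flags any value that sits far from the average of everything that came before it. The function name `find_surges`, the three-standard-deviation threshold and the sample spending figures are all assumptions made for illustration; the Duke report does not specify a method.

```python
from statistics import mean, stdev


def find_surges(series, min_history=3, threshold=3.0):
    """Return (index, value) pairs that deviate sharply from earlier values.

    A value is flagged when it lies more than `threshold` standard
    deviations from the mean of all values that precede it.
    """
    alerts = []
    for i in range(min_history, len(series)):
        history = series[:i]
        mu = mean(history)
        sigma = stdev(history)
        # With no variation in the history there is no scale to compare against.
        if sigma > 0 and abs(series[i] - mu) / sigma > threshold:
            alerts.append((i, series[i]))
    return alerts


# Hypothetical monthly expenditures for an agency, in thousands of dollars.
monthly = [100, 102, 98, 101, 99, 250, 100]
alerts = find_surges(monthly)
```

With this sample data the only alert is the sixth month, where spending jumps to 250. A production tool would probably want seasonal adjustment and a way for the reporter to tune the threshold per data source.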
Philip Bennett, formerly managing editor of The Washington Post and now a professor at Duke, is quoted in the report describing a new approach to investigative projects that engages and taps into reader interest. Instead of seeing long-term investigative projects ending with publication of a package of stories, the initial investigation could serve as just the midpoint in the reporting process. Stories could be presented in ways that enabled each reader to explore the story in layers, giving each a "differentiated news experience depending on her interests." Bennett suggests that a series like the Post's Pulitzer-winning investigation of Walter Reed Army Medical Center could have become a focal point for readers interested in veterans' issues. "If the paper could nurture a community of interest around the story, readers might use the site as a discussion place for the action that follows from the investigation," the report says.
Applying 'Sensemaking' Approaches From Other Fields
The Duke report points out that academic researchers are wrestling with many of the same challenges that journalists face and suggests that their solutions could be helpful. For instance, Georgia Tech researchers have built a tool called Jigsaw that creates visualizations to display connections between individuals and entities mentioned in different documents -- something every investigative reporter would lust for. And the Muninn Project, an interdisciplinary research project focusing on World War I records, is seeking to convert images of handwritten forms into machine-readable databases -- a problem faced by journalists in many states that allow political candidates to file handwritten campaign contribution reports.
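The idea of surfacing connections between people and entities across documents can be sketched in a few lines of Python. This is not how Jigsaw works internally; it is a simplified stand-in that counts how often two known names appear in the same document. The function name `co_mentions`, the entity list and the sample documents are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_mentions(documents, entities):
    """Count how many documents mention each pair of entities together."""
    pairs = Counter()
    for text in documents:
        # Plain substring matching; a real tool would need entity recognition.
        present = sorted(e for e in entities if e in text)
        pairs.update(combinations(present, 2))
    return pairs


# Hypothetical documents and a hand-built list of names to look for.
documents = [
    "Acme Corp hired Jane Doe.",
    "Jane Doe met John Roe at Acme Corp.",
    "John Roe resigned.",
]
entities = ["Acme Corp", "Jane Doe", "John Roe"]
links = co_mentions(documents, entities)
```

The resulting counts could feed a network diagram, with heavier lines for pairs that turn up together more often.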
Another development worth noting: a "Hacks and Hackers" Meetup group formed in Silicon Valley by former Associated Press foreign correspondent Burt Herman, who is on leave from the AP and recently completed a Knight fellowship at Stanford University. The group -- billed as being "for hackers exploring technologies to filter and visualize information, and for journalists who use technology to find and tell stories" -- held its first meeting Nov. 19.
The first gathering attracted about 30 people, including people from Google and Google News, Yahoo, sfgate.com, the San Francisco Chronicle, Current TV, PARC (Palo Alto Research Center), and Topix.com, Herman reported. "It felt like the seeds of a movement, and the many lively conversations showed that everyone was able to find common ground," he wrote in an email to me.
Herman said his Knight fellowship -- during which he focused on innovation and entrepreneurship -- taught him that innovation requires bringing people from different disciplines together.
"I started the Hacks and Hackers meetup group to open a broader dialogue between technologists and journalists, so we can move past the endless hand-wringing about the future of news and get down to work building it," Herman said. "Technology and media come together here in Silicon Valley like nowhere else in the world, and there was no group yet focused on this. I'm hoping it will lead to better understanding and perhaps even spawn new ventures."
As some readers of this blog will remember, "Hacks and Hackers" is also the name that Aron Pilhofer and I came up with to describe a new organization and Web site for people working at the intersection of technology and journalism. At the Future of News and Civic Media Conference in June, Aron and I won a $2,000 prize to create an online community for people with these interests.
The Web community idea is still in the early stages of development, but Aron and I would welcome your ideas about how best to make it work. The original concept was to create a place where members can seek help solving problems and provide assistance to their peers by, for instance, sharing a tutorial for a project using Django or Ruby on Rails or Drupal. We know there are people -- in journalism and technology, in industry and academia, scattered through organizations such as the Online News Association, Investigative Reporters and Editors and the Society for News Design -- who can use each other's help and support. We like the idea of having some kind of reputation management system -- say, like Stack Overflow -- that would reward members based on the quality and quantity of their contributions to the community.
If you have ideas for the Hacks and Hackers site, please post them in the comments below or email me at richgor - at - northwestern.edu.