Underwritten by John S. and James L. Knight Foundation

Idea Lab is a group blog by innovators who are reinventing community news for the Digital Age.

  • Each Idea Lab blogger is a winner of the Knight News Challenge grant to reshape community news.


    PANDA Survey Shows Newsrooms Swimming in Data

    Knight 2011 News Challenge Winner

    The first PANDA task officially checked off our to-do list was drafting our Future Users Survey, which we distributed via Twitter, the NICAR-L mailing list and email. The PANDA project aims to make basic data analysis quick and easy for news organizations and to make data sharing simple. The survey covers topics that we felt were crucial to understanding our future users, including the technical aptitude of the staff in their newsrooms, the quantity of data they work with, and possible barriers to using the software.

    So far, we've had 77 responses to the survey. For the curious, we've put an anonymized summary of the results online.

    Our future users

    We had responses from many of the major newsrooms around the country. But we hesitate to draw any final conclusions, because it appears we did not reach many smaller newsrooms. With that caveat in mind, several interesting statistics emerge from the data.

    Most of the newsrooms surveyed indicated they are technically savvy, with 74 percent reporting they are likely to support running applications in-house. A smaller group, 57 percent, are DocumentCloud users, which we treat as an indicator that an organization is willing to adopt new newsroom tools.

    Big data

    One of the most striking things to emerge from the survey is the quantity of data our future users reported working with. Thirty-eight percent of users reported working with a single dataset of more than 10 million rows. In the nearly two years I worked at the Chicago Tribune, we saw only a handful of datasets at this scale. Furthermore, 36 percent of users reported having a cumulative quantity of data in the range of hundreds of millions, or even billions, of rows. We've reached out to users who reported these "big data" numbers to get a better grasp of what sorts of data they are working with. The answer will inform our approach to an interesting design challenge: determining what scale we intend to support and how much time we will invest in documenting strategies for scaling beyond those initial limits.

    Tools of the trade

    The survey also inquired about the technology used within newsrooms in hopes of gaining an understanding of what tools are already in widespread use. A few quick hits from the results:

    • 86 percent use at least one Google utility -- Docs, Fusion Tables or Refine.
    • 86 percent reported using at least one SQL database.
    • 75 percent use at least one programming language.
    • 58 percent of newsrooms use Python, far more than any other programming language. This bodes well for PANDA's ability to find a niche of power users and contributors.
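
For readers who want a feel for the head counts behind these percentages, here is a minimal sketch that converts them into approximate numbers of respondents. It assumes all 77 respondents answered every question, which the survey summary above does not state, so the results are rough estimates rather than reported figures. The names `reported_percentages` and `approximate_count` are made up for illustration.

```python
# Rough head counts behind the "Tools of the trade" percentages.
# ASSUMPTION: every one of the 77 respondents answered each question;
# the post does not say so, so these are estimates, not survey results.
TOTAL_RESPONSES = 77  # "77 responses" reported in the post

# Percentages as quoted in the list above.
reported_percentages = {
    "at least one Google utility": 86,
    "at least one SQL database": 86,
    "at least one programming language": 75,
    "Python": 58,
}


def approximate_count(percent, total=TOTAL_RESPONSES):
    """Return the nearest whole number of respondents for a percentage."""
    return round(total * percent / 100)


approximate_counts = {
    label: approximate_count(percent)
    for label, percent in reported_percentages.items()
}

for label, count in approximate_counts.items():
    print(f"{label}: roughly {count} of {TOTAL_RESPONSES}")
```

Because the published percentages are already rounded, each estimate could be off by a respondent in either direction.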

    Security

    We provided respondents the opportunity to sound off on what sorts of issues might prevent them from using PANDA as a hosted service, if that is what we decide to build. A large number indicated that they had security concerns about putting data online, and several stated outright that they would not, or could not, use a hosted service. We haven't made a decision about whether PANDA should be a hosted service, but these results will certainly guide our thoughts.

    These statistics provide a clear, empirical picture of our audience. At ONA we will meet for a planning session, and this survey will factor heavily into the road map we build for the rest of the year. We will also try to interview more future users and follow up with some who replied to the survey. If you will be at ONA, please find one of us in the red PANDA shirts and let us know what PANDA can do to better serve your newsroom. It's also not too late to fill out the survey. If you haven't taken it, please take a few minutes to do so here.
