When he’s not toasting escapism, our tireless editor Mark Glaser has been asking why reporting costs so much. I can’t tell you much about investigative reporting (a $400,000 example of which started the conversation), except to say that six-figure salaries do add up. But I can tell you that when it comes to local reporting, improved access to information could make a big dent in the expense of getting a story written.

If you want to take a look at the distribution of discretionary funds by the New York City Council, you have to start with a 400-page PDF full of tables of information. And then you need someone on hand who knows how to pull tables from a PDF into a workable spreadsheet. That, or you need a highlighter, a pencil sharpener and a calculator. And while highlighters and pencil sharpeners are not blowing holes in anyone’s reporting budget, the hours required to process this information certainly are. The situation is absurd: this information started out in a database, and there’s no reason that anyone, whether they’re a reporter, civic gadfly or deli manager, should have to jump through hoops to put it back into a database.

Of course, those hoops are just for information the city already makes public. If you want to know where pedestrians are being hit by cars, or how parking placards are distributed in a city where curbside space is valuable and abuse of parking privileges is well documented, you’d better know who has that data and have someone on hand who can write an airtight FOIL request. Want to know about the distribution of lead poisoning cases in the city? For that you’ll need lawyers.

FOILs take time, which means money. Lawyers, too, tend to want money for their time. One way to make information cheaper is to step up the data requirements in local transparency laws. New York City is considering legislation that would amend existing public records laws to require that information be made available and that it “be presented and structured in a format that permits automated processing.” That is to say, raw data. Just publish it — don’t make us ask.

With the law itself lingering in committee, the mayor’s office announced a competition, NYC Big Apps, for applications that will use city data. Perhaps the idea is to deflect attention from the bill, which the mayor is no fan of. The contest, which offers a prize that includes dinner with the mayor, is not really a substitute for making data available.

Steve Romalewski, a pioneer of web-based GIS and community mapping projects, is also skeptical of the contest. He notes that it offers no explicit guarantee that any datasets will be fully available for the long haul, and that no one has offered any explanation of why just 80 datasets are included.

Romalewski also rattles off a good list of datasets that are currently only available on a per-request basis, which means, among other things, that you need to know they are there. His list includes the types and locations of small businesses, green spaces, recreational spaces and housing violations, as well as interim multiple dwellings (aka lofts) throughout the city. He also points out that land use data currently must be licensed from the city at a rate of $1,500 per year if you want all five boroughs: not a trivial expense for small projects like Gotham Gazette.

Romalewski argues that we shouldn’t have to ask for data, and that most of what city agencies aggregate belongs in the public domain. I’m with him there, and curious as I am to see what comes out of NYC Big Apps, I’m not convinced that the contest is going to help put city data in the public domain in New York City.

I don’t know whether or not the legislation currently sitting in committee is the answer we need, but I do know that New York City is not alone in needing far better access to the data that civil servants use and aggregate in the course of their work. I also don’t think that simply providing us with the raw data is enough — but at least it’s the bare minimum we need to fill the role of government watchdog.

By the way, if you want that list of under-publicized city data, skip to the comments in Romalewski’s post.