The International Press Telecommunications Council (IPTC) has just launched rNews, a consistent, machine-readable way of expressing news metadata in RDFa (a linked data language). This post explains some of the differences between rNews and hNews and why, if you publish news on the web, you ought to be using one or the other.
In a now infamous incident at Cambridge University in October 1946, midway through a seminar, the philosopher Ludwig Wittgenstein is said to have threatened the philosopher Karl Popper with a red-hot poker (the exact circumstances and use of the poker are still disputed, 65 years on). The argument? Whether there are, or are not, such things as philosophical problems. Popper said there were; Wittgenstein said there were only puzzles.
Step into the similarly rarefied world of online publishing languages and, though you might not be threatened with a red-hot poker, someone will almost certainly wave its online equivalent at you, as we found when we were developing hNews, a news microformat, with the Associated Press.
We started, back in 2008, with a problem: Very few online news stories had consistent, machine-readable information about their provenance (i.e. basic stuff like who wrote it, who published it, when it was first published, etc.). This was a problem because without this information -- or metadata -- it was incredibly difficult to differentiate news from other content on the web, or to figure out where news had come from.
Two Solutions to the Problem
We searched about for a solution to the problem, thanks to grants from the Knight and MacArthur Foundations, and found not one but two. The first was microformats -- which are straightforward, open mark-up formats built on existing standards. The second was RDFa, a method of embedding full RDF, the linked data language of the semantic web.
We decided to use microformats, for highly pragmatic reasons. We figured that most news organizations (and journalists and bloggers) were not yet ready to make the big leap to linked data. The easier we made it to integrate consistent metadata, we thought, the more likely news organizations were to do it. Our chief concern was less exactly how people made the provenance of online news more transparent than that they did it at all.
The Associated Press came to a similar conclusion, and together we developed hNews. Our pragmatism has so far borne fruit. The hNews microformat has since been integrated into about 1,200 news sites in the U.S. This means that there must now be more than a hundred million news stories on the web marked up with hNews. The AP has also based its new news registry business and its forthcoming rights clearinghouse on hNews.
This did not stop some semantic web evangelists from waving their metaphorical red-hot pokers, suggesting we were born out of wedlock, or offering other less than warm and fuzzy responses.
So, when we learned that the IPTC were launching an equivalent of hNews in RDFa we were over the moon. Hooray! Now people have a choice to mark up their news in microformats or in linked data.
The Ambitious rNews
"Equivalent" is not quite right. rNews is more ambitious than hNews. If hNews is like a ham sandwich then rNews is like a baked Alaska. rNews covers lots of aspects of provenance and content. You can, if you want to mark up additional aspects of news stories, mix-and-match rNews with other RDF ontologies (i.e. different linked data vocabularies). It's also more "correct" than hNews, but as a result more verbose and intrusive. It's a much bigger change to existing HTML pages than hNews. That said, it is, by RDF standards, pretty straightforward. All this makes it a very good alternative way of creating consistent, machine-readable mark-up for news.
The big difference between the two is in their complexity. Making a ham sandwich is much simpler and requires less expertise than cooking a baked Alaska. The same goes for hNews and rNews. As a result, my prediction is that rNews will be the format of choice for big news organizations that want to do things fully and properly and are willing to commit the time and resources (like the New York Times -- which was central to the development of rNews). In the same way, it will probably suit high-end proprietary content management systems. For smaller news organizations, journalists and bloggers, hNews goes a good part of the way there and is much easier to integrate and lighter to use.
In other words, the two complement each other rather well, and ought to provide the foundations for consistent, machine-readable metadata for news.
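To give a rough sense of what "machine-readable" means in practice, here is a minimal sketch, in Python, of reading provenance out of a story marked up with hNews. The class names (entry-title, author, source-org, dateline, published, updated) come from hNews and the hAtom microformat it builds on. Everything else is an assumption made for illustration: the sample story, "Jane Doe" and "Example News" are invented, and HNewsParser and extract_hnews are hypothetical names, not part of any official tool. It handles only a handful of fields and is no substitute for a full microformats parser.

```python
from html.parser import HTMLParser

# A subset of hNews/hAtom class names; the full formats define more fields.
FIELDS = {"entry-title", "author", "source-org", "dateline", "published", "updated"}

# Elements that never have a closing tag, so they are never put on the stack.
VOID = {"br", "img", "hr", "meta", "link", "input"}

# Invented sample story, for illustration only.
SAMPLE = """
<div class="hnews hentry">
  <h1 class="entry-title">Council approves new budget</h1>
  <p>By <span class="author vcard"><span class="fn">Jane Doe</span></span>,
     <span class="source-org vcard"><span class="fn org">Example News</span></span></p>
  <span class="dateline">Springfield</span>
  <abbr class="published" title="2011-10-07T09:00:00Z">7 October 2011</abbr>
  <div class="entry-content">The council voted on Thursday ...</div>
</div>
"""


class HNewsParser(HTMLParser):
    """Collects the text found inside elements that carry an hNews class."""

    def __init__(self):
        super().__init__()
        self.stack = []     # open elements: (tag, hNews fields on that tag)
        self.values = {}    # field name -> list of text fragments
        self.fixed = set()  # fields already taken from a title attribute

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        attrs = dict(attrs)
        fields = FIELDS & set((attrs.get("class") or "").split())
        # Dates are often written as <abbr title="ISO date">readable date</abbr>;
        # in that case keep the machine-readable title, not the visible text.
        if tag == "abbr" and attrs.get("title"):
            for field in fields:
                self.values[field] = [attrs["title"]]
                self.fixed.add(field)
        self.stack.append((tag, fields))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, tolerating sloppy HTML.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Text belongs to every hNews field that is currently open.
        for _, fields in self.stack:
            for field in fields - self.fixed:
                self.values.setdefault(field, []).append(data)


def extract_hnews(html):
    """Return a dict of hNews field name -> whitespace-normalized text."""
    parser = HNewsParser()
    parser.feed(html)
    parser.close()
    return {f: " ".join("".join(parts).split()) for f, parts in parser.values.items()}


if __name__ == "__main__":
    for field, value in sorted(extract_hnews(SAMPLE).items()):
        print(f"{field}: {value}")
```

The point of the sketch is how little the publisher has to do: the provenance lives in a few extra class attributes on HTML that is already there, which is the "ham sandwich" end of the trade-off described above.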
Pros and Cons of Each Approach
The AP's Stuart Myles was one of the creators of hNews and worked with the IPTC on rNews.
"The fact that hNews and rNews have similar names is no coincidence," Myles told me via email. "To me, microformats and RDFa are two different technical approaches to the same challenge. Each approach has pros and cons and many tools that support one also work with the other."
Evan Sandhaus of the New York Times, one of the original authors of rNews, also emphasizes the compatibility of the two standards: "rNews was designed from the start to provide publishers with many of the same features offered by hNews. And future versions of the rNews will likely bring the standards into even closer alignment," he told me via email.
Should you care about hNews and rNews? If you publish news on the web then you most certainly should. The arrival of rNews and the continuing take-up of hNews show that metadata is central to the future of digital news. Consistent, machine-readable metadata makes your news easier to find, more distinguishable, more straightforward to check, more programmable, more targetable, and easier to track. If you are not yet publishing your news with metadata then don't be surprised if someone soon comes at you flailing a red-hot poker.