I, Cringely - The Survival of the Nerdiest with Robert X. Cringely

The Pulpit


Weekly Column

When and Where: A Decade Into the World Wide Web, We Still Don't Generally Know How Old Information Is or Where Users and Servers Reside in the World

By Robert X. Cringely
bob@cringely.com

We could spend another week figuring out how to all become felons and thereby bring the Digital Millennium Copyright Act to its knees, but let's not. The most sublime suggestion I got this week was that we could all become criminals simply by WATCHING a pirated movie. So all someone has to do is start looping a copyrighted film over the Net and have the rest of us tune in like the legal lemmings we want to be, then report ourselves to the Feds. That ought to do it. Now let's move on to something else.

Down at the journalism school I never went to, they teach junior reporters to start a story by telling who, what, when, where, and how. Some people like to throw "why" in there, too, but it really doesn't belong since "why" is subjective and news reporting is always supposed to be objective. Those four W's and an H do a pretty good job of characterizing events whether it is in the newspaper or in real life. If you know these things, you know a lot. Unfortunately, at least two of those W's are generally unavailable on the Internet, and it doesn't have to be that way.

If this were 1998, the idea I'm about to share with you would be worth at least $10 million in venture funding and a BMW M3, but today, it is of course worth nothing and I drive a minivan. It has always bugged me when using a search engine, ANY search engine (they are all equally bad in this one respect), that the results contain lots of old information. Sometimes I want old information, of course, to get some sense of how things were at a particular time, but generally what I want is the most recent information. If I am looking up pork belly prices, I want to know TODAY'S pork belly prices. Alas, I haven't yet found a search engine that will give me just the newest stuff or that will at least indicate to me what stuff is new and what is old and maybe let me sort by age. This is what I want: search results that indicate their age.

It can't be that hard to do. Files have creation dates, for one thing. And search engines that have been in operation for years ought to be able to keep track of which results have been the same for a long time and which are clearly brand new. Maybe they can be ordered by age. Or since that would violate the philosophy of Google, which indicates importance by the number of links connected to each result, maybe it would be better to change color or have some kind of bar graph tell me how old each result is.
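As a rough illustration of the bar-graph idea, here is a minimal Python sketch. It assumes the search engine already records the date it first saw each page; the function names, the one-year cap, and the sample URL are hypothetical, not anything a real search engine exposes. The relevance order is left alone, and each result simply gets an age and a small text bar.

```python
from datetime import date


def age_bar(days, max_days=365, width=10):
    """Return a text bar: the more '#' characters, the older the result.

    Ages are clamped to the range 0..max_days (an assumed one-year cap).
    """
    clamped = max(0, min(days, max_days))
    filled = clamped * width // max_days
    return "#" * filled + "." * (width - filled)


def annotate(results, today):
    """Attach an age in days and an age bar to each result.

    `results` is a list of (url, first_seen_date) pairs, already in
    relevance order. The order is preserved; only the annotation is added.
    """
    annotated = []
    for url, first_seen in results:
        days = (today - first_seen).days
        annotated.append((url, days, age_bar(days)))
    return annotated
```

Sorting by age instead would be a one-line change (sort the annotated list on its second field), which is the other option mentioned above.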

So that's my big idea. No need to thank me.

The other "W" that is missing is "where." In the digital clarity that is the Internet, it isn't supposed to matter where information is held or where users reside. Except it does matter. It matters a great deal. Multinational web sites always want to know where you are coming from so they can push their American whatzits at you instead of their Chinese whatzits. They generally do this by making you choose a country or locality. But can't that be done automatically?

Not until now. Not until the advent of a new service called CountryHawk, which I think is a bad name for a good service. CountryHawk is from CyScape, a company whose other weirdly named product is BrowserHawk, a tool for making sure web site code will work equally well with all browser types.

CountryHawk, which I think is cool technology, provides a way to take an IP address and determine with 95 to 98 percent accuracy the country in which it's based. It operates quickly and without network traffic, using an internal database that updates monthly. In benchmarks, it handles 50,000 to 100,000 queries per second. CountryHawk currently has Java (servlet/JSP) and ASP versions.

Potential uses include:

  • Blocking software downloads to "Terrorist 7" nations
  • Reducing credit card fraud
  • Preventing password sharing
  • Web server log stats and analysis
  • Digital rights management
  • Localization
  • Auto-select of country on web forms
  • Auto-jump to a regional web site
  • Geo-targeting for increased click-throughs

Most people needing a solution to this kind of problem have previously used reverse DNS to determine countries. This is a horrible abuse of technology, since reverse DNS, which maps an IP address back to a domain name, is a very inefficient process that was never intended to be done in high volumes. Many .com/.net/.org hosts can't be placed in a country at all using reverse DNS. And reverse DNS is s-l-o-w. I know from personal experience that some IPs aren't in DNS and those take 30 seconds to time out! Plus DNS doesn't seem to cache timeouts, so with reverse DNS, you get hit again and again.
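To make the reverse DNS approach concrete, here is a minimal Python sketch of how such a lookup is typically done. The function names are hypothetical; `socket.gethostbyaddr` is the standard library call that performs the reverse lookup. The country guess comes purely from a two-letter top-level domain, which is exactly why a .com, .net, or .org host yields nothing.

```python
import socket


def country_from_hostname(hostname):
    """Guess a country from a hostname's two-letter top-level domain.

    Returns the TLD in upper case (for example "DE"), or None when the
    TLD is not two letters, as with .com, .net, and .org.
    """
    tld = hostname.rstrip(".").lower().split(".")[-1]
    if len(tld) == 2 and tld.isalpha():
        return tld.upper()
    return None


def country_via_reverse_dns(ip):
    """Reverse-resolve an IP address, then guess the country from the name.

    This makes a network round trip on every call and can block for a
    long time when the address has no reverse DNS entry.
    """
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    except OSError:
        # No reverse DNS entry, or the lookup failed.
        return None
    return country_from_hostname(hostname)
```

Every call to `country_via_reverse_dns` goes out over the network, so a busy site would be issuing one DNS query per visitor.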

Another, equally crude, alternative to reverse DNS is to ask the destination server what time it is. Unix boxes, especially, will give you back both the time and also the correction factor to convert back to Greenwich Mean Time. This means that many servers can be identified as being in a single time zone, but that still doesn't tell us whether the server is in the southern or northern hemisphere. It is just not good enough.

But CountryHawk does its job differently. It doesn't use reverse DNS or reach out across the network at all. CountryHawk uses a proprietary database of four billion IP addresses to determine what country an IP address is supposed to be in. How they come up with that database is a big secret, but the fact is that even a heroic effort to build it, once shared across thousands or tens of thousands of servers, will have dramatically lower impact on the Net than all of those servers doing reverse DNS lookups on their own.
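Since CountryHawk's database and methods are secret, what follows is not how CountryHawk works. It is only a minimal Python sketch of one common way to do a local, no-network lookup: keep a sorted table of address ranges and binary-search it. The sample rows use reserved documentation address ranges with made-up country assignments, and all the names are hypothetical.

```python
import bisect
import ipaddress

# Hypothetical sample rows: (first address, last address, country code).
# These are reserved documentation ranges; the countries are invented.
SAMPLE_RANGES = [
    ("192.0.2.0", "192.0.2.255", "US"),
    ("198.51.100.0", "198.51.100.255", "DE"),
    ("203.0.113.0", "203.0.113.255", "AU"),
]


def build_table(rows):
    """Convert dotted-quad ranges to integers and sort by start address."""
    table = sorted(
        (int(ipaddress.IPv4Address(lo)), int(ipaddress.IPv4Address(hi)), cc)
        for lo, hi, cc in rows
    )
    starts = [lo for lo, _hi, _cc in table]
    return starts, table


def lookup(ip, starts, table):
    """Return the country code for an IPv4 address, or None if unknown.

    Uses binary search over the range starts, so no network traffic
    is involved and each query touches only a handful of table rows.
    """
    n = int(ipaddress.IPv4Address(ip))
    i = bisect.bisect_right(starts, n) - 1
    if i >= 0:
        lo, hi, cc = table[i]
        if lo <= n <= hi:
            return cc
    return None


STARTS, TABLE = build_table(SAMPLE_RANGES)
```

With the table in memory, `lookup("198.51.100.7", STARTS, TABLE)` returns "DE" under these invented assignments, and an address outside every range returns None.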

Of course, there is a societal downside to this technology. Once it is easy to determine what country a user resides in, it will be possible to restrict information that is available to that user. Or looking at it from a commercial sense, it will be possible to charge country-specific royalties just as the region code on DVDs effectively does today. One might argue that it is better NOT to know where users or information are coming from. But that's naïve. The Internet is becoming increasingly organized, and it probably has to just in order to manage its own spectacular growth.

Products like CountryHawk were inevitable and are generally more good than bad. And problems of restricting information have to be dealt with directly. Or not. Suddenly, I see a whole new opportunity to create technology to make users look like they are coming from a place they really aren't.


