I, Cringely - The Survival of the Nerdiest with Robert X. Cringely

The Pulpit


Weekly Column

Do What I Mean: If Web Searches Are Going to Get More Accurate, It Might Require a Technology Like MeaningMaster, Which Was 20 Years in the Making

By Robert X. Cringely
bob@cringely.com

A decade after it first came on the scene, Internet searching is hot again. It isn't just the coming Google IPO, with its huge numbers and sense of impending investor giddiness, that is driving this. It is the realization that, along with e-mail, searching literally IS the Internet. Browsing and downloading and playing are cool, but without first finding what you want to browse, download, or play, none of those things can happen. So anything that makes searching significantly easier is important, which is why I am very interested in a company called MeaningMaster that claims to have technology that can make English-language searches three times as accurate as they presently are with Google.

MeaningMaster isn't a search engine, but a search technology. There is no huge web site called MeaningMaster that indexes the web, at least not yet. First, according to its backers, MeaningMaster will be used to index major corporate web sites, and only later will one or more retail search portals use the technology. So while MeaningMaster can be seen as a threat to Google, it could just as easily become a part of Google or even a part of Microsoft's upcoming web search engine or of A9, the web search portal just opened by Amazon.com, or maybe a part of every major search engine. Only time will tell.

Until then, MeaningMaster is generating a lot of buzz in Silicon Valley -- yet another overnight sensation that was 20 years in the making.

MeaningMaster is the brainchild of Kathleen Dahlgren, a computational linguistics PhD who has spent most of her career building a lexicon of the English language. This lexicon is a computer dictionary that is purported to understand the meanings of more than 200,000 English words IN CONTEXT. Dahlgren began this project in the 1980s when she worked for IBM, then took it with her to a startup called Inquizit, where she was granted a patent on the technology in 1998. Inquizit got some attention in the press back then when it was proposing to do pretty much what MeaningMaster is proposing to do now, leading one to wonder what has changed to make the approach so much more interesting today than it was six years ago.

What has changed is that, through the relentless passage of Moore's Law, computers are on average 16 times faster today than they were back in 1998. Today, MeaningMaster claims a server can process 50,000 queries per hour, though the company is careful to specify neither the power of the server nor the complexity of the queries. With modern brute-force approaches like Google's swarm of PC servers, it probably doesn't matter. Where Inquizit was interesting, but probably not competitive, MeaningMaster is now competitive.
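That 16x figure is just Moore's Law arithmetic, assuming the classic doubling of performance roughly every 18 months:

```python
# Moore's Law back-of-the-envelope: performance doubles about every
# 18 months, so six years (1998 to 2004) holds four doublings.
years = 2004 - 1998           # 6 years
doublings = years / 1.5       # 18-month doubling period -> 4 doublings
speedup = 2 ** doublings      # 2^4
print(speedup)                # 16.0
```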

This makes me wonder, in fact, whether there aren't hundreds of promising technologies from the late 1990s that are worth another look today. It would probably be worthwhile to start a company just to specialize in this type of digital archaeology.

So MeaningMaster is back and presents a natural language interface that purports to return more of what you really want to know. This is Artificial Intelligence, which had us all so excited in the 1980s until we found out how slow and difficult it really is to do. But that very difficulty is supposed to be MeaningMaster's strength, because what these people claim to have done, which is essentially connecting 200,000 words to each other in terms of meaning, can't be done with algorithms alone. You can't just write a program to parse Webster's Dictionary and make this happen overnight.

"We model the way people interpret the meanings of a word -- through context," says Ms. Dahlgren, who is today CEO of MeaningMaster. "We search on meaning by using grammar and structure and semantics. Every word has associated with it a set of beliefs." In order to unlock that set of beliefs, MeaningMaster is hand-coded, a process that took 175 man- and woman-years. So, to answer the rhetorical question asked by every VC, Microsoft COULD just duplicate the work from scratch, but they'd still have to fight the patent and they'd probably spend three to four years on the project before having anything remotely usable. Wouldn't it be easier just to license the technology or to buy it outright?

Assuming MeaningMaster actually can tell the difference between "Who has set the tables?" and "Who has the sets of tables?" then the technology should bring a third approach to searching to go along with pattern-matching and Google's PageRank.
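To make the "set the tables" versus "sets of tables" distinction concrete, here is a deliberately toy sketch of context-driven sense selection. This is NOT MeaningMaster's method (its lexicon is proprietary and hand-coded); it just illustrates the idea that grammatical context, not the bare keyword, picks the meaning:

```python
# Toy word-sense disambiguation: the same string "set" resolves to
# different meanings depending on the words around it. The lexicon
# entries and rules below are invented for illustration only.
TOY_LEXICON = {
    ("set", "verb"): "arrange/prepare (as in setting a table)",
    ("set", "noun"): "a matching collection (as in a set of tables)",
}

def guess_pos(tokens, i):
    """Crude context rules: 'set(s) of' reads as a noun;
    'set' right after an auxiliary like 'has' reads as a verb."""
    if i + 1 < len(tokens) and tokens[i + 1] == "of":
        return "noun"
    if i > 0 and tokens[i - 1] in {"has", "have", "had"}:
        return "verb"
    return "noun"

def sense(sentence):
    tokens = sentence.lower().rstrip("?").split()
    for i, tok in enumerate(tokens):
        if tok in {"set", "sets"}:
            return TOY_LEXICON[("set", guess_pos(tokens, i))]
    return None

print(sense("Who has set the tables?"))      # the verb sense
print(sense("Who has the sets of tables?"))  # the noun sense
```

A keyword-matching engine sees the same three content words in both questions; a lexicon that tracks context can tell them apart, which is the whole pitch.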

Google, by the way, is one of the companies that is right now taking a look at MeaningMaster and their interest is what prompts this column. If Google is impressed, I'm impressed.

But it isn’t just searching that has the industry buzzing about MeaningMaster. It's the ability to use this technology for contextual advertising that has the boys and girls really excited. Though Google's PageRank is being muddied somewhat by web logs skewing scores, few people are complaining about the accuracy of the leading search engine and its web cache remains unrivalled. The people who ARE complaining are advertisers who pay search engines to present their ads when certain words are queried: the more accurate the response, the more ads will be served to those who really ought to be interested, and the more sales that will result.
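For readers who haven't looked under PageRank's hood: it scores a page by the scores of the pages linking to it, which is also why dense reciprocal linking among web logs can inflate scores. A toy power-iteration version over a hypothetical four-page web (an illustration, not Google's implementation):

```python
# Toy PageRank: each page spreads its score evenly across its outgoing
# links; iterate until the scores settle. The link graph is invented.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
pages = list(links)
rank = {p: 1 / len(pages) for p in pages}
damping = 0.85  # standard damping factor from the PageRank paper

for _ in range(50):
    new = {p: (1 - damping) / len(pages) for p in pages}
    for p, outs in links.items():
        for q in outs:
            new[q] += damping * rank[p] / len(outs)
    rank = new

print(max(rank, key=rank.get))  # "C" -- the most linked-to page wins
```

The point for advertisers: PageRank ranks pages by popularity, not by what a query means, so a meaning-aware layer on top is a separate (and salable) kind of accuracy.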

I asked Graham Spencer to take a look at MeaningMaster. Graham was the chief techie at Excite, where he pioneered yet another search technique involving linguistic vector analysis that still offers some advantages.
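Vector-based search, in the generic form (this sketch is not Excite's actual system), turns documents and queries into term vectors and ranks by the cosine of the angle between them:

```python
import math
from collections import Counter

# Generic vector-space retrieval sketch: term-frequency vectors,
# ranked by cosine similarity. Documents below are made up.
def vectorize(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

docs = [
    "how to set a dinner table",
    "a matching set of oak tables for sale",
]
query = vectorize("set the table")
scores = [cosine(query, vectorize(d)) for d in docs]
print(scores)  # the first document shares more query terms, so it ranks higher
```

Note the limitation Graham alludes to: a pure vector model still matches strings ("set", "table"), not senses, which is exactly the gap MeaningMaster claims to fill.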

"It looks interesting," said Graham, "but I found it to have some obvious gaps. The problem with any technology that tries to be explicitly 'smart' is that it has to be really close to perfect or else a human will notice. Google, for example, doesn't have this problem because there's no a priori standard of how PageRank should work, so a human can't tell if PageRank is 100 percent accurate or just 90 percent. But the numerous pre-crash efforts to build an automatic Yahoo-style directory through statistical methods always had this problem -- users know what to expect, and so they're easily disappointed. I think MeaningMaster will have a similar problem -- their magic of automatically choosing related terms is explicit, but it doesn't work all the time. The other obvious issue is scalability. There's performance and there's also the reliability of their algorithms as the data set gets much larger and dirtier. Usually a larger data set is better, but you won't really know until you test."

MeaningMaster claims the testing is over and their product scales beautifully, thanks. And if it is really hand-coded -- a sort of semantic sieve -- then it should scale fine.

Only time will tell if MeaningMaster annoys users or delights them, but if its real strength is for targeted advertising then the annoyance factor could be practically eliminated as long as advertisers were seeing improvements in converting clicks into sales. That's the REAL test.

And I still think that MeaningMaster could be one heck of a crossword puzzle solver.
