“It doesn’t matter what Internet business you’re in,” Richard Jalichandra, the CEO of blog search engine Technorati, told me recently. “You’re either going to have direct or indirect competition with Google and that’s just the way it is…[Google is] not the 800 pound gorilla, it’s the 80,000 pound gorilla.”

But unlike most competitors to Google, Technorati still seems to have a legitimate shot at beating Google in its niche. Though Google’s main search engine has been dominant, its Google Blog Search, launched in 2005, has failed to gain similar market share. And though some traffic analysis sites have reported that Google Blog Search is edging out Technorati in terms of search traffic, they also note that more than half of that traffic stems from links on Google News.

Jalichandra claimed that in terms of front page visits, Technorati still outperforms its Google counterpart. According to Compete.com, Technorati received 2.5 million unique visitors in November while Google Blog Search received 1 million.

Photo: Richard Jalichandra

Perhaps the struggle continues because the blog search industry is still battling a number of unresolved issues. When Larry Page and Sergey Brin created Google in the late 1990s, they discovered a way of finding the most relevant results for a given search term based largely on the number of inbound links a site has. But when people use blog search engines they’re often performing an entirely different kind of search, one in which relevancy plays only a small part and the most recent content often matters more.

“One of the things that we’ve learned is that 99% of all blog searches aim for something less than six months old,” Jalichandra said. “About 92% are looking for something less than a month old and over 70% are looking for something a week old.”
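To make the tension between relevancy and recency concrete, here is a minimal sketch of how a blog search score might combine the two. It is an illustration only, not Technorati’s or Google’s actual ranking: the function name `blog_score`, the logarithmic link weighting and the seven-day half-life are all assumptions chosen for the example.

```python
import math
from datetime import datetime, timedelta

def blog_score(inbound_links, published, now, half_life_days=7.0):
    """Hypothetical score: link-based relevance damped by the post's age."""
    # Age of the post in days (never negative).
    age_days = max((now - published).total_seconds() / 86400.0, 0.0)
    # Diminishing returns on inbound links (assumed weighting).
    relevance = math.log1p(inbound_links)
    # Freshness halves every `half_life_days` (assumed decay).
    freshness = 0.5 ** (age_days / half_life_days)
    return relevance * freshness

now = datetime(2008, 12, 1)
# A heavily linked post from six months ago versus a lightly linked one from yesterday.
old_popular = blog_score(500, now - timedelta(days=180), now)
new_modest = blog_score(5, now - timedelta(days=1), now)
print(new_modest > old_popular)
```

Under these assumed weights, the day-old post with a handful of links outranks the six-month-old post with hundreds, which is the behavior the search statistics above suggest users want. It also shows why freshness-heavy ranking is attractive to spammers: a new post needs very few links to rise.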

As a heavy blog search engine user myself, I often search for content that’s been published within the last day or, in some cases (specifically when there’s breaking news or an event being live-blogged), within the last five minutes. But when a search engine begins trading relevancy for speed, that perennial party-pooper, spam, rears its ugly head. Because of the relative ease of setting up a blog, spammers are able to create thousands in a single weekend and program them to scrape and steal content from millions of sites. Blog search engines are then bombarded with pings from those blogs, which inevitably results in spam-filled search results.

But perhaps trumping all issues is the blog search engine’s still-ongoing quest for monetization. People use blog search engines much differently than they use regular ones; rarely will you find someone using Technorati to search for an electrician or an apartment to rent. That is why keyword search advertising, an enormous moneymaker for Google’s main engine, has been unsuccessful on Technorati.

“Blog search is very different,” Jalichandra said. “Blog search users are wanting to find content; they’re not necessarily looking for a plumber…With blog search people are really interested in looking for conversations or participating in conversations and it’s a very different reason for searching.”

Technorati’s ‘Identity Crisis’

The realization that keyword advertising wouldn’t be a very effective moneymaking tool forced Technorati into what Dorion Carroll, the company’s VP of engineering, called an “identity crisis.” The website has had more than a half-dozen face-lifts since it first launched in November 2002, some of which radically redefined what kind of content would be delivered on its front page.

Photo: Dorion Carroll

And with those changes came accusations from tech bloggers over the years that Technorati had taken its eye off its core feature: search. For a long period of time the search bar, what some would consider its most important tool, migrated to the right of the page, a move that some cynics read as Technorati downgrading search to a secondary function. The company became an aggregator of content, trying to display the most popular videos, news articles and posts that were being discussed in the blogosphere. Later the company developed a Techmeme-like algorithm that highlighted important blog posts and articles that were gaining steam on the Net.

“In early 2007 or late 2006 we shifted a lot of our focus, and really in hindsight we did it mistakenly, to people who maybe didn’t even necessarily know what blogs were or care,” Carroll said. “And that’s really where a lot of search-oriented features got downplayed. And again it was probably a mistake, when really our service is to be a resource for bloggers and people who read blogs and the marketers who want to reach or influence those bloggers for their readers. What we do stand for is being the center of the blogosphere.”

Reliability Problems

But perhaps Technorati’s biggest problem came about a year ago, when outages were constant and bloggers reported being unable to access the site for hours at a time. Even when the site did load, a search would sometimes take over a minute to complete or return no results for a popular term. The Technorati Monster, the search engine’s downtime mascot (similar to Twitter’s Fail Whale), had escaped so many times that its cage door seemed to have been left permanently open.

“Spam has historically been a very big problem,” Carroll told me. “Over the summer we had the monster outages directly as a result of a significant increase both in terms of spam trying to get into the system, and then probably more so, and where the monsters came from, the spammers actually querying our search engine.”

Because spam blogs (known as “splogs”) benefit in blog search results by offering up constantly updated content, they continually scrape from other websites and then send thousands upon thousands of pings to search engine crawlers. Those pings inundated Technorati’s servers and debilitated their capacity for crawling legitimate blogs. Carroll said that at one point they were having to deal with five times the normal traffic, all because of spammers, a trend that brought the site to its knees.

Nicholas Carr, a tech writer and author of the widely circulated Atlantic essay, “Is Google Making Us Stupid?,” recently addressed Technorati’s reliability problem on his blog. In a post titled “The Centripetal Web,” he noted that a number of small-staffed Web 2.0 companies (Technorati and the RSS reader Bloglines, for instance) get overtaken by Google’s me-too products precisely because Google has the infrastructure, large staff and massive server farms needed to handle millions of users. So even if the Google product isn’t necessarily better, users eventually migrate to it out of frustration with its less reliable competitors.

“[Technorati] had good coverage, and it had some good blog-specific features,” Carr told me. “So whether you were doing an ego search or searching for anything, it was the only game in town. Google came out with its blog search and I tried it out, but for a while I stuck with Technorati just because Google’s didn’t seem quite as comprehensive; it didn’t have quite as many search tools.

“But what I found was that over time — I think about a year ago — I realized I was using Google Blog Search and had pretty much moved away from Technorati. And I think it was really a matter of two things. One is that Google Blog Search didn’t have the kind of technical problems that plagued Technorati. As is the case with most Google search engines, it was really fast. And it also began to be equally comprehensive, and also was very very fast at picking up new postings on blogs.”

Also, Carr found it much more convenient to be able to easily click back and forth between multiple search engines through Google. For instance, you can enter a term on Google’s main engine and then quickly try that same term in its image, news, blog, and (“If I’m feeling really ambitious”) its scholar engine.

“That advantage, which very much comes from Google’s size, I think shows you how difficult it is for a small and specialized player — certainly in the search world but also in other Internet tools and functions as well — how hard it is for them to compete against the big guy,” he said.

Irrelevant Results

That’s not to say that Google hasn’t had its own problems as well. It’s not uncommon for its search results to be heavily cluttered with splogs, sometimes requiring one to flip through several pages of results just to find a handful of legitimate and relevant blog posts.

But recently the search engine faced even more criticism because of a new method it implemented in indexing blog posts; rather than utilizing RSS feeds and just indexing the posts themselves, it began indexing entire pages, including the content on the sidebars. This caused an immediate backlash because it drastically increased the number of irrelevant results that showed up in searches.

“So this means that, for instance, every time JD Lasica adds a new post to his blog at Social Media, which includes Wordyard in its blogroll, I get a new listing in the Google Blog Search for Wordyard, even though the post has nothing to do with Wordyard,” wrote Wordyard blogger Scott Rosenberg recently. “This completely messes up the utility of Google’s search for me — and, from what I see posted by other serious bloggers, many other users.”

I asked Jeremy Hylton, a software engineer at Google, about this very problem: what Google’s reasoning behind the change was and how long ago it had been implemented. He said that Google rolled out the changes in October.

“There are lots of interesting blogs that only provide a short summary of their posts in the feed,” Hylton said. “We’re doing a better job of ranking results because we see a richer link structure and more text. The changes make blog search work much more like web search internally, which will be the foundation for lots of improvements in 2009.”

When I mentioned the complaints that this move had increased the number of irrelevant results, he recognized the problem but at the same time downplayed its significance, saying that it only affected a small number of blogs.

“We saw a few queries that suffered from this problem when we were evaluating the change, but underestimated the number of power users who were affected,” Hylton said. “If you have a very popular blog, then these links are very common. We are working on several ways to fix this problem. We made a change for [link:] queries last week that greatly reduced the number of blogroll results, but didn’t eliminate all of them. The basic idea is to look for text and markup that is common to all the posts of a blog…We don’t get it right for every page, but we’re continuing to improve the algorithm.”
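The basic idea Hylton describes, looking for text that is common to all the posts of a blog, can be sketched in a few lines. This is a simplified illustration and not Google’s algorithm: the function name `strip_shared_lines`, the line-by-line comparison and the `threshold` parameter are assumptions made for the example, and a real system would compare markup as well as text.

```python
from collections import Counter

def strip_shared_lines(pages, threshold=1.0):
    """Drop lines that appear on at least `threshold` fraction of a blog's pages."""
    # With fewer than two pages there is nothing to compare against.
    if len(pages) < 2:
        return list(pages)
    counts = Counter()
    for page in pages:
        # Count each distinct non-empty line once per page.
        counts.update({line.strip() for line in page.splitlines() if line.strip()})
    cutoff = threshold * len(pages)
    cleaned = []
    for page in pages:
        # Keep only lines that are not shared widely enough to look like a sidebar.
        kept = [line.strip() for line in page.splitlines()
                if line.strip() and counts[line.strip()] < cutoff]
        cleaned.append("\n".join(kept))
    return cleaned

# Three hypothetical post pages from one blog, each carrying the same blogroll line.
pages = [
    "Post about cameras\nBlogroll: Wordyard",
    "Post about politics\nBlogroll: Wordyard",
    "Post about Wordyard's new design\nBlogroll: Wordyard",
]
cleaned = strip_shared_lines(pages)
print(cleaned)
```

In this toy case the blogroll line is removed from every page, while the one post that actually discusses Wordyard keeps its mention. It also hints at why, as Hylton concedes, the approach doesn’t get every page right: a sidebar that changes slightly from page to page would slip past an exact-match comparison like this one.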

I asked him about Google Blog Search metrics and how many users were flowing in through the blog search engine’s main page, as opposed to Google News or the main engine, but he said company policy prevented him from commenting on such matters. Even if most of its blog search queries are flowing in from Google News, the site still poses a significant threat to Technorati’s market share.

An Emerging Business Model

Despite Technorati’s past reliability problems, the search engine has rolled out a number of changes recently, including much more stringent defenses against spam blogs, which have decreased the server burden dramatically. Carroll argued that changing the way Technorati utilizes blog tags has improved the quality of search as well. He said that outages are almost nonexistent these days and that search results are delivered almost three times faster (my unscientific analysis concludes there is some truth to this).

Perhaps even more importantly, the Technorati team — which after all these years numbers fewer than 50 people — thinks that it has figured out a way to successfully monetize the site: through a blog advertising network.

“About a year ago, with Richard [Jalichandra] coming on board [as CEO], we realized we had a tremendous amount of data that’s not only valuable to bloggers and their readers, but also has helped marketers achieve a lot of their goals,” Carroll said. “And over the last year we’ve been trying to refine the underpinnings of our search and crawling feature to make it more efficient and hopefully bring higher quality of results. But at the same time we’re seeing how we can focus on that same data stream toward ad targeting.”

As for competing head-on with Google, tiny Technorati would prefer not to go there.

“If you think about it, Google started as a search engine, and Google didn’t have a multi-billion dollar market share because of their search engine; they have that because of AdSense and AdWords, because they have the ability to take the data they have from the entire web and turn it into targeting criteria,” Carroll said. “We’re not trying to compete head to head with Google, but in a similar way in the blogosphere we can take the tremendous amount of data that we have and produce meaningful features for bloggers, for their readers, to allow them to search and discover, but also to help marketers identify the right places to market, not just on Technorati.com, but across the entire blogosphere.”

In this sense, he said, the site’s “identity crisis” has finally come to a close and its employees feel they have a much clearer vision of how to move forward. As for my own search habits, I wandered away from Technorati over a year ago in favor of Google Blog Search. But lately I find myself returning more and more often to my original blog search engine of choice as its results continue to improve. It’s too early to tell whether Technorati will be able to deliver a knock-out punch in the blog search wars, or whether its blog ad network can compete with entrenched players in that field, but the company’s perseverance should be studied by other start-ups that must evolve to survive in Google’s massive shadow.

Simon Owens is a former newspaper journalist and an associate editor for MediaShift. He currently works as an online analyst for New Media Strategies. You can read more of his writing at his blog or contact him at simon[.]bloggasm [at] gmail.com.
