Google Announces Plans to Allow Users to Access Libraries
RAY SUAREZ: It is the beginning of a virtual global library online. Google, the world’s most popular search engine, announced today it plans to let people directly access millions of books online from several top libraries in America and abroad.
The university libraries included in the initial project are: Harvard, Oxford, Stanford, and Michigan. The database will also include books from the New York City Public Library. The amount of data available from each library will vary.
Readers will be able to type phrases and words into the Google search engine, and then be linked to sections of text from library books.
For more about the impact of this plan, we turn to Paul LeClerc, the president and chief executive of the New York Public Library; and Jason Pontin; he’s the editor-in-chief of the MIT Technology Review.
Paul LeClerc, let’s start with you. If I’m a high school kid writing an American history paper, a hobbyist or enthusiast looking for my passion on line, what will the creation of this place mean for me?
PAUL LE CLERC: It’s going to mean an enormous amount because for the first time the New York Public Library and many other libraries are going to be able to bring very, very substantial portions of their collections, for us those in the public domain, to a worldwide audience seven days a week, twenty-four hours a day.
And that strikes me as being the beginning of a very substantial revolution in the way that we can distribute information to a global audience.
RAY SUAREZ: Well, a colleague of yours said today, Daniel Greenstein from the California Digital Library of the U.C. system, that our world is about to change in a big, big way. What does it mean for the New York City Public Library to be available in this way?
PAUL LE CLERC: Well, the world is going to change and the world has been changing very dramatically over the course of the last ten years, minimally ever since the World Wide Web came into being and libraries started, as we did, to put our collections on line, but this is going to blow open things in ways that are hard to even imagine I think over the course of the next decade.
On the one hand, libraries are going to continue, at least our library is going to continue to do the traditional things of collecting massive amounts of information on paper and making that available to readers in reading rooms.
So we’re not abandoning the traditional functions of a library that have been around for five thousand years.
At the same time, somebody who for one reason or another can’t come to the library, doesn’t have the time, doesn’t have the capacity for a variety of different reasons, is going to be able to enter into a collection and read it and search through it in a highly efficient free manner, and so what we’re really talking about here is the liberation of enormous amounts of information and a provision of that information instantaneously to audiences all over New York, all over the region, all over America, and really all over the world.
RAY SUAREZ: Jason Pontin, we’re talking about scanning millions upon millions of books and making them available through a search engine. What’s in it for Google? How does this match with their business model of selling ads and guiding people to Web pages?
JASON PONTIN: Well, Ray, when Google went public this year, they promised to organize all information, and at the time it was thought to be at the most geek bravado, at best a plea for a better public offering.
But it turns out that they meant it. Sergey Brin, one of the founders of Google, once told me he hoped Google could be like the mind of God, everywhere and knowing everything. And that’s their goal. They want to organize everything.
For them, the business goal is that whenever you do a search, they’ll earn revenues by also providing through complicated algorithms links to advertisers, as well, but more than that, I think for Google it’s their pet project. They came out of Stanford’s library program.
RAY SUAREZ: This is going to be a very expensive thing, digitizing all those books. Is that initial public offering money part of what’s paying for this?
JASON PONTIN: Well, they earned $2 billion in that public offering. It’s very interesting. If you talk to technologists, they would say in order to search the public Web, you only need say 3,000, 8,000 computers doing the scanning, but over the last two years, Google has accumulated 250,000 very powerful computers.
At the time, no one quite knew what it was for. People talked about toys for the boys. Now we know what it’s for. It’s to create this universal library, which Paul and you have been discussing.
RAY SUAREZ: You make this thing that democratizes knowledge, as Paul LeClerc was talking about. But I’m still not clear on how Google downstream makes any money from having done this, for making it possible for you to sort of rifle the stacks at Oxford or Stanford.
JASON PONTIN: Sure. Google’s business model is actually very simple. It earns very little money from providing their search services to third parties like libraries. In this case, as you say, they’re doing it for free.
When you go on Google and do a search, look to the right-hand side of the screen. You’ll see a number of advertisers pop up. Those advertisers have paid for what’s called that relevance, so let’s say I wanted to go and buy a car on the… or in the case of the library, I wish to go and read Melville, I might be taken to Amazon’s site on the right, as well, because Amazon would have an interest in letting me buy the book myself.
About 50 percent of Google’s revenues have always derived from advertising, and they anticipate it will simply continue.
RAY SUAREZ: So Paul LeClerc, I walk through the portal that Google provides. It takes me to a book I’m looking for. Can I find the whole book? Does it exist on your shelves? Or am I only going to get a taste or an excerpt?
PAUL LE CLERC: Well, the file that will be created through this partnership is going to exist in two different places. On the one hand, it will be on the New York Public Library Web site, and the full text will be available to readers there.
What Google provides is an enormously important capacity to search the text for key words and images and so on and so forth. And I think that on the Google site, one will have a portion of the book that contains the kinds of things that you’re interested in.
Supposing you want to know what Shakespeare had to say about peace or war or love; you type in key words and do a search of all the things that will be online that have to do with Shakespeare or commentaries on Shakespeare, and you can read hundreds, thousands, tens of thousands of documents or have them read for you by the search engine and pull out all the things that are relevant to what you’re interested in.
So it’s a very different kind of reading and it’s a very different kind of service than the traditional one. So these files will exist in two places and can be used two ways: Reading the full text, if you like to, on the New York Public Library Web site, or searching that text in many, many other texts through the facility that’s made possible by Google and Google’s search engine.
RAY SUAREZ: Should people who write books or people who sell books be worried about this?
PAUL LE CLERC: For those of us who write books, and I do myself, the kind of efficiency that this is going to provide is simply staggering in its importance because the traditional ways of doing a search are going to be transformed very, very dramatically.
And from my vantage point and the vantage point of the New York Public Library that is in the business of giving information away to the broadest audience conceivable, increasing the efficiency by which that information can be delivered is an important part of our mission, and is a very, very, is a key element to us in this particular partnership.
RAY SUAREZ: Jason Pontin, same question. If you want to write a book that people are going to go buy and they can just sit at home in their pajamas and read it for free, should you be worried if you’re one of those businesses also on line that’s going to sell people books, instead, should you be worried?
JASON PONTIN: It’s not clear yet exactly how this new service will play out. Google already has a product called Google Print that offers access in part to copyrighted text.
I imagine that they will do at least initially no more than that. So books that are already in the public domain will be publicly available. Copyright is the essence of intellectual creation, and I would be astonished if Google were to damage that in any significant way.
RAY SUAREZ: Will assembling these massive databases require the development of new technologies, new machines for turning all these books into digital form?
JASON PONTIN: That’s an interesting question. Google has actually been rather reticent to say how it’s going to be done, except to promise that they will not damage any of the books as they scan them in.
Now, we know that in the Google headquarters, the entire Stanford Library is going to be carted in, in large trucks. They claim they’re going to do 50,000 pages a day from Stanford. That sounds like an enormous amount, but when you’re talking about moving 15 million books online, it could be quite slow.
RAY SUAREZ: So, quickly, before we go, Paul LeClerc, we’re looking at what, five, seven, ten years before your entire library is on line?
PAUL LE CLERC: Well, what we’ve signed now is an agreement to do a pilot project. And we see this as a vestibule.
And at the end of the vestibule is a very, very big room that contains the public domain material of the New York Public Library that we could decide to put online in the partnership. So we’re doing some tests; we’re getting our feet wet. We’re experimenting.
We see this as a very exciting initiative and if we went for a very big project that would put millions of public domain volumes in the New York Public Library’s collections on line, that’s a multi-year effort of enormous consequence, simply enormous consequence to the world of learning and the world of information distribution.
RAY SUAREZ: Gentlemen, thank you both.
PAUL LE CLERC: Thank you.
JASON PONTIN: Thank you.