The victory by IBM's Watson over all-time champions Ken Jennings and Brad Rutter that viewers of Jeopardy! saw this week is a milestone for those of us in the field of computer science known as Question Answering, or QA. It's not that we didn't see this coming; as a consultant for IBM's DeepQA team, I've seen Watson beat too many qualified opponents in evaluation rounds to be surprised by this outcome. But Watson is the first system with enough speed, accuracy and confidence to compete with humans at a real-time QA task like Jeopardy!.
Watson in training. Image courtesy of IBM.
The drama of this "man vs. machine" match gives our field a higher public profile and a jolt of credibility that will help us to promote a more effective way of interacting with computers using natural language.
Question Answering isn't a new line of research. The idea of asking a question of a computer in English and receiving a precise answer is something that people, and science fiction writers in particular, have thought about since the dawn of the computer age. One of the earliest QA systems, LUNAR, was built by Bill Woods and his team at Bolt, Beranek and Newman in the 1970s to help scientists retrieve data about Moon rocks. More recently, the explosion of the Internet and online information has led to great advances in document retrieval, and search engines like Google have become our default means of access to online text.
The big difference between a QA system like Watson and a search engine like Google is that Watson can read the text for you and provide a precise answer. Google will just give you a list of documents it thinks might contain the answer; you have to do the reading and answer-spotting yourself.
It's easy to see why question answering is a more effective way for humans to retrieve specific pieces of information. But until Watson came along, QA systems were typically too slow, too limited to a specific area of knowledge, or too inaccurate to perform well on an unrestricted, real-time QA task like Jeopardy!.
The significance of Watson goes beyond public perception to include some real technical advances. Like other QA systems, Watson isn't a single computer program, but a very large number of programs running simultaneously on different computers that communicate with each other. To answer Jeopardy! clues quickly and accurately, Watson must generate thousands of possible answers; for each candidate answer, Watson reads through additional text related to that answer to determine if it's correct. An old-fashioned QA system that checks one answer at a time would take hours to answer a Jeopardy! clue, whereas Watson can answer a clue in just a few seconds.
This parallel approach to checking candidate answers represents an important breakthrough for question answering systems because it allows QA applications to boost accuracy (by checking more possible answers) without sacrificing response time. The good news is that DeepQA, the technology behind Watson, isn't specific to Jeopardy!, and is already being used to develop new QA systems for business applications like health care management.
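To make the contrast between checking one answer at a time and checking many at once concrete, here is a minimal sketch in Python. It is not Watson's actual code: the evidence table, the function names, the term-counting scorer and the artificial delay are all assumptions made up for illustration, and a thread pool on one machine stands in for the many cooperating computers described above.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy evidence store: candidate answer -> supporting passages.
EVIDENCE = {
    "Chicago": ["O'Hare airport is named for a World War II hero.",
                "Midway airport is named for a World War II battle."],
    "Toronto": ["Pearson airport is named for a prime minister."],
    "New York": ["LaGuardia airport is named for a mayor."],
}

def score_candidate(candidate, clue_terms, delay=0.05):
    """Toy scorer: count how many clue terms appear in the candidate's passages."""
    time.sleep(delay)  # stand-in for the slow work of fetching and reading text
    text = " ".join(EVIDENCE.get(candidate, [])).lower()
    return candidate, sum(term in text for term in clue_terms)

def answer_sequential(candidates, clue_terms):
    """Check candidates one at a time; total time grows with the candidate count."""
    scored = [score_candidate(c, clue_terms) for c in candidates]
    return max(scored, key=lambda pair: pair[1])

def answer_parallel(candidates, clue_terms, workers=8):
    """Check candidates concurrently; total time is close to one check."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scored = list(pool.map(lambda c: score_candidate(c, clue_terms), candidates))
    return max(scored, key=lambda pair: pair[1])

if __name__ == "__main__":
    terms = ["hero", "battle", "world war ii"]
    candidates = list(EVIDENCE)
    for solver in (answer_sequential, answer_parallel):
        start = time.perf_counter()
        best = solver(candidates, terms)
        print(solver.__name__, best, f"{time.perf_counter() - start:.2f}s")
```

Both functions pick the same best-scoring candidate; only the elapsed time differs. With three candidates the gap is small, but with thousands of candidates and genuinely expensive evidence checks it is the difference between hours and seconds.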
Watson has been designed to answer general-knowledge questions that are asked in the tricky, peculiar Jeopardy! style. But the greatest potential for QA systems in general lies in navigating areas of deep, specialized knowledge. IBM sees a market for QA in health care, but these systems also could be used to unlock the complexities of legal databases or those of various scientific disciplines. We are generating huge amounts of unstructured text knowledge (text that must be read to be understood) at unprecedented rates today, and the need for high-quality question answering to help us pinpoint the answers we need has never been greater.
Watson's unique achievement provides compelling evidence that QA systems stand ready to meet the challenge, but it also underscores the need for further research advances. Anyone who has studied Watson's matches knows that it gives wrong answers that no human contestant would utter, and that many Jeopardy! categories elude Watson almost entirely. We also can anticipate that Watson's success will push researchers to find ways to rapidly and cost-effectively produce systems with Watson's level of performance that are adapted to new domains of knowledge.
Fundamental advances in question analysis and answer scoring, along with machine learning tools for rapid domain adaptation, are hot research topics. No one is more inspired by Watson's achievements than those of us in the QA research community, and that inspiration will drive us forward as we strive to attain the full potential of this technology.
Publicist's note: For the inside story on Watson, watch NOVA's Smartest Machine on Earth online.