How IBM Built Watson, Its 'Jeopardy'-Playing Supercomputer


It appears not even computers are immune from getting their proverbial 15 minutes of fame.

In this case, the IBM (IBM) supercomputer Watson will get its chance to grab the spotlight later this month. That's when it goes head-to-head with Jeopardy's two record-holders, Ken Jennings and Brad Rutter, in a classic man-against-machine contest. But beyond its near-human-like ability to sift through reams of information, pick out the relevant parts and determine the likelihood of an answer being correct, Watson's brush with fame may help IBM attract more customers to the growing area of business analytics software.

Watson, built from 90 IBM Power 750 servers laced together, intones its answers in a robotic voice coming from a black rectangular box with a flashing globe, stationed between the two Jeopardy rivals. Its cluster of computers, the size of eight refrigerators, is stationed behind them. Watson is no mere wannabe: It won its practice round of the TV game show last month.

IBM engineers designed Watson to show how computer systems can analyze and process natural language, and reach predictions or answers. And much like humans, Watson relies heavily on context.

For example, take the word January. Most folks would assume that word's topic involves a month of the year. But how about the word Ramadan? People would need to consider the context in which it's used to realize that it, too, involves a month of the year.

Learning From Its Mistakes

"While we can get computers to follow complex reasoning tasks, it's very difficult to get them to do that over natural language," says David Ferrucci, principal investigator of Watson DeepQA technology for IBM Research. "And yet, that's where all the knowledge is, because all human knowledge is sort of naturally and prolifically created by humans to communicate with other humans."

He says Watson's servers can handle processing 500 gigabytes of information a second, the equivalent of 1 million books, with its shared computer memory. And according to Ferrucci, Watson's software is wired for more than handling natural language. Its machine learning allows the computer to become smarter as it tries to answer questions -- and to learn as it gets them right or wrong.

"It wasn't just any one algorithm that popped up and created change, but rather, I think you're seeing a combination of things happen," Ferrucci says. "Computers have access to so much language. And when you have access to lots and lots of language, you start to learn that these five paragraphs, for example, really say the same thing. To me that's important because my computer starts to learn semi-automatically, or completely automatically."

Training With Off-the-Shelf Info

Watson also has the capacity for knowledge representation and reasoning, which is essentially learning by context and employing language flexibility, like the intellectual flexibility needed in the January and Ramadan example. And, of course, it's also packed with deep analytics software for data-mining.

All this allows Watson's cluster of computers and stack of interlinking software to run a number of computations simultaneously, drawing on the bandwidth of some, or all, of the computers.

Prepping Watson to play Jeopardy required an additional layer of work. It began by pumping the machine up with off-the-shelf, as-is information from encyclopedias, dictionaries, plays and books.

"Watson had to be self-contained, so it couldn't be connected to anything else that wasn't right there in that room," Ferrucci says. He adds that the challenge for Watson was managing its own information, so it knew what it needed and what it didn't need. The trade-off was the more information Watson needed to compute, the more power it required.

"With Jeopardy, you can't anticipate all the questions that might be asked because the domain is very broad," Ferrucci says, noting it's also difficult to determine how the questions will be asked. "There is no simple template where everything fits in nicely, and you can query some database."

Watson is hard-wired into the electronic Jeopardy boards, which send it each question as text, rather than relying on voice-recognition software to capture the question, Ferrucci says. Once the computer decides it wants to answer a question, Watson's hardware lets it push the Jeopardy button to answer.
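
To picture that decision step, here is a minimal sketch in Python. It is not IBM's code, and the article does not describe Watson's actual logic. The names Candidate and choose_answer, the 0.5 threshold and the sample scores are all assumptions made up for illustration. The idea is simply that each candidate answer carries an estimated likelihood of being correct, and the machine buzzes only when its best candidate clears a cutoff.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    """A hypothetical candidate answer with an estimated chance of being right."""
    answer: str
    confidence: float  # assumed to be a probability between 0.0 and 1.0


def choose_answer(candidates: Sequence[Candidate],
                  threshold: float = 0.5) -> Optional[str]:
    """Return the most confident answer if it clears the threshold.

    Returning None stands for "do not buzz in". The 0.5 default is an
    arbitrary illustrative value, not a figure from IBM.
    """
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c.confidence)
    return best.answer if best.confidence >= threshold else None


# Made-up scores for a clue about a month of the year.
example = [Candidate("Ramadan", 0.82), Candidate("January", 0.11)]
decision = choose_answer(example)
```

With these invented scores the sketch would pick "Ramadan"; if no candidate reached the cutoff, it would stay silent rather than risk a wrong answer.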

Here's where Jeopardy players Jennings and Rutter will have a slight advantage: By actually listening to the question being asked, they can anticipate the answer and buzz in immediately, Ferrucci says. Although Watson can't listen to the question and become trigger-happy with anticipation in pushing the button, its machine learning should make it more consistent in getting right answers.

The Supercomputer as Super Ad

Watson's participation in the Jeopardy challenge is expected to boost IBM's software and services business, analysts say. The company posted revenues of $29 billion in the fourth quarter, of which software brought in $7 billion and global technology services roughly a third. And business analytics revenue across Big Blue's software and services operations rose 19% over the previous year.

"Watson will help promote IBM's business intelligence, data analytics and data-warehousing software sales. It's all the software you want to use as a business solution," says Louis Miscioscia, an analyst with Collins Stewart.

He says Watson's expected high-profile performance next week will provide business executives with a demonstration of how its technology can query data and provide answers to complicated questions.

Having a computer handle natural language is a technology with broader implications, Ferrucci says. "It's getting access and learning how to reason over all these other kinds of content that are linked to knowledge. . . . That's why we got into this."

For example, Ferrucci says, Watson will serve as a way to tout IBM's analytics portfolio -- to show how unstructured information can be mined and cross-correlated in a big way. He also notes that Watson's work will be a boon for IBM's information management and artificial intelligence, demonstrating how data presented in natural language can be retrieved within seconds.

Heading to the Server Farm?

Ferrucci declines to speculate on any future revenue Watson may generate for IBM, other than to note that Watson won't make money from Jeopardy. The impact on information management will come more from how Watson solves Jeopardy questions.

And what's in store for Watson after Jeopardy? The server farm?

"We hope to continue to use Watson for research and to help with the adaption of health care and other industries," Ferrucci says. "We're not going to put him out to stud."
