You have to hand it to IBM: they know how to put on a good show. As their Watson system won at Jeopardy, human champion Ken Jennings ruefully conceded: “I for one welcome our new computer overlords.” If you want to buy a working overlord for yourself, IBM senior consultant Tony Pearson reckons it will set you back $3 million. That buys you 2,880 POWER7 processor cores, sixteen Terabytes of RAM, the Apache Hadoop distributed file system, Apache UIMA (Unstructured Information Management Architecture) framework, IBM’s DeepQA software and the SUSE Linux Enterprise Server 11 operating system.
The secret of Watson’s success is its eclecticism. It uses its massive parallelism to run hundreds of algorithms simultaneously on the question it’s asked and possible answers. Some of the algorithms are shallow and keyword based; others use deep knowledge of sentence structure and real-world semantic relationships. In the early days of AI this was called a blackboard architecture: a question would be posed to the system; competing processes posted their best guesses as to what it meant and how to respond onto an internal software ‘blackboard’; other processes would then run with those results, pruning and combining; finally critic-processes would score the results and if there was a compelling consensus, would deliver the reply to a waiting world.
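To make the blackboard idea concrete, here is a minimal toy sketch in Python. It is not Watson's or DeepQA's actual code: every name in it (`Blackboard`, `Hypothesis`, `keyword_proposer`, `structure_proposer`, `combine`, `critic`, the `FACTS` table and the thresholds) is invented for illustration, and the proposers run one after another rather than in parallel.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Hypothesis:
    answer: str
    source: str   # which process posted it
    score: float  # that process's confidence


@dataclass
class Blackboard:
    question: str
    hypotheses: List[Hypothesis] = field(default_factory=list)

    def post(self, hyp: Hypothesis) -> None:
        self.hypotheses.append(hyp)


# Hypothetical stand-in for a real knowledge source.
FACTS = {"capital of france": "Paris"}


def keyword_proposer(board: Blackboard) -> None:
    """Shallow process: reacts to keywords only, with low confidence."""
    q = board.question.lower()
    if "france" in q:
        board.post(Hypothesis("Paris", "keyword", 0.4))
    if "capital" in q:
        board.post(Hypothesis("London", "keyword", 0.2))


def structure_proposer(board: Blackboard) -> None:
    """Deeper process: uses the question's structure, then looks up a fact."""
    q = board.question.lower().rstrip("?")
    prefix = "what is the "
    if q.startswith(prefix) and q[len(prefix):] in FACTS:
        board.post(Hypothesis(FACTS[q[len(prefix):]], "structure", 0.9))


def combine(board: Blackboard) -> None:
    """Merge duplicate answers by summing the evidence behind each one."""
    totals = defaultdict(float)
    for hyp in board.hypotheses:
        totals[hyp.answer] += hyp.score
    board.hypotheses = [Hypothesis(a, "combined", s) for a, s in totals.items()]


def critic(board: Blackboard, min_score: float = 0.5,
           min_share: float = 0.6) -> Optional[str]:
    """Reply only if one answer has enough evidence and clearly dominates."""
    if not board.hypotheses:
        return None
    total = sum(h.score for h in board.hypotheses)
    best = max(board.hypotheses, key=lambda h: h.score)
    if best.score >= min_score and best.score / total >= min_share:
        return best.answer
    return None


def answer(question: str) -> Optional[str]:
    board = Blackboard(question)
    for proposer in (keyword_proposer, structure_proposer):
        proposer(board)  # competing processes post their guesses
    combine(board)       # later processes prune and combine
    return critic(board) # the critic decides whether to reply


print(answer("What is the capital of France?"))
print(answer("Name a capital"))
```

In this sketch the first question gets a reply because two independent processes agree on the same answer, while the second gets none because the only evidence is a single weak keyword guess. Declining to answer when there is no compelling consensus is the same behaviour described above.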
Watson implements this approach on its Apache open-source platform, which makes it easy to deploy hundreds of UIMA annotator components that handle every aspect of the processing. Some of these are Jeopardy-specific while others are completely general-purpose. A detailed description of the Watson system is contained in this PDF article. The Wikipedia article is also helpful.
You have to feel sorry for Doug Lenat over at the Cyc project. For 27 years his team has been labouring to encode the knowledge of the world into a knowledge-representation format which can drive an inference engine. You might call it the ultimate expert system. In less than three years, IBM seems to have overtaken them, and the IBMers probably see Cyc as just one more annotator they can plug into their UIMA platform. IBM will now work both to reduce the cost of the Watson platform and to adapt it to other subject domains. We have already heard about applications in medicine and the law, but any area where there is a large and authoritative corpus of material should work.
Watson tells us little about human psychology and seems to have generated few deep conceptual insights. This is not to belittle the monumental engineering challenges the IBM team successfully overcame. What Watson truly illustrates is AI pioneer Ed Feigenbaum’s justly famous 1960s observation: “In the knowledge lies the power.”