THE AGE of INTELLIGENT MACHINES | Knowledge Processing–From File Servers to Knowledge Servers
February 21, 2001
- Edward Feigenbaum
This chapter from The Age of Intelligent Machines (published in 1990) addresses the history and development of AI, and where it was headed, circa 1990.
Edward Feigenbaum is a Professor of Computer Science and Co-Scientific Director of the Knowledge Systems Laboratory at Stanford University. Dr. Feigenbaum served as Chief Scientist of the United States Air Force from 1994 to 1997.
It has been said that when people make forecasts, they overestimate what can be done in the short run and underestimate what can be achieved in the long run. I have worked in the science and technology of artificial intelligence for twenty years and confess to being chronically optimistic about its progress. The gains have been substantial, even impressive. But we have hardly begun, and we must not lose sight of the point to which we are heading, however distant it may seem.
We are beginning the transition from data processing to knowledge processing. The key tool of our specialty is the digital computer, the most complex and yet the most general machine ever invented. Though the computer is a universal symbol-processing device, we have exploited to date only its mundane capabilities to file and retrieve data (file service) and to do high-speed arithmetic. Researchers in artificial intelligence have been studying techniques for computer representation of human knowledge and the methods by which that knowledge can be used to reason toward the solution of problems, the formation of hypotheses, and the discovery of new concepts and new knowledge. These researchers have been inventing the knowledge servers of our future.
Like all creators, scientists and technologists must dream, must put forth a vision, or else they relegate their work to almost pointless incrementalism. My dream is about the future of AI research and development over the next several decades and the knowledge systems that can be produced thereby to assist the modern knowledge worker.
The Beginnings of the Dream
Fifty years ago, before the modern era of computation began, Turing’s theorems and abstract machines gave hint of the fundamental idea that the computer could be used to model the symbol-manipulating processes that make up the most human of all behaviors: thinking. More than thirty years ago the work began in earnest (1991 will mark the thirty-fifth anniversary of the Dartmouth Summer Conference on Artificial Intelligence). The founding principle of AI research is really an article of faith that the digital computer has the necessary and sufficient means for intelligent action. This first principle is called the physical-symbol-system hypothesis.
The early dreaming included dreams about intelligent behavior at very high levels of competence. Turing speculated on wide-ranging conversations between people and machines and on chess playing programs. Later Newell and Simon wrote about champion-level chess programs and began their work toward that end. Samuel (checker playing), Gelernter (geometry-theorem proving), and others shared the dream.
At Stanford, Lederberg and I chose reasoning in science as our task and began work with Buchanan and Djerassi on building a program that would elucidate chemical structure at a high level of competence: the DENDRAL program. What emerged from the many experiments with DENDRAL was an empirical hypothesis that the source of the program’s power to figure out chemical structures from spectral data was its knowledge of basic and spectral chemistry. For DENDRAL, knowledge was power. Obvious? In retrospect, perhaps. But the prevailing view in AI at the time ascribed power to the reasoning processes (in modern terms, to the inference engine, not the knowledge base). Thus, in the late 1960s the knowledge-is-power hypothesis stood as a counter-hypothesis awaiting further tests and the accumulation of evidence.
Much evidence came in the 1970s. Medical problem solving provided the springboard. The MYCIN program of Shortliffe and others at Stanford was the prototype of the expert-level advisory (or consultation) system. The core of MYCIN was its knowledge base of rules for the diagnosis and therapy of infectious diseases. Its reasoning process was simple (backward chaining), even ad hoc in parts. But MYCIN was built as an integrated package of intellectual abilities. It could interact with a professional in the professional jargon of the specialty. It could explain its line of reasoning. And it had a subsystem that could aid in the acquisition of new knowledge by guiding an expert to find defects in the stored knowledge. Overall, MYCIN provided strong confirmation to the knowledge-is-power hypothesis.
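The backward-chaining style of reasoning mentioned above can be illustrated with a minimal sketch. The rules, fact names, and the `prove` function below are invented for illustration and are not MYCIN's actual knowledge base or code; MYCIN also attached certainty factors to its rules, which this sketch leaves out. The point is only that the reasoning loop is short and generic, while everything domain-specific lives in the rule table.

```python
# Hypothetical toy rules: each goal maps to a list of alternative premise sets.
# These are illustrative only and are not taken from MYCIN.
RULES = {
    "prescribe_penicillin": [["infection_is_strep", "no_penicillin_allergy"]],
    "infection_is_strep": [["gram_positive", "grows_in_chains"]],
}


def prove(goal, facts, rules=RULES, trace=None):
    """Return True if `goal` follows from `facts` by backward chaining.

    Starts from the goal and works backward to known facts. Each fact or
    rule that succeeds is appended to `trace`, giving a crude line of
    reasoning. Assumes the rule set is acyclic (no loop detection).
    """
    if trace is None:
        trace = []
    if goal in facts:
        trace.append(f"{goal}: known fact")
        return True
    # Try each rule that could conclude the goal; premises become subgoals.
    for premises in rules.get(goal, []):
        if all(prove(p, facts, rules, trace) for p in premises):
            trace.append(f"{goal}: concluded from {', '.join(premises)}")
            return True
    return False


if __name__ == "__main__":
    facts = {"gram_positive", "grows_in_chains", "no_penicillin_allergy"}
    steps = []
    print(prove("prescribe_penicillin", facts, trace=steps))
    for step in steps:
        print(" ", step)
```

Adding a new disease or therapy to this sketch means adding entries to `RULES`; the `prove` function itself does not change, which is one small way of seeing why the power is ascribed to the knowledge base rather than to the inference procedure.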
At nearly the same time other efforts in medical problem solving were providing similar results. At the University of Pittsburgh the focus of the Internist project was the construction of an enormous electronic textbook of the knowledge of internal medicine. With its current knowledge base of 572 diseases, nearly 4,500 manifestations, and hundreds of thousands of links between them, Internist has provided the strongest confirmation yet of the knowledge-is-power hypothesis.
In the late 1970s an explosion of expert systems was taking place in fields other than medicine: engineering, manufacturing, geology, molecular biology, financial services, diagnostic servicing of machinery, military signal processing, and many other areas. There is little that ties these areas together other than this: in each, high-quality problem solving is guided by experiential, qualitative, heuristic knowledge. The explosion of applications created a new type of professional, the knowledge engineer (now in extremely short supply) and a new industry, the expert systems industry (now rapidly expanding). One generalization from the frenzy of activity is simply massive additional confirmation of the knowledge-is-power hypothesis. The reasoning procedures associated with all of these systems are weak. Their power lies in their knowledge bases.
Other areas of AI research made shifts to the knowledge base viewpoint. It is now commonplace to say that a program for understanding natural language must have extensive knowledge of its domain of discourse, that a vision program for image understanding must have knowledge of the world it is intended to see, and even that learning programs must have a substantial body of knowledge from which to expand (that is, learning takes place at the fringes and interstices of what is already known). Thus, the dream of a computer that performs at a high level of competence over a wide variety of tasks that people perform well seems to rest upon knowledge in the task areas.
The knowledge-is-power hypothesis has received so much confirmation that we can now assert it as the knowledge principle:
A system exhibits intelligent understanding and action at a high level of competence primarily because of the specific knowledge that it contains about its domain of endeavor.
A corollary to the knowledge principle is that reasoning processes of an intelligent system, being general and therefore weak, are not the source of power that leads to high levels of competence in behavior. The knowledge principle simply says that if a program is to perform well, it must know a great deal about the world in which it operates. In the absence of knowledge, reasoning won’t help.
The knowledge principle is the emblem of the first era of artificial intelligence; it is the first part of the dream. It should inform and influence every decision about what it is feasible to do in AI science and with AI technology.
The Middle of the Dream
Today our intelligent artifacts perform well on specialized tasks within narrowly defined domains. An industry has been formed to put this technological understanding to work, and widespread transfer of this technology has been achieved. Although the first era of the intelligent machine is ending, many problems remain to be solved.
One of these is naturalness. The intelligent agent should interact with its human user in a fluid and flexible manner that appears natural to the person. But the systems of the first era share with the majority of computer systems an intolerable rigidity of stylistic expression, vocabulary, and concepts. For example, programs rarely accept synonyms, and they cannot interpret and use metaphors. They always interact in a rigid grammatical straitjacket. The need for metaphor to induce in the user a feeling of naturalness seems critical. Metaphorical reference appears to be omnipresent and almost continuous in our use of language. Further, if you believe that our use of language reflects our underlying cognitive processes, then metaphor is a basic ideational process.
In the second era we shall see the evolution of the natural interface. The processes controlling the interaction will make greater use of the domain knowledge of the system and knowledge of how to conduct fluid discourse. Harbingers of naturalness already exist; they are based to a large extent upon pictures. The ONCOCIN project team at Stanford invested a great effort in an electronic flow sheet to provide a seamless transition for the oncologist from paper forms for patient data entry to electronic versions of these forms. The commercially available software tools for expert-system development sometimes contain elegant and powerful packages for creating pictures that elucidate what the knowledge system is doing and what its emerging solution looks like (for example, IntelliCorp’s KEE Pictures and Active Images).
Naturalness need not rely upon pictures, of course. The advances in natural-language understanding have been quite substantial, particularly in the use of knowledge to facilitate understanding. In the second era it will become commonplace for knowledge systems to interact with users in human language, within the scope of the system’s knowledge. The interaction systems of the second era will increasingly rely on continuous natural speech. In person-to-person interactions, people generally talk rather than type. Typing is useful but unnatural. Speech-understanding systems of wide applicability and based on the knowledge principle are coming. At Stanford we are beginning experiments with an experimental commercial system interfaced with the ONCOCIN expert system.
A limitation of first-era systems is their brittleness. To mix metaphors, they operate on a high plateau of knowledge and competence until they reach the extremity of their knowledge; then they precipitously fall off to levels of utter incompetence. People suffer from the same difficulty (they too cannot escape the knowledge principle), but their fall is more graceful. The cushion for the soft fall is the knowledge and use of weaker but more general models that underlie the highly specific and specialized knowledge of the plateau. For example, if an engineer is diagnosing the failure of an electronic circuit for which he has no specific knowledge, he can fall back on his knowledge of electronics, methods of circuit analysis, and handbook data for the components. The capability for such model-based reasoning by machine is just now under study in many laboratories and will emerge as an important feature of second-era systems. The capability does not come free. Knowledge engineers must explicate and codify general models in a wide variety of task areas.
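The soft fall from specific knowledge to a general model can be sketched in a few lines. Everything here is hypothetical: the fault table, the circuit and component names, the `diagnose` function, and the 10 percent tolerance are assumptions chosen for illustration, and Ohm's law stands in for the much richer general models an engineer would actually use.

```python
# Hypothetical "plateau" of specific knowledge: known (circuit, symptom) pairs.
SPECIFIC_FAULTS = {
    ("amp_A1", "no_output"): "replace transistor Q3",
}

TOLERANCE = 0.10  # assumed acceptable relative deviation in current


def diagnose(circuit, symptom, readings):
    """Diagnose a fault, preferring specific knowledge over the general model.

    `readings` maps a component name to (volts, ohms, expected_amps).
    Returns ("specific", advice) when the fault table covers the case,
    otherwise ("general", suspects) using Ohm's law, I = V / R.
    """
    advice = SPECIFIC_FAULTS.get((circuit, symptom))
    if advice is not None:
        return ("specific", advice)
    # Fall back on the weaker but more general model: flag any component
    # whose computed current deviates too far from the handbook value.
    suspects = [
        name
        for name, (volts, ohms, expected_amps) in readings.items()
        if abs(volts / ohms - expected_amps) > TOLERANCE * expected_amps
    ]
    return ("general", suspects)


if __name__ == "__main__":
    print(diagnose("amp_A1", "no_output", {}))
    print(diagnose("unknown_board", "no_output",
                   {"R1": (5.0, 100.0, 0.05), "R2": (5.0, 100.0, 0.02)}))
```

The general branch gives a less precise answer (a list of suspects rather than a repair), but it gives some answer for a circuit the fault table has never seen, which is the difference between a graceful fall and a precipitous one.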
Task areas? But what if there is no “task”? Can we envision the intelligent program that behaves with common sense at the interstices between tasks or when task knowledge is completely lacking? Common sense is itself knowledge, an enormous body of knowledge distinguished by its ubiquity and the circumstance that it is rarely codified and passed on to others, as more formal knowledge is. There is, for example, the commonsense fact that pregnancy is associated with females, not males. The extremely weak but extremely general forms of cognitive behavior implied by commonsense reasoning constitute for many the ultimate goal in the quest for machine intelligence. Researchers are now beginning the arduous task of understanding the details of the logic and representation of commonsense knowledge and are codifying large bodies of commonsense knowledge. The first fruits of this will appear in the later systems of the second era. Commonsense reasoning will probably appear as an unexpected naturalness in a machine’s interaction with an intelligent agent. As an example of this in medical-consultation advisory systems, if pregnancy is mentioned early in the interaction or can be readily inferred, the interaction shifts seamlessly to understanding that a female is involved. Magnify this example by one hundred thousand or one million unspoken assumptions, and you will understand what I mean by a large knowledge base of commonsense knowledge.
As knowledge in systems expands, so does the scope for modes of reasoning that have so far eluded the designers of these systems. Foremost among these modes are reasoning by analogy and its sibling metaphorical reasoning. The essence of analogy has been evident for some time, but the details of analogizing have not been. An analogy is a partial match of the description of some current situation with stored knowledge. The extent of the match is crucial. If the match is too partial, then the analogy is seen to be vacuous or farfetched; if too complete then the “analogy” is seen as hardly an analogy at all.
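The idea that an analogy is a partial match whose extent matters can be made concrete with a small sketch. The overlap measure (Jaccard overlap of attribute-value pairs), the function names, and the two thresholds are assumptions made for illustration; they are one simple way to quantify "extent of match," not a claim about how any existing analogizing program works.

```python
def match_extent(current, stored):
    """Fraction of attribute-value pairs shared by two descriptions.

    Computed as the Jaccard overlap: shared pairs / all distinct pairs.
    Attribute values must be hashable.
    """
    a, b = set(current.items()), set(stored.items())
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def classify_match(current, stored, low=0.25, high=0.9):
    """Label a match using hypothetical thresholds `low` and `high`."""
    extent = match_extent(current, stored)
    if extent < low:
        return "farfetched"      # too partial to be a useful analogy
    if extent > high:
        return "near-identical"  # too complete to count as an analogy
    return "analogy"


if __name__ == "__main__":
    solar_system = {"center": "massive", "orbiters": "light",
                    "force": "gravity", "shape": "elliptical"}
    atom = {"center": "massive", "orbiters": "light",
            "force": "electromagnetic", "shape": "elliptical"}
    print(match_extent(atom, solar_system))
    print(classify_match(atom, solar_system))
```

A flat comparison of attribute-value pairs ignores relational structure, which is where much of the real difficulty of analogizing lies; the sketch only shows why both a lower and an upper bound on the match are needed.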
Analogizing broadens the relevance of the entire knowledge base. It can be used to construct interesting and novel interpretations of situations and data. It can be used to retrieve knowledge that has been stored, but not stored in the “expected” way. Analogizing can supply default values for attributes not evident in the description of the current situation. Analogizing can provide access to powerful methods that otherwise would not be evoked as relevant. In a famous example from early-twentieth-century physics, Dirac made the analogy between quantum theory and mathematical group theory that allowed him to use the powerful methods of group theory to solve important problems in quantum physics. We shall begin to see reasoning by analogy emerge in knowledge systems of the second era.
Analogizing is seen also as an important process in automatic knowledge acquisition, another name for machine learning. In first-era systems, adding knowledge to knowledge bases has been almost always a manual process: people codify knowledge and place it in knowledge structures. Experiments by Douglas Lenat have shown that this laborious process can be semi-automated, facilitated by an analogizing program. The program suggests the relevant analogy to a new situation, and the knowledge engineer fills in the details. In the second era we shall see programs that acquire the details with less or no human help. Many other techniques for automatic learning will find their way into second-era systems. For example, we are currently seeing early experiments on learning apprentices, machines that carefully observe people performing complex tasks and infer thereby the knowledge needed for competent performance. The second era will also see (I predict) the first successful systems that couple language understanding with learning, so that knowledge bases can be augmented by the reading of text. Quite likely these will be specialized texts in narrow areas at the outset.
To summarize, because of the increasing power of our concepts and tools and the advent of automatic-learning methods, we can expect that during the second era the knowledge bases of intelligent systems will become very large, representing therein hundreds of thousands, perhaps millions, of facts, heuristics, concepts, relationships, and models. Automatic learning will be facilitated thereby, since by the knowledge principle, the task of adding knowledge is performed more competently the more knowledge is available (the more we know, the easier it is to know more).
Finally, in the second era we will achieve a broad reconceptualization of what we mean by a knowledge system. Under the broader concept, the “systems” will be collegial relationships between an intelligent computer agent and an intelligent person (or persons). Each will perform tasks that he/she/it does best, and the intelligence of the system will be an emergent property of the collaboration. If the interaction is indeed seamless and natural, then it may hardly matter whether the relevant knowledge or the reasoning skills needed are in the head of the person or in the knowledge structures of the computer.
The Far Side of the Dream: The Library of the Future
Here’s a “view from the future,” looking back at our “present,” from Professor Marvin Minsky of MIT: “Can you imagine that they used to have libraries where the books didn’t talk to each other?” The libraries of today are warehouses for passive objects. The books and journals sit on shelves waiting for us to use our intelligence to find them, read them, interpret them, and cause them finally to divulge their stored knowledge. Electronic libraries of today are no better. Their pages are pages of data files, but the electronic pages are equally passive.
Now imagine the library as an active, intelligent knowledge server. It stores the knowledge of the disciplines in complex knowledge structures (perhaps in a knowledge-representation formalism yet to be invented). It can reason with this knowledge to satisfy the needs of its users. These needs are expressed naturally, with fluid discourse. The system can, of course, retrieve and exhibit (i.e., it can act as an electronic textbook). It can collect relevant information; it can summarize; it can pursue relationships. It acts as a consultant on specific problems, offering advice on particular solutions, justifying those solutions with citations or with a fabric of general reasoning. If the user can suggest a solution or a hypothesis, it can check this and even suggest extensions. Or it can critique the user viewpoint with a detailed rationale of its agreement or disagreement. It pursues relational paths of associations to suggest to the user previously unseen connections. Collaborating with the user, it uses its processes of association and analogizing to brainstorm for remote or novel concepts. More autonomously, but with some guidance from the user, it uses criteria of being interesting to discover new concepts, methods, theories, and measurements.
The user of the library of the future need not be a person. It may be another knowledge system, that is, any intelligent agent with a need for knowledge. Thus, the library of the future will be a network of knowledge systems in which people and machines collaborate. Publishing will be an activity transformed. Authors may bypass text, adding their increment to human knowledge directly to the knowledge structures. Since the thread of responsibility must be maintained, and since there may be disagreement as knowledge grows, the contributions are authored (incidentally allowing for the computation of royalties for access and use). Maintaining the knowledge base (updating knowledge) becomes a vigorous part of the new publishing industry.
Photo by Lou Jones www.fotojones.com