A Formula for Intelligence: The Recursive Paradigm
August 6, 2001 by Ray Kurzweil
An explanation of the recursive approach to artificial intelligence, written for “The Futurecast,” a monthly column in the Library Journal.
Originally published September 1992. Published on KurzweilAI.net August 6, 2001.
There is a profound satisfaction in simple explanations that can truly account for complicated phenomena. The search for unifying formulas has been a goal of science since its inception with the Ionian Greeks 25 centuries ago.
Is there a formula that describes, explains, or underlies intelligence? At first, the answer might appear to be an obvious “No.” We have not been entirely successful in even defining intelligence, much less in expressing it in a formula, a set of laws, or models. Intelligence would seem to be too complex for such reduction.
We should not be too hasty, however, in rejecting the idea of simple paradigms underlying the undeniably complex phenomenon of intelligence. Let us examine the playing of chess as an example of an intelligent activity. Chess is often used as a prototype for examining issues in intelligence. Scientist Raj Reddy cites studies of chess as playing the same role in artificial intelligence (Al) as studies of the bacterium E. Coli play in biology: an ideal laboratory for studying fundamental questions. I will describe a strikingly simple method for playing an outstanding game of chess. We will then see that the method expands to a remarkably broad array of intelligent tasks.
Chess is a game of two opponents in which the players take turns adjusting the positions of their pieces on a playing board according to prescribed rules. Of course, many other games can be described in the same way. Many activities in real life, such as war and business, have similar characteristics. Indeed, chess was originally formulated as a model of war.
To win, or provide the highest probability of winning, one selects the best possible move every time it is one’s turn to move. The question, then, is what is the best move? Our “recursive” method provides the following rule, which, if you follow it carefully, will enable you to play an exceptionally good game of chess:
Every time it is your move, select your best move on the assumption that your opponent will do the same.
As I believe will become clear, this rule is all that is needed to play an excellent game of chess. If this is so, then we might conclude either that the recursive formula is a powerful and deceptively simple formula for the algorithm of at least some forms of intelligence, or, alternatively, that chess is not an intelligent game.
What you gonna call?
Before delving further into the implications of the recursive formula, let us examine how it works. We fashion a program called “Pick Best Move.” When it is called, its job is to pick the best move. It is what we call a recursive program because it is capable of calling itself. This capability, called recursion, is perfectly feasible in modern programming languages and is commonly used in AI applications.
In order for Pick Best Move to select the best move, it must obviously consider each move, and thus it needs to generate a list of all possible moves. It does this by simply applying the rules of chess. It thus includes a mechanism programmed with the rules of chess to generate its options. To play a different game, we simply replace this module with another one programmed with the rules of that particular game.
We now have a list of possible moves. We examine each in turn and ask the question: Does this move enable me to win or lose? If the answer is lose, we do not select that move. If the answer is win, we take that move. If more than one move enables us to win, it does not matter which one we take.
The problem now reduces to answering the question: Does this move enable me to win or lose? At this point we note that our winning or losing is affected by what our opponent might do. A reasonable assumption is that our opponent will also choose his or her (or its) best move. We need to anticipate what that move might be, so we use our own ability to select the best move to determine what our opponent is likely to do. In this we are following the part of the recursive rules that states, “Select the best move on the assumption that your opponent will do the same.”
Pick Best Move
Our program is now structured as follows. We generate a list of all possible moves allowed by the rules. We examine each possible move in turn. For each move, we generate a hypothetical board representing what the placement of the pieces would be if we were in fact to make this move.
How are we to do this? It turns out we have a program that is designed to do exactly that. It is called “Pick Best Move.” Pick Best Move is, of course, the program we are already in, so this is where the recursion comes in. Pick Best Move calls itself to determine what our opponent will do. When called to determine the best move for our opponent, Pick Best Move begins to determine all of the moves that our opponent could make at this point. For each one, it wants to know how its opponent (which is now us) would respond and thus again calls Pick Best Move for each possible move of our opponent to determine what our response to that move would (or should) be.
The program thus keeps calling itself, continuing to expand possible moves and countermoves in an ever expanding tree of possibilities. The next question is: Where does this all end? Let us start with an attempt to play perfect chess. We continue to expand the tree of possible moves and countermoves until each branch results in an end game. Each end game provides the answer win, tie, or lose. Thus, at the furthest point of expansion of moves and countermoves, some moves finally finish the garne. These final moves are the terminal leaves of our expanding tree of moves. Now, instead of continuing to call Pick Best Move, we can now begin returning from these calls. As we begin to return from all the nested Pick Best Move calls, we have determined the best move at each point (including the best move for our opponent), and so we can finally select the correct move for the current actual board situation.
The above procedure is guaranteed to play a perfect game of chess. This is because chess is a finite game. Interstingly, it is finite only because of the tie rule, which states that repetition of a move results in a tie. If it were not for the tie rule, then chess would be an infinite game (the tree of possible moves and countermoves could expand forever), and we could not be sure of determining the best move within a finite amount of time. Thus, this very simple recursive formula plays not just an excllent, but a perfect game of chess. The most complicated part of actually implementing the recursive formula is generating the allowable moves at each point. Doing so, however, requires only a straightforward codification of the rules. Playing a perfect game of chess is thus no more complicated than understanding the rules.
Trimming the tree
When shopping for services like car repair, the smart shopper is well advised to ask: How long will it take? The same question is quite appropriate in applying the recursive formula. Unfortunately, with respect to playing a perfect game of chess, I have computed the answer to be approximately 40 billion years. And that is just to make the first move!
Playing perfect chess might be considered intelligent behavior, but not at 40 billion years per move. Before we throw out the recursive formula, however, let us attempt to modify it to take into account our human patience (and mortality). Clearly, we need to put limits on how deep we allow the recursion to take place. How large we allow the move/countermove tree to grow should depend on how much computation we have available. In this way, we can use the recursive formula on any equipment, from a pocket computer to the largest supercomputer.
If we cannot expand each sequence of moves and countermoves to an end game, then we need some way of evaluating the desirability of a board short of winning, losing, or tying. As it turns out, simple methods such as counting the values of each piece (ten for the queen, five for a rook, etc.) do quite nicely. We thus end up with an algorithm that does indeed reduce to a simple formula and is capable of playing a very good game of chess, with reasonable (and in tournament play, acceptable) response time.
The next question is: How good a game? In order to make linear progress with the recursive formula, we need to increase the available computation exponentially (for each additional move or countermove that we want to look ahead, we need to multiply the available computation by approximately eight).
However, as I have pointed out several times in The Futurecast”, we are indeed making exponential progress in the power of our computers with the linear passing of time. And indeed, computers have been gaining in their championship ratings at a remarkably consistent 45 points each year. They are now rated at about 2600, exceeded by only a few dozen humans. The world champion has a rating of 2800, so it is only a matter of a few years before computers overtake all human play.
The recursive formula can be (and has been) applied to many other types of tasks. In addition to board games, the recursive paradigm has been successfully applied to solving a wide range of problems in mathematics, evaluating strategies in finance and business, and many other domains.
The three levels of intelligence
Is there a limit to the ability of this simple algorithm to emulate intelligence? In practical applications of the recursive paradigm to a variety of tasks, intelligent problems appear to divide into three levels or classes. At one extreme are problems that require little enough computation that they can be completely analyzed. Examples are tic-tac-toe and such word problems as cannibals and missionaries. In the middle are problems for which we cannot afford fun analysis but that appear to be successfully dealt with using a simple implementation of recursion. By “successfully dealt with” I mean that it is possible to match the performance of most (and in some cases all) humans. Chess is in this second class. Finally, there is a third class of problems that are not amenable to recursive solutions, at least not using simple procedures at the terminal leaf. In chess, when we stop expanding the tree of moves and countermoves, we can use a simple procedure of just counting up piece values (or some other simple method). If instead we need to perform a “deep” analysis of the quality of the board, then we would not be relying just on recursion, but on this deep analysis. Chess is a “level 2″ problem, so recursion alone does appear to be sufficient But level 3 problems require something more.
Interestingly, the Japanese board game of Go appears to require just this sort of deep analysis. The sequence of moves and countermoves to make meaningful progress in a game of Go is so long that the recursive method alone does not play a very good game. Certain realms of mathematics, as well as many practical problems, appear to require a means of making judgements beyond the recursive expansion of possibilities. So our next question becomes: Is there a simple formula to make these deep judgments for level 3 intelligent problems? As it turns out, there may be. These deep judgments can be considered pattern recognition judgments, the ability to recognize patterns in new situations based on what we have learned from exposure to previous patterns. It involves learning and the ability to apply learning to new problems. A paradigm called “neural nets” is another simple yet powerful organizing principle that appears to be capable of making intelligent judgements when given the opportunity to learn from many previous examples of problem solving. It is based on a model of how neurons, the basic computational engine of the human brain, perform computation and has shown great promise in making just the sort of deep pattern recognition-based judgments required for level 3 problems.
Interstingly, when humans play chess, we do not use the recursive paradigm to any great extent, but rely instead on our ability to recognize pattems learned from previous experience. Humans thus use a level 3 technique to solve a level 2 problem.
Reprinted with permission from Library Journal, September 1992. Copyright © 1992, Reed Elsevier, USA