Singularity and Rationality: Eliezer Yudkowsky speaks out

August 5, 2010 by Thomas McCabe

Research Fellow Eliezer Yudkowsky to present his theory of "Friendly A.I." at the upcoming Singularity Summit.

Eliezer Yudkowsky is a Research Fellow at the Singularity Institute for Artificial Intelligence and founder of the community blog Less Wrong. We discussed his upcoming talk at the Singularity Summit on August 15, his forthcoming book on human rationality, his theory of “friendly AI,” and the likelihood of the Singularity and how to achieve it.

What are you working on currently?

I’m working on a book on human rationality. I’ve got… let me see… 143,000 words written so far. There’s been a lot of progress lately in fields contributing to human rationality, and it hasn’t made its way down to the popular level yet, even though it seems like something that should be popularizable. The second part of the book is on how to actually change your mind, and all the various biases that have been discovered that prevent people from changing their minds. Also, with reference to the Singularity, we’ve discovered in practice that you can’t just sit down and explain Singularity-related things to people without giving them a lot of background material first, and this book hopes to provide some of that background material.

Singularity Irrationality

What’s the most irrational thing you’ve heard regarding the Singularity?

That’s sort of a fuzzy question, because as the word “Singularity” gets looser and looser, the stuff you hear about it gets more and more irrational and less and less relevant. For example, there are people who think that the invention of hallucinogens was a Singularity… I forget who exactly that was [Terence McKenna].

The Singularity Institute once received an email saying, “This entire site is the biggest load of navel gazing stupidity I have ever seen. You are so naive, and clueless as to the inherent evil that lurks forever. A machine is no match for Satan.” I don’t know if that counts as the *most* irrational thing people have said about the Singularity, but…

In terms of what the public accepts as the Singularity, I think the silliest thing may be the more naive sort of conservatism: “Well, people are still walking around in their biological bodies even after there are superintelligences around, and they’re just sort of being cool and futuristic, but it hasn’t completely shattered life as we know it.” I think that’s a failure to understand superintelligence as something that becomes real and will have a real effect on the world.

A Theory of Fun

What are you going to be covering in your talk at the Singularity Summit?

“Positive Futurism” is the tentative subject… roughly, the complex of ideas associated with technology and the Singularity having a real impact on things that are often left out of popular discussions. I’ll possibly cover a bit of Fun Theory, for instance: the questions of “What do we actually do all day, if things turn out well?,” “How much fun is there in the universe?,” “Will we ever run out of fun?,” “Are we having fun yet?” and “Could we be having more fun?” To answer questions like that, obviously, you need a Theory of Fun.

For example, in Greg Egan’s book Permutation City, there’s a section in which one of the characters, having achieved immortality and having gone on for some number of millennia, has run out of interesting things to do, and has modified his brain to make himself extremely interested in carving table legs. The question is: Is this what actually happens to you if you achieve immortality? Because, if that’s as good as it gets, then the people who go around asking “what’s the point?” are quite possibly correct.

What’s the alternative to that?

To answer that, you have to ask: How large is the space of novelty, of new things that you can do? For you to start carving table legs, you’d have to run out of things that were more interesting to do than carving table legs. The question is: can we always create new activities that are not only pleasant, but actually convey new ideas? For example, the first time that you solve a Rubik’s Cube, you’ll learn new concepts, like composing several moves into an operator. If you then encountered a different version of the Cube that was 4×4 or four-dimensional, it wouldn’t necessarily teach you new ideas that were as deep as the idea of composing several moves together. So once you encounter the Rubik’s Cube, is that it? Is that the high point of your life? I have a heuristic argument against that: by looking at the transition from chimps to humans, we can see that a small increase in brain size caused a much larger increase in Fun Space. From this, I argue that, while we may eventually exhaust the energy in the universe, we aren’t going to run out of fun before we run out of energy.
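For readers who want to see what “composing several moves into an operator” looks like concretely, here is a minimal sketch, not from the interview: a hypothetical four-position toy puzzle stands in for the Cube, each move is a permutation of positions, and chaining two moves yields a single reusable operator.

```python
# Purely illustrative: a toy "puzzle" with four positions, not an actual Rubik's Cube.
# A move is a permutation: new_state[i] takes the piece from old_state[move[i]].

def apply_move(state, move):
    """Apply a single move (permutation) to a puzzle state."""
    return [state[move[i]] for i in range(len(state))]

def compose(first, second):
    """Return one operator equivalent to doing `first`, then `second`."""
    return [first[second[i]] for i in range(len(first))]

move_a = [1, 0, 2, 3]   # swap positions 0 and 1
move_b = [0, 2, 1, 3]   # swap positions 1 and 2

operator = compose(move_a, move_b)   # a new, reusable operator: a 3-cycle
state = ["red", "green", "blue", "white"]

# Applying the composite once matches applying the two moves in sequence.
assert apply_move(state, operator) == apply_move(apply_move(state, move_a), move_b)
print(operator)   # [1, 2, 0, 3]
```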

The Limits of Human Intelligence

What is your estimate of the probability of reaching the Singularity?

If we define “Singularity” as simply the creation of smarter-than-human intelligence, I think I will go along with the cowards and refuse to put a probability estimate on that sort of thing. It looks like the Singularity is survivable — that if you know what you’re doing and design the AI correctly, you win. I don’t yet have reason to believe that the problem is humanly unsolvable. The difficult part is solving it in the next few decades. If we could just try over and over until we got it right, and we had 200 years to work with, I’m very confident we could get it in that time. But the real world isn’t like that; in the real world, if you get it wrong on the first try, you die. And if you take too long, someone else just goes ahead and builds an AI. That’s what makes the problem difficult; the good guys have to get it right on the first try, and there are various other people tackling the intrinsically easier problem of building an AI regardless of whether it’s safe.

It seems that you’re talking specifically about the scenario where unenhanced, modern-day humans solve the “friendly AI” problem directly. What probability would you put on humans enhancing our own intelligence through biotechnology or some other means, and then solving the FAI problem with enhanced brainpower?

That’s a lot harder to figure out. We haven’t yet seen many cases in which little, unregulated bits of research get somewhere, find a market for themselves, and successfully make a profit. And it doesn’t seem like there’s much underground medical research going on. For these reasons, I suspect that human intelligence enhancement isn’t going to show up and push people beyond the current smartest existing humans before the AI issue resolves itself one way or the other. We might find something that lets you go from IQ 100 to 110, but pushing beyond the limits of the smartest existing humans is a lot more difficult.

Very smart people are already operating near the boundaries of the human design space, and if you try to push them out any further, their brains might just fall apart entirely. There might be modifications we can engineer that evolution couldn’t, and the people working on AI might have some of those modifications. But I doubt they’d be beyond human in any interesting sense. From my perspective, I just tend to focus on recruiting the smartest existing people, because of the large amount of capital required to develop human intelligence enhancement technology.

Collective Intelligences

What about the option of building a collective intelligence, and enhancing it by better methods of communication, for example?

I’m skeptical of that. The single greatest success story of collective intelligence that I know of was Kasparov versus the World, where Kasparov played one side of a chess game, the other side was played by a collective intelligence, and Kasparov won. That collective intelligence was made up of 50,000 people who could use whatever sort of computer programs they wanted, and a single person still beat them.

There are good theoretical reasons to think that, within a single brain, you can have complete bandwidth: everything is connected to everything else. When people communicate with each other, they’re communicating at a much lower bandwidth. People don’t add. A human has four times the brain size of a chimp, but four chimps don’t equal one human.

The sort of scenario we’re talking about is something like Enlightenment version 2.0, a new kind of civilization, instead of something like the jump from chimps to humans. We could use rationality to, as you put it, “raise the sanity waterline” of civilization, and then this new, more rational civilization would be capable enough to actually tackle the Friendly AI problem and succeed.

Well, one of the reasons I’m writing this book on rationality is so people can realize things like, “You can use a bunch of positive phrases to describe something, but does it actually work in practice?” You can use phrases like “Enlightenment 2.0” and “democracy,” but what it would work out to in practice is that we might have a civilization which contains slightly fewer people who are smart enough to build AI and dumb enough to try to do so. We could have people giving them disapproving stares, saying “How could you possibly do something that stupid, you idiot?” You could have less venture capital for projects that are going to destroy the world, and you could have more resources and more candidates to work on the Friendly AI problem, but in the end, the problem itself would still be solved by a small team of professional specialists.

The idea isn’t that the “power of democracy” or the “folksonomy” would somehow solve the friendly AI problem, but that, if our civilization were more sane, we’d be able to throw 5,000 people at the problem, and have the team of 5,000 actually be effective and not just get in each other’s way.

I don’t know that there’s enough room on the problem for five thousand people to work on it.

So your example of a chess game is a relatively constrained problem. When the problem is more varied, the work can be split up into subsets that different people can handle more efficiently.

Right, but Artificial Intelligence isn’t a case where we already know what we need to do, and there’s just a bunch of niggling problems that need to be solved. The problem is understanding what to solve — there are basic issues that we don’t understand with intelligence. And I’m not really familiar with many cases in history where basic problems have been solved by splitting them up into pieces and farming them out to thousands of people. Folding@Home works great for protein folding, where you understand what a protein is and the question is how to fold it. You can’t have people working in parallel on problems that you don’t know how to split into pieces.

There’s an awful lot of futurism out there, and as you add more and more details to a story about the future, it sounds more and more plausible while actually becoming less and less probable; that’s the conjunction fallacy.
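To make the arithmetic behind that last point concrete, here is a minimal sketch; the scenario and every number in it are hypothetical, chosen only for illustration. By the chain rule of probability, each added detail multiplies in a factor of at most one, so a fully detailed story can never be more probable than the bare claim it elaborates.

```python
# Hypothetical, illustrative numbers only -- not estimates from the interview.
# Each added detail is a conditional probability given the claims before it,
# so the joint probability of the whole story is the product of these factors.

p_bare_claim = 0.5          # P(A): smarter-than-human AI is built this century
p_detail_1_given_a = 0.4    # P(B | A): it is built by a small private team
p_detail_2_given_ab = 0.3   # P(C | A, B): it is first applied to medicine

p_detailed_story = p_bare_claim * p_detail_1_given_a * p_detail_2_given_ab

print(p_bare_claim)      # 0.5
print(p_detailed_story)  # 0.06 -- richer-sounding, but far less probable
```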