WHEN THINGS START TO THINK | Chapter 11: The Nature of Computation


Originally published by Henry Holt and Company 1999. Published on KurzweilAI.net May 15, 2003.

HOW . . . will things that think be developed? By taking advantage of what nature already knows how to do. Through research that maximizes contact, not constraints. With education that happens as it is needed, rather than in advance. By interconnecting systems of people and things to solve hard problems.

Information technology can be thought of as a pyramid. At the top are the supercomputers, the biggest, fastest machines, bought by the biggest, richest institutions. Then come the large mainframes that run companies. After that are the servers for a department, high-powered workstations for demanding applications, and then the common PC. As the cost goes down, the number of these things in use goes up.

Below the PC are emerging devices that promise to be used in still larger numbers.

  • Supercomputer – $10,000,000
  • Mainframe – $1,000,000
  • Server – $100,000
  • Workstation – $10,000
  • Personal Computer – $1,000
  • PDA – $100
  • Watch – $10
  • Smart Card – $1
  • RFID Chip – $0.10
  • ????? – $0.01

A few hundred dollars buys a Personal Digital Assistant, a pocket-sized computer handy for mobile information access. For tens of dollars, the wristwatch has for many years been the one processor and display that people carry around with them. It is now acquiring new capabilities, such as a pager, or a GPS receiver, or the Swatch Access watches used as electronic ski-lift tickets. Europe is also leading the way with smart cards, computers the size of a credit card that are used much the same way, but that store electronic cash instead of requiring access to a remote computer. They were first used to eliminate the need for change to use pay phones, and are now spreading to all areas of electronic commerce. Finally, for tens of cents come the RFID chips that are used to tag things with an electronic identity. These are used in the badges that control access to a workplace, and in tracking laboratory animals. And that’s where it stops. No one’s been able to fabricate, package, and sell chips for less than about ten cents. While that might not appear to be a severe limit, think about what it leaves off: just about everything!

One of the most frequent challenges presented to me is finding a way to compute for pennies. Companies have a need to track the identity, location, and environment of the things they make, but in competitive markets they can’t afford to add more than a few cents per item. While it would be nice to be able to walk out of a supermarket without waiting in line because the groceries come tagged with chips that can automatically be scanned by the door, or have your cupboard at home contact the grocery store when it needs restocking, these things won’t happen if the silicon chips add more than a few cents to the cost of the potato chips. A business card that can call up a business’s Web page would be convenient, but there wouldn’t be a business left if its cards cost more than a few cents.

This is such an intensely commercial question that it becomes an interesting scientific one. It’s not possible to reach one-cent computers just by cutting corners in the fabrication of conventional chips; the whole industry has been trying unsuccessfully to do that for many years. The only way to get there is to be smarter about taking advantage of what nature already knows how to do.

A clue is provided by the cheapest digital device of all, the shoplifting tag. This is a one-cent, one-bit, remotely read memory device, storing the status of a product (purchased vs. stolen). Many of these, such as the white rectangular strip that you find on a compact disk, are made by a company called Sensormatic. These contain a metallic glass, a thin sheet of a metal alloy that was cooled so quickly when it formed that the atoms were frozen in a random arrangement. It looks like a piece of aluminum foil, but when it is exposed to an oscillating magnetic field it flexes slightly and then rings much like a tuning fork. The resulting oscillation is much too fast to be audible, but it in turn creates an oscillating magnetic field that can be detected. The portals around the door at a CD store contain the coils for exciting and detecting these oscillations.

My student Rich Fletcher was working with Sensormatic on squeezing more functionality into these materials by trimming them to varying lengths so that they would oscillate at different frequencies, and sensing the motion of magnets near them by how they shift the frequency. This let him store a few bits in the tag, and use it as a passive wireless user-interface controller. By putting the coils in a table instead of a door, Rich could create a work space that could distinguish among objects brought in, and detect how they were moved. Doing that isn’t remarkable, but doing it for a penny is. At that cost this capability could be put into most anything, suggesting that smart materials might be able to replace dumb chips. Indeed, Rich has since found many other materials that can be remotely sensed to determine their identity, track their position, and detect properties such as pressure and temperature.

This approach rapidly runs into a limit on the amount of information that can be stored in a tag, because there are not enough distinguishable frequencies. Very little data is contained in the call sign of a radio station; the interesting part is the signal the station broadcasts. For a smart material to be able to send out a more complex signal it needs to be nonlinear. If you hit a tuning fork twice as hard it will ring twice as loud but still at the same frequency. That’s a linear response. If you hit a person twice as hard they’re unlikely just to shout twice as loud. That property lets you learn more about the person than the tuning fork.

I set out to find a nonlinearity that we could use to store more data in smart material tags. To keep the cost down to a penny it had to be a natural property of the material rather than something that would require elaborate fabrication. An intriguing candidate is the interaction between the atoms in a material. Chemists study these with the technique of nuclear magnetic resonance (NMR). Magnetic resonance imaging (MRI) uses exactly the same mechanism, minus the word “nuclear,” which might scare patients.

NMR is performed in much the same way as shoplifting tags are detected, applying an oscillating magnetic field and then listening for the resulting oscillating field. In the tag the oscillation came from a mechanical vibration of the material; in NMR the oscillation arises from the nuclei of the atoms. In a magnetic field these precess like tiny spinning tops. The rate at which they precess depends on the orientation of neighboring nuclei, spinning faster if they point in the same direction, for example. By probing these neighboring interactions with NMR, chemists are able to deduce the structure of molecules. MRI applies a magnetic field that varies over space, so that the oscillation frequency of a nucleus becomes a function of its position.

The more I learned about NMR, the more puzzled I became. It usually is performed by putting a tube full of a liquid in a strong magnetic field. A coil wrapped around the tube sends in an oscillating magnetic field to make the nuclei precess, and then detects the resulting magnetic field from the response of the nuclei. The chemists analyze NMR by using the language of quantum mechanics. Quantum mechanics describes the bizarre behavior of very small things, not a macroscopic test tube. Quantum mechanically there is nothing preventing this book from splitting into ten copies and leaping to the moon, but the interactions with you and the rest of the world force it to remain sensibly classical. To understand the paradox of how an apparently classical test tube can behave quantum mechanically, I started lugging around a distinctly macroscopic NMR bible that chemists use, a book by Ernst.

I was not getting anywhere until I was forced to do jury duty. For months I had been trying to avoid serving, but I ran out of excuses and had to clear a few days and report to the Cambridge courthouse. There I was locked in the jury room for a day with nothing to do but read Ernst. Freed from all my usual distractions, I finally was able to understand just how elegant the apparently impenetrable and self-referential chemists’ language is.

In NMR each molecule in the liquid behaves as an independent but identical quantum system. The experiments treat the liquid as a single effective molecule that in reality is represented by an enormous number of redundant copies. All of the classical interactions needed to hold and interact with the liquid do ruin some of those copies, but there are so many of them that the quantum mechanical properties of the rest can persist for many seconds, plenty of time to perform an experiment. This means that NMR provides a convenient way to access a nonlinear quantum mechanical system.

Those attributes happen to be exactly what’s needed for the future of high-performance computing. The search for cheap chips was leading to something much more. We’ve come to expect that every year or so our computers will double in speed and capacity. This trend was first noticed by Gordon Moore of Intel in 1965, and has continued for the last three decades. It’s not a law of nature; it’s an observation about remarkable engineering progress. From the very beginning it was clear where this was heading. Each year it appears that it will be hard to continue improving performance, and each year some new tricks are found to squeeze out the next round. But some time around 2020 or so everything will hit bottom. At the rate of current progress, the wires will be one atom wide, the memory cells will have one electron, and the fab plant will cost the GNP of the planet so that no one can afford to build it anyway. Further improvements cannot come from the current path of shrinking silicon circuits.

In part, this doesn’t matter, because most of the current constraints on using a computer have nothing to do with the speed of the processor; they come from the user interface, or the power requirement, or the software. A few of the billion dollars spent on building chip fabs could profitably be used elsewhere. But in part it does matter, because there are some hard problems that always demand faster computers.

Finding prime factors, for example, is the mathematical problem that underpins the cryptography used for electronic commerce. The difficulty of factoring is what lets you send a credit card number from a Web browser without worrying about an eavesdropper obtaining it. A desktop computer can multiply together the factors of a four-hundred-digit number in an instant, but finding them would take today’s fastest computer the age of the universe. That is because factoring is an exponential algorithm, meaning that doubling the size of the number to be factored squares instead of doubles the time needed to do it. And beyond factoring, it’s also true that builders of each generation of computers have confidently predicted that they were sufficient for most applications, only to find entirely new uses for new machines that weren’t anticipated by extrapolating the old ones. I don’t feel limited because my desktop computer isn’t one thousand times faster, but that reflects a poverty of my imagination as much as an observation about what I think I need a computer for.
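The asymmetry between multiplying and factoring is easy to see in a few lines of code. This toy sketch uses small primes as stand-ins for the four-hundred-digit keys in the text; the numbers and the naive trial-division method are purely illustrative:

```python
import time

def naive_factor(n):
    """Trial division: the work grows exponentially with the number of digits."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 1
    return n, 1

p, q = 999_983, 1_000_003            # two known primes (tiny by crypto standards)
n = p * q                            # multiplication: essentially instant

t0 = time.perf_counter()
found = naive_factor(n)              # recovering the factors takes real work
elapsed = time.perf_counter() - t0

assert found == (p, q)
# Each extra digit in n multiplies this method's work by about sqrt(10),
# so a four-hundred-digit n is utterly out of reach for trial division.
```

Real factoring algorithms do better than trial division, but as the text says, all known classical methods still blow up far too fast for numbers of cryptographic size.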

There aren’t many places to turn for alternatives to speed up computing. Light travels a foot in a billionth of a second, the cycle time of today’s fastest chips. This means that they cannot be clocked much faster, because there isn’t time for the signals to go anywhere.

Another option is to keep the clock speed but use more computers in parallel. Not only would this require an annual doubling of the number of computers to keep pace with Moore’s law, parallelism alone does not scale very well. Scientists have managed to perform logical functions with DNA molecules, turning a tube full of DNA into an enormous number of computers, about 10^23 of them. That’s a lot. Because they communicate with each other slowly, the easiest way to use them is like a lottery, preparing all possible answers to a problem and then checking to see which molecule has the winning solution. But since the number of possible factors of a cryptographic key is exponential in the size of the key, this means that 10^23 computers could simultaneously check for the factors of just a twenty-three-digit number.

There’s only one thing in our universe that is exponentially big but is not used for computing: quantum mechanics. A classical bit can be either a 0 or a 1. A bit governed by the laws of quantum mechanics can simultaneously be a 0 and a 1. To describe it you need to specify the probability that you’ll find a 0 if you measure it, and the probability that you’ll find a 1. The state of a classical computer with N bits is specified by giving those bits, but for a quantum computer it requires giving 2^N coefficients, one for each possible combination of the bits. For just fifty quantum bits, 2^50 ≈ 10^15, storing more information than the largest classical computers. Because a quantum computer could perform all possible calculations at the same time, it might be the key to building more powerful computers.
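The bookkeeping cost is easy to check directly. This small sketch just counts the amplitudes that a straightforward classical simulation of a quantum register would have to store:

```python
# A classical n-bit register is described by n bit values; a quantum one
# needs one amplitude for every possible combination of those bits.
def amplitudes_needed(n_bits):
    return 2 ** n_bits

for n in (1, 2, 10, 50):
    print(n, amplitudes_needed(n))

# Fifty quantum bits already need about 10^15 coefficients to describe.
assert amplitudes_needed(50) == 2 ** 50
assert amplitudes_needed(50) > 10 ** 15
```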

This question was first asked in the early 1980s by Richard Feynman at Caltech, Paul Benioff at Argonne National Laboratory, and David Deutsch at Oxford University. Feynman was struck by the recognition that it takes a classical computer an exponential amount of time to model a quantum system, because it has to keep track of all of the possible states of the system in parallel. He wondered if a computer governed by the laws of quantum mechanics might be able to efficiently model other quantum systems. Feynman’s conjecture, recently supported by Seth Lloyd at MIT, had broader implications than he realized.

The study of quantum computing remained a theoretical curiosity through the 1980s, pursued by an unusual collection of people who had to know a great deal about both physics and computer science, and also be slightly strange. Meetings of this community were a delight because the people involved were so unusual, and the discussion so irrelevant to any practical concerns. That began to change in the early 1990s. A series of results were proved by David Deutsch, Richard Jozsa of Plymouth University, and Dan Simon, now at Microsoft, showing that a quantum computer is more powerful than a classical computer for a series of increasingly less trivial problems. In 1994 Peter Shor of AT&T was able to use these techniques to show that a quantum computer could find prime factors in polynomial instead of exponential time. This made factoring a four-hundred-digit number almost as easy as multiplying its factors.

These results drew on a second aspect of quantum mechanics that is even more bizarre than the ability to be in many states simultaneously. If a quantum bit is in a superposition of 0 and 1, and it interacts with a second bit, the value of the second bit will depend on the state of the first. If they’re now separated to opposite ends of the universe and the first bit is measured, it is forced to decide between being a 0 or a 1. At that instant the value of the second bit is determined. This appears to be an instantaneous action at a distance, something that made Einstein very unhappy. It is called entanglement, and it serves in effect to wire together the bits in a quantum computer. Without entanglement a quantum computer would have the same problem as a DNA computer, trying many answers at the same time and then having to locate the correct one, like trying to find a needle in a haystack. With entanglement, a single quantum computer can be certain to solve a factoring problem.

Naturally, the three-letter agencies (NSA, CIA, . . .) panicked. Here was a very real threat to information security, coming from an entirely unexpected quarter. Since by that time the result was already widely known they couldn’t cover it up, but they could try to keep ahead of the competition. So they started showing up at meetings, in effect offering a purchase order to anyone who would build them a quantum computer. They started to relax when no one serious would take one. In the next year many people realized just how hard it was going to be to build a practical quantum computer. It would need to be carefully isolated from the rest of the world so that it could stay quantum, but would need to be accessible to the outside world so that the initial conditions could be loaded in, the computation run forward, and the answer read out. Papers started to appear saying that quantum computers could not be built.

Then, in 1996, two things happened to scare the three-letter agencies again. The first was the discovery by Peter Shor, Charles Bennett of IBM, and Andy Steane of Oxford, that quantum error correction is possible. Just as classical computers add extra bits to memories to catch mistakes, they realized that quantum computers can use extra quantum bits to fix their errors. This means that an imperfect quantum computer could perform a perfect computation, as long as it is just good enough to be able to implement the error correction.

Here’s where NMR comes in. After my day of jury duty I understood that it might be possible to solve the paradox of connecting the classical and quantum worlds with the chemist’s trick. Instead of working with a single computer, if a tube full of quantum computers in the form of a liquid sample in an NMR apparatus was used, then all of the external interactions might ruin a few of them, but since each effective quantum bit would be represented in an enormous number of molecules it would retain its quantum coherence for a much longer time. For this idea to work it would have to use natural molecules, since it’s not feasible to individually fabricate that many computers. So the question then became whether a molecule could compute.

Classically, a digital computer must be able to flip a bit (the NOT gate), and perform a logical operation on two bits (such as the AND gate). Any other logical operation can be assembled from these components. Similarly, a quantum computer must be able to rotate a quantum bit to any angle between 0 and 1, and must be able to perform a nonlinear operation between two bits. These are easy to do with NMR. The precession of a nucleus in a magnetic field is exactly the kind of rotation that is needed, and the interaction between the nuclei provides the nonlinearity.
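As a toy state-vector sketch (pure arithmetic, not the NMR physics), the two required operations might look like this, with a controlled-NOT standing in for the two-bit interaction:

```python
import math

def rotate(state, theta):
    """Rotate a single qubit's amplitudes (a0, a1) by the angle theta."""
    a0, a1 = state
    c, s = math.cos(theta), math.sin(theta)
    return (c * a0 - s * a1, s * a0 + c * a1)

def cnot(state):
    """Two-qubit gate on amplitudes (a00, a01, a10, a11):
    flip the second bit whenever the first bit is 1."""
    a00, a01, a10, a11 = state
    return (a00, a01, a11, a10)

# A quarter rotation takes a definite 0 to an equal superposition of 0 and 1.
half = rotate((1.0, 0.0), math.pi / 4)
print(half)            # both amplitudes near 0.707
```

In NMR, the rotation is implemented by an applied radio-frequency pulse and the two-bit gate by letting the coupled nuclei interact for a controlled time; the matrices above are only the abstract bookkeeping.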

I arrived for a visit at the Institute for Theoretical Physics in Santa Barbara, armed with this knowledge of how to represent quantum information in what appears to be a classical system, and how to apply magnetic fields to manipulate the quantum information. On the first day I stopped by to see Isaac Chuang (now at IBM), and emerged from his office a week later. He had just finished working out how to reduce a quantum computer to a series of equivalent operations. The operations that he needed and that I knew how to do matched up perfectly.

By the end of the week we had worked out the details of programming a molecular quantum computer. We had an eerie feeling that we weren’t really discovering anything, just lifting a curtain to reveal a beautiful structure that had been there all along, waiting for someone to notice it.

A liquid is in fact the most perfect system that could be built for quantum computing. It’s easy to obtain in huge quantities, the molecules all come perfectly assembled without any manufacturing imperfections, and the cloud of electrons around the nucleus and the rapid tumbling of the molecules in the liquid protect the quantum coherence of the nuclei. Once we understood the connection between liquid NMR and quantum computing we didn’t even need to do a demonstration experiment, because the chemists had been doing them for years. Their molecular studies were showing all of the working elements of a quantum computer, without their realizing that’s what they were doing.

Richard Feynman gave a seminal talk in 1959 entitled “There’s Plenty of Room at the Bottom.” He foresaw the problems coming in shrinking computers, and wondered if we could jump to making molecular computers. This inspired the field of nanotechnology, which sought to do just that. Unfortunately, the study of nanotechnology has produced thoughtful analyses and compelling visions, but very little in the way of working machines. It now turns out that ordinary molecules have known all along how to compute—we just weren’t asking them the right questions.

This was one of those ideas that was ready to be thought. An MIT/Harvard group found a related way to manipulate quantum information in a liquid, one at Los Alamos found still another, and many groups around the world are now using these techniques to make small quantum computers. The first experimental computation we did that required fewer steps than on a classical computer was a search problem, using the molecule chloroform. Given no advance knowledge, unlocking a classical padlock with N settings requires about N/2 attempts. Lov Grover at Bell Labs has shown that this can be done in √N steps on a quantum computer. Our chloroform computer had four states; our quantum search was able to find in a single step the answer that took a few tries classically.
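The four-state search can be simulated in a few lines. This sketch mimics the quantum amplitudes with ordinary floating-point numbers (the marked "padlock setting" is an arbitrary choice); one oracle call suffices, versus an average of about two classical tries:

```python
import math

N = 4
target = 2                            # the marked state, chosen for the demo

# Start in the uniform superposition over all N states.
state = [1 / math.sqrt(N)] * N

# Oracle: flip the sign of the marked state's amplitude.
state[target] = -state[target]

# Diffusion step: reflect every amplitude about the mean amplitude.
mean = sum(state) / N
state = [2 * mean - a for a in state]

probs = [round(a * a, 6) for a in state]
print(probs)                          # [0.0, 0.0, 1.0, 0.0]
assert abs(state[target] ** 2 - 1.0) < 1e-9
```

For N = 4 a single Grover iteration lands on the answer with certainty, which is why the chloroform experiment could find in one step what takes a few classical tries.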

Daunting challenges remain if these demonstration experiments are ever to grow up to exceed the performance of today’s fastest classical computers, although these are increasingly technological rather than conceptual problems. The most severe limit is a rapid decrease in sensitivity for larger molecules, but this might be solved by using lasers to align the nuclei. The quantum coherence lasts for thousands of operations, approaching the limit needed to be able to use quantum error correction. This also needs a technique such as the optical nuclear alignment, to prepare the extra quantum bits used in the error correction. Finally, as the size of a molecule is increased, the interactions between nuclei at the far ends of a molecule become too weak to use for computing, but Seth Lloyd has shown that a polymer with a simple repeating atomic pattern is almost as useful and requires that the nuclei interact only with their nearest neighbors.

These scaling constraints offer a glimpse of an ideal molecule for quantum computing. It should have a regular structure, so that Seth’s technique can be used. Better still, the pattern should be 2D or 3D rather than a linear chain, to reduce the time to pass messages around the molecule. It should have a rigid configuration, so that all of the local atomic environments are identical, and it should have a special structure at the end that can be used to load in data and read out results. This sounds disturbingly like a microtubule.

Microtubules are a kind of molecular scaffolding in neurons. While there’s no compelling reason to believe that quantum mechanics has anything to do with consciousness, those who do think that it does have come to identify microtubules as the seat of quantum consciousness. I couldn’t begin to guess how the brain could perform the kinds of molecular manipulations that we do in NMR, although it is amusing to observe that a head in an MRI magnet is a perfectly good medium in which to perform a quantum computation.

The brain may not be the best place to look for guidance in how to build a quantum computer, but it does have a valuable lesson to teach us about how to build more powerful computers. Moore’s law demands exponential improvements in performance. Quantum mechanics promises one way to do that; analog circuitry offers another.

The neurons in a brain are about a million times slower than the transistors in a computer, yet they can instantly perform tasks that tax the fastest computers. There are many more neurons in the brain than transistors in a computer, about 10^11 of them, and even more connections among them, roughly 10^15 synapses. These synapses are the key.

Many computing problems reduce to finding good ways to approximate complicated functions. For example, for a computer to recognize faces it must have a function that can take as an input the pixels in an image and provide as an output the name associated with the face. There are many techniques for doing this, but they all reduce in the end to some kind of function that relates inputs to outputs. In traditional engineering practice unknown functions are represented by using some family of basis functions, such as a collection of polynomial powers or trigonometric functions, combined with a set of weighting coefficients. There are relatively straightforward techniques for finding the best coefficients to match a given set of data.
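A minimal instance of this traditional scheme is fitting a straight line (one basis function plus a constant) to data with the closed-form least-squares rule; the data points here are made up for the demonstration:

```python
# Fit y = a + b*x to data: the coefficients a, b combine fixed basis
# functions (the constant 1 and the input x), so the best values have
# a direct closed-form solution.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]            # roughly y = 1 + 2x, with noise

n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
    / sum((x - mx) ** 2 for x in xs)
a = my - b * mx
print(a, b)                          # close to 1 and 2
```

Because the unknowns appear linearly, no search is needed; the contrast with the brain's arrangement is the point of the next paragraph.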

This isn’t how the brain does it. In the brain the coefficients correspond to the synaptic connection strengths, and the basis functions are the S-shaped response of the neurons. The brain uses the coefficients in forming the arguments of the basis functions, rather than using them just to combine the results. Engineers have historically stayed away from doing this, because it can be shown that there’s no longer a practical procedure to find the best values for the coefficients. All that can be done is some kind of search that learns reasonable ones.

The study of neural networks was inspired by the goal of emulating the organization of the brain. While the brain is still far too complex to model in any detail, simple mathematical networks that put the unknown coefficients before the nonlinear basis functions were surprisingly successful. It is now understood why.

A characteristic feature of difficult recognition problems is that there may be a huge number of inputs, such as all of the pixels in an image. In a traditional model, one coefficient is needed for each possible combination of inputs and basis functions. This rapidly explodes, so that if ten are needed for one input, one hundred are needed for two, one thousand for three, and so forth. This is called the “curse of dimensionality.” A neural network model is much more flexible. The adjustable coefficients inside the nonlinear functions are far more powerful; the number required is a polynomial function instead of an exponential function of the number of inputs.
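The counting argument is simple enough to tabulate. The per-input figure of ten and the ten hidden units below are just the chapter's illustrative numbers, not a recipe:

```python
# Coefficients needed to approximate a function of d inputs.
def tensor_product_coeffs(d, per_input=10):
    # Traditional scheme: one coefficient per combination of basis functions.
    return per_input ** d                  # exponential in d

def neural_net_weights(d, hidden=10):
    # Simple one-hidden-layer network: weights into and out of the hidden units.
    return hidden * d + hidden             # linear in d

for d in (1, 2, 3, 10):
    print(d, tensor_product_coeffs(d), neural_net_weights(d))
```

By ten inputs the traditional count is ten billion while the network's is about a hundred, which is the whole force of the "curse" and the escape from it.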

Even so, there’s still the problem of finding their values. A second result comes to the rescue here. It turns out that these models are so flexible, almost anywhere they start out there is a good solution nearby. A search procedure does not need to find the single best solution, just an acceptable one, and there is a huge number of them. This is ensured by using a large number of neurons, far more than would be considered sane in a conventional model. This property has come to be called the “blessing of dimensionality.” It helps explain why learning is so essential to how the brain functions.

So here is the second exponential insight: use models that require learning the values of coefficients that appear inside nonlinear functions. While this is not applicable to all computational problems, it works for just the sorts of things that people do well, such as recognizing patterns and prioritizing tasks. A digital computer can simulate this kind of architecture, but it’s a misplaced application of the digital precision. Simpler analog circuits can do the same thing with far fewer parts.

Analog circuits were used in the early days of computing, but they fell out of favor because they accumulate errors. Digital computers can correct their mistakes and hence perform arbitrarily long computations. However, this is not an essential distinction between analog and digital systems. Ultimately everything is analog; the apparently digital values are a consequence of carefully designed analog circuits. We’re now learning how to marry the most desirable features of both worlds, correcting analog errors while using the analog flexibility to squeeze much more performance out of a simple physical system.

For example, Geoff Grinstein of IBM and I studied what are called spread-spectrum coding techniques, which are used to intentionally make a signal appear to be more random than it is. While this may appear to be a perverse aim, it turns out that almost every property of engineered systems is improved when they work with effectively random signals (more people can share a communications channel with less interference, more data can be stored on a disk, and so forth). There is a beautiful theory for how to generate bits for use in spread spectrum that appear to be completely random but in fact are entirely predictable. A transmitter uses such a sequence to spread its signal, and the receiver then uses an identical copy to recover the message (this idea can be traced back to the actress Hedy Lamarr and the composer George Antheil using piano rolls in World War II). This is what you hear when your modem hisses. The catch is that if the receiver is remote from the transmitter, it has a difficult computing job to do to figure out the spreading sequence that the transmitter used. This is part of what a GPS receiver must do to lock onto a satellite and find its position.
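One classic way to make such predictable-but-random-looking bits is a linear feedback shift register. This sketch is only an illustration of the principle (the register size, tap positions, and message are choices for the demo, not the actual system); a receiver that knows the seed undoes the spreading exactly:

```python
def lfsr_bits(seed, taps=(4, 3), nbits=4, count=15):
    """Generate bits from a 4-bit linear feedback shift register.
    With taps at positions 4 and 3 the sequence looks random but
    repeats only after 15 steps, and is fully determined by the seed."""
    state, out = seed, []
    for _ in range(count):
        bit = 0
        for t in taps:
            bit ^= (state >> (t - 1)) & 1
        state = ((state << 1) | bit) & ((1 << nbits) - 1)
        out.append(bit)
    return out

message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1]
chips = lfsr_bits(seed=0b1001)
sent = [m ^ c for m, c in zip(message, chips)]               # spread by XOR
recovered = [s ^ c for s, c in zip(sent, lfsr_bits(seed=0b1001))]
assert recovered == message           # same seed, same sequence, message back
```

A receiver that does not know where the transmitter is in this sequence has to search for the right alignment, which is the acquisition problem the next paragraph takes up.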

Two copies of almost any analog system, if allowed to interact in almost any way, have a very interesting property: they synchronize. This used to be seen in the simultaneous ticking of mechanical clocks on the wall of a clock shop. Geoff and I wondered if we could use this property to solve the problem of spread-spectrum acquisition. Much to our pleasant surprise, we discovered that we could design simple continuous systems that exactly matched the response of the familiar digital ones for digital signals, but if they started in the wrong state they could use their analog freedom to synchronize onto the digital signal. Instead of splitting the job into an analog detector and then a digital search program, as is done now, a physical analog part can do everything.
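The effect is easy to reproduce numerically with two copies of a nonlinear map, one nudged toward the other each step. The logistic map and the coupling strength here are illustrative stand-ins, not our spread-spectrum circuit:

```python
# Two copies of the same nonlinear system: y is weakly driven by x,
# and the two fall into lockstep from very different starting states.
def step(x, y, k=0.8, r=3.6):
    fx = r * x * (1 - x)                 # the "transmitter" system
    fy = r * y * (1 - y)                 # the "receiver" copy
    return fx, fy + k * (fx - fy)        # coupling pulls y toward x

x, y = 0.3, 0.9
for _ in range(200):
    x, y = step(x, y)

print(abs(x - y))                        # effectively zero: synchronized
assert abs(x - y) < 1e-9
```

With this coupling each step shrinks the difference between the two systems by a constant factor, so after a couple of hundred steps the receiver copy is tracking the transmitter exactly.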

The digital world has gone on refining Turing’s machine, when even Turing understood its limits. One more area that he made a seminal contribution to was explaining how biology can form patterns by using chemical reactions. This useful behavior is easy to find in nature but hard to model on a digital computer. Now that we’re approaching the end of the era of rapid increases in the performance of digital computers, like Turing we have no choice but to pay more attention to information processing in the analog world.

When Feynman said that there’s plenty of room at the bottom, he meant that there was a long way to go to shrink computers down to atomic sizes. That’s no longer true. With difficulty we are approaching that limit with carefully engineered circuits, and can now glimpse how to eliminate all that effort by using natural materials. Although he wasn’t thinking about economics, that may be the most interesting implication of his challenge. There’s still plenty of room at the bottom to make much cheaper and more widely accessible computers. Nature is not only good at designing computers, it has a few lessons left to teach us about manufacturing them.

WHEN THINGS START TO THINK by Neil Gershenfeld. ©1998 by Neil A. Gershenfeld. Reprinted by arrangement with Henry Holt and Company, LLC.