Ray Kurzweil responds to “Ray Kurzweil does not understand the brain”
August 20, 2010 by Ray Kurzweil
While most of PZ Myers’ comments (in his blog post entitled “Ray Kurzweil does not understand the brain” posted on Pharyngula on August 17, 2010) do not deserve a response, I do want to set the record straight, as he completely mischaracterizes my thesis.
For starters, I said that we would be able to reverse-engineer the brain sufficiently to understand its basic principles of operation within two decades, not one decade, as Myers reports.
Myers, who apparently based his second-hand comments on erroneous press reports (he wasn’t at my talk), goes on to claim that my thesis is that we will reverse-engineer the brain from the genome. This is not at all what I said in my presentation to the Singularity Summit. I explicitly said that our quest to understand the principles of operation of the brain is based on many types of studies — from detailed molecular studies of individual neurons, to scans of neural connection patterns, to studies of the function of neural clusters, and many other approaches. I did not present studying the genome as even part of the strategy for reverse-engineering the brain.
I mentioned the genome in a completely different context. I presented a number of arguments as to why the design of the brain is not as complex as some theorists have advocated. This is to respond to the notion that it would require trillions of lines of code to create a comparable system. The argument from the amount of information in the genome is one of several such arguments. It is not a proposed strategy for accomplishing reverse-engineering. It is an argument from information theory, which Myers obviously does not understand.
The amount of information in the genome (after lossless compression, which is feasible because of the massive redundancy in the genome) is about 50 million bytes (down from 800 million bytes in the uncompressed genome). It is true that the information in the genome goes through a complex route to create a brain, but the information in the genome constrains the amount of information in the brain prior to the brain’s interaction with its environment.
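The arithmetic behind these figures is easy to check. A minimal back-of-envelope sketch, assuming roughly 3 billion base pairs at 2 bits per base (the 50-million-byte compressed figure is the article's claim, not computed here):

```python
# Raw genome size from ~3 billion base pairs at 2 bits per base (A/C/G/T).
base_pairs = 3_000_000_000
bits_per_base = 2
raw_bytes = base_pairs * bits_per_base // 8    # 750,000,000 bytes, ~800 MB order
compressed_bytes = 50_000_000                  # the article's post-compression figure
print(raw_bytes, raw_bytes / compressed_bytes)  # 750000000 15.0
```

The roughly 15x ratio is consistent with the claim that massive redundancy makes heavy lossless compression feasible.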
It is true that the brain gains a great deal of information by interacting with its environment – it is an adaptive learning system. But we should not confuse the information that is learned with the innate design of the brain. The question we are trying to address is: what is the complexity of this system (that we call the brain) that makes it capable of self-organizing and learning from its environment? The original source of that design is the genome (plus a small amount of information from the epigenetic machinery), so we can gain an estimate of the amount of information in this way.
But we can take a much more direct route to understanding the amount of information in the brain’s innate design, which I also discussed: to look at the brain itself. There, we also see massive redundancy. Yes there are trillions of connections, but they follow massively repeated patterns.
For example, the cerebellum (which has been modeled, simulated and tested) — the region responsible for part of our skill formation, like catching a fly ball — contains a module of four types of neurons. That module is repeated about ten billion times. The cortex, a region that only mammals have and that is responsible for our ability to think symbolically and in hierarchies of ideas, also has massive redundancy. It has a basic pattern-recognition module that is considerably more complex than the repeated module in the cerebellum, but that cortex module is repeated about a billion times. There is also information in the interconnections, but there is massive redundancy in the connection pattern as well.
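The redundancy argument can be restated as a toy description-length calculation. The sketch below is illustrative only: the module-description size is an assumed placeholder, and only the repeat count comes from the article.

```python
import math

# Describing "one module repeated N times" costs roughly one module
# description plus log2(N) bits for the repeat count, not N full copies.
module_spec_bytes = 10_000        # assumed size of one cortical module spec
repeats = 1_000_000_000           # the article's ~1 billion cortex modules
naive_bytes = module_spec_bytes * repeats
compact_bytes = module_spec_bytes + math.log2(repeats) / 8
print(f"naive: {naive_bytes:.1e} bytes, compact: ~{compact_bytes:.0f} bytes")
```

However large the assumed module description, repetition adds almost nothing to the total, which is the sense in which redundancy caps the design's information content.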
Yes, the system learns and adapts to its environment. We have sufficiently high-resolution in-vivo brain scanners now that we can see how our brain creates our thoughts and see our thoughts create our brain. This type of plasticity or learning is an essential part of the paradigm and a capability of the brain’s design. The question is: how complex is the design of the system (the brain) that is capable of this level of self-organization in response to a complex environment?
To summarize, my discussion of the genome was one of several arguments for the information content of the brain prior to learning and adaptation, not a proposed method for reverse-engineering.
The goal of reverse-engineering the brain is the same as for any other biological or nonbiological system – to understand its principles of operation. We can then implement these methods using substrates other than a biochemical system that sends messages at speeds a million times slower than contemporary electronics. The goal of engineering is to leverage and focus the power of principles of operation that are understood, just as we have leveraged the power of Bernoulli’s principle to create the entire world of aviation.
As for the time frame, some of my critics claim that I underestimate the complexity of the problem. I have studied these issues for over four decades, so I believe I have a good appreciation for the level of challenge. What I would say is that my critics underestimate the power of the exponential growth of information technology.
Halfway through the genome project, the project’s original critics were still going strong, pointing out that we were halfway through the 15 year project and only 1 percent of the genome had been identified. The project was declared a failure by many skeptics at this point. But the project had been doubling in price-performance and capacity every year, and at one percent it was only seven doublings (at one year per doubling) away from completion. It was indeed completed seven years later. Similarly, my projection of a worldwide communication network tying together tens and ultimately hundreds of millions of people, emerging in the mid to late 1990s, was scoffed at in the 1980s, when the entire U.S. Defense Budget could only tie together a few thousand scientists with the ARPANET. But it happened as I predicted, and again this resulted from the power of exponential growth.
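The genome-project arithmetic is easy to verify: with annual doublings, 1 percent complete really is only seven doublings from finished.

```python
# Starting from 1% coverage and doubling every year, count years to >= 100%.
coverage, years = 0.01, 0
while coverage < 1.0:
    coverage *= 2
    years += 1
print(years)  # 7 (1% -> 2% -> 4% -> 8% -> 16% -> 32% -> 64% -> 128%)
```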
Linear thinking about the future is hardwired into our brains. Linear predictions of the future were quite sufficient when our brains were evolving. At that time, our most pressing problem was figuring out where that animal running after us was going to be in 20 seconds. Linear projections worked quite well thousands of years ago and became hardwired. But exponential growth is the reality of information technology.
We’ve seen smooth exponential growth in the price-performance and capacity of computing devices since the 1890 U.S. census, in the capacity of wireless data networks for over 100 years, and in biological technologies since before the genome project. There are dozens of other examples. This exponential progress applies to every aspect of the effort to reverse-engineer the brain.
Comments (68)
by ronin
Talking about brains and neurons: do you plan to bring it back on-line? The former had it more than 6 years, if I’m not mistaken…
We miss it ;)
by Spikosauropod
Actually, PZ Myers has proven to be a valuable anecdotal supplement to a thesis I am working on: that those who favor centrally planned economies are opposed to both the singularity and theology, and for identical reasons.
The PZ Myerses among us need to believe that society will remain essentially as it is so that centralization has a chance to take hold. Their model of centralization is redistribution based on a quixotic agrarian interpretation of economics. One cannot redistribute wealth and transform everyone into a poet farmer if the definitions of wealth and farming change beyond recognition. Both theology and technology threaten to bring about such dramatic change.
From whence comes the desire to see humanity remain forever pedestrian? It is apparently a kind of nihilistic urge reminiscent of O’Brien in George Orwell’s 1984. O’Brien told Winston that the future would consist of “a boot stamping on a human face forever”. Winston could not understand this urge. He observed that O’Brien was supremely intelligent, but ultimately insane. It puzzled Winston that anyone would want to freeze history at that particular point in time.
The only explanation must be a kind of general loathing of humanity. The PZ Myers personalities began to resent their fellow humans at an early age. This resentment led to judgment. Judgment led to sentencing. They needed to believe that humans are essentially hopeless brutes who—owing to their culpability in their own condition—are less deserving than animals. What they proffer is nothing less than political secular damnation.
In this, they are essentially like the 15th century inquisitors of Spain. Having neither joy nor wealth, they can console themselves only in the power of their station. If happiness cannot be theirs—if they cannot see themselves as the recipients of life’s fruits—they can at least see that none of those fruits are available to others. The beautiful, witty, and courageous must all suffer equally under the oppressive omnipotent state.
by omelv44
To my mind, Mr. Kurzweil is just a secret PR agent for technology companies. They realized that their shares were overvalued in the late 90s, so they needed somebody who would promise that they were capable of miracles, to delay the burst of the dot-com bubble. So his predictions were made simply by those companies sharing their business plans with Mr. Kurzweil. If investors believe that the Technological Singularity is near, they will be willing to invest money in those companies, and their share prices will grow. So he had to mix more or less reasonable predictions with absurd ones to eventually increase their share prices.
by ahaveland
taupring, I think you’re thinking along the right lines, but this complexity can also be stored in smaller spaces, where time, iteration, molecular properties and environment can provide extra information – think of the Mandelbrot set and L-systems.
The growth and development of an organism and brain is a fractal process with simple instructions kicking in at the right time to shape and direct this process, creating unique results.
The brain is a beautiful self-organising mess of hardware, software, firmware, soft-hardware and hard-software, capable of creating massively interconnected hierarchical and parallel n-dimensional tree and lattice structures of incredible complexity. It also includes parallel and codependent self-modifying reentrant and recursive algorithms which modify the hardware and data in ways that I’m not sure that anyone can model yet, apart from hopefully being able to create an initial condition and synthesise the inputs that a normal brain experiences in life and watch what happens.
How can we express this in terms of lines of code when the code itself is modified and blends with the data?
How do we separate and define the hardware rules, firmware, softhardware (hardware built by code) and software, and the greatest part – its complexity and uniqueness is also a product of its experience and programming. Consciousness and separateness of self is another thing!
The genius is in the algorithms and synchronization of these reentrant processes that generate the self modifying programs according to stimuli.
There must be a way of modelling this, and with the research being done and processor power increasing exponentially, 10 years does not seem too ambitious a target to achieve something similar to human cognition. Developing a personality and sense of consciousness may take considerably longer, as would being able to create an artificial brain that we could interface with and then ‘move in’.
The good news – we created an artificial brain!
The bad news – it’s a spammer! :-(
by Jake_Witmer
I really enjoy RK’s rebuttals to his critics. …I completely agree with him. Funny that so few of his critics are well-reasoned and accurately represent his views. Most are straw men, or religiously-ignorant. Props, Ray!
by omelv44
Let me continue. The predictions about the Singularity are valid under the assumption that humans are nothing more than biomass, that life is just a form of existence of protein substances (as one biologist said), and that the ability of the cerebral cortex to produce thought is analogous to the ability of the kidneys to produce urine. But under the assumption of no friction, any perpetuum mobile will work!!! From his claims it follows that he is persuaded that humans are only machines with no soul or spirit, and that every act is caused by a set of chemical reactions. But there are so many events which contradict such claims. Among physicists there are many people who are persuaded that God exists, because His existence is the best explanation of paradoxes they face in their practice and experiments. E.g., scientists conducted an experiment in which a dying person died on a weighing scale, and there was no way matter could disappear because he was in a sealed suit in a closed space. But as soon as he died, his weight decreased by a few grams. And those “few grams” are his soul. My grandmother survived clinical death. During this time she could watch herself being operated on. She could hear what the doctors were talking about. The doctors were taken aback when she told them what was going on while she was being operated on. A lot of people who survive clinical death don’t remember anything; some saw a tunnel with light at the end; but some claim that they were in hell or paradise, or remember everything that happened while their heart was stopped. In my last message I wanted to list the sciences that contradict the claims of Mr. Kurzweil, but I forgot to in the end, so I will do it now:
biology (he really doesn’t understand the brain),
combinatorics (no computer can tackle numbers like 10^200),
physics (Moore’s law is not eternal, and there are quantum effects in the human brain which are still not explained),
economics (he does not understand economic reality so well, and it may halt progress in computing for many, many years),
events that science cannot explain (the existence of the soul, the existence of clairvoyant people, paranormal phenomena… Such events demolish the assumptions mentioned at the beginning of this message and create the “FRICTION” which will ensure that this “PERPETUUM MOBILE” won’t work!!!!)
by Runeblade
20 years ago the fastest supercomputer in the world could handle about 2.6 GFlops. Today’s fastest supercomputer can handle about 2.3 million GFlops. If there is even slight exponential growth in processing power over the next 20 years, I think it’s a decent bet that processing power will be the least of the challenges. Maybe you’re right, however, and the fact that your grandmother told you a story about her soul does trump all other evidence, be it empirical or experimental.
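For what it's worth, Runeblade's figures imply almost exactly one doubling per year. A quick check (the GFlops numbers are the comment's, not independently verified):

```python
import math

start_gflops = 2.6    # claimed fastest supercomputer 20 years ago
end_gflops = 2.3e6    # claimed fastest supercomputer today
years = 20
doublings = math.log2(end_gflops / start_gflops)
print(round(doublings, 1), round(years / doublings, 2))  # 19.8 1.01
```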
by omelv44
The predictions of the people I mentioned in the previous message have nothing to do with science. But the predictions of Mr. Kurzweil deal with pseudoscience. And such predictions are the worst. The better ones are those based on science, or on something that science cannot explain. Let me list the sciences that act as enemies of his prediction:
1) Combinatorics. Imagine that 200 people enter a train. The train stops at 10 stations. How many ways can the passengers leave the train? You would guess that this number is large. But almost nobody unfamiliar with combinatorics would guess that this number is 10^200, i.e. 10*10*10*…*10, 200 times. It is such a great number that a trillion is nothing in comparison. But anyone trying to make a neural network comparable in complexity to the human brain will have to tackle such numbers. And they can be even larger: e.g., if the number of passengers is 1000 instead of 200, we get 10^1000. If we compute the number of connections in the human brain, we get even larger numbers…
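Python's arbitrary-precision integers make it easy to see how large such state spaces are, though counting outcomes is not the same as having to enumerate them:

```python
# 200 passengers each independently choosing one of 10 stations:
outcomes = 10 ** 200
print(len(str(outcomes)))  # 201 digits: a 1 followed by 200 zeros
```

(Whether a brain simulation must actually search such a space, rather than merely occupy one state of it, is exactly what is in dispute.)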
The next science is biology, and his response is not so persuasive.
The next discipline is economics, although a lot of people disagree that it is a scientific discipline. First of all, he didn’t predict the dot-com bust mentioned above. He didn’t predict the current crisis. And he should be familiar with its possible aftermath. It would be good for him to listen to the interviews of Mr. Gerald Celente (who has proved himself a better predictor of the future than Mr. Kurzweil) to realize that economic problems may cause a global financial and economic collapse. And Moore’s law may stop because of economic factors.
by omelv44
I’ve just tried to find Mr. Kurzweil in Forbes Magazine, but I haven’t found his name, although if you are capable of decently predicting the future, you can make billions of dollars on it. He realizes that, and as far as I know, he possesses his own hedge fund called FatKat. But I didn’t find anything to confirm that this hedge fund is so successful! Certainly not successful enough to make him a billionaire. :))) Therefore we don’t have to take him so seriously. Moreover, there is one mistake he made in his predictions which is unforgivable for a predictor of the future and an investor. He didn’t predict the burst of the dot-com bubble. It is not such fancy footwork to predict such a burst and to make millions by taking short positions. But according to his predictions preceding this crash, the share prices of those companies should have risen exponentially forever!!!! That is why we don’t have to believe his timing of the Singularity. But there are much better predictors of the future than Kurzweil, and their predictions have nothing to do with science. E.g., some gypsy women can tell everything about you with a single glimpse at your palm… Or people like Wolf Messing. I am a scientist in the field of operations research; I like statistics and agent-based modeling, which is related to AI, but I wish the Singularity would never happen! In that case humankind would immediately become dispensable to the new technological civilization, and there would be no reason for its existence, even if people augment themselves.
by Dan Kaminsky
kumarei–
I wouldn’t say “none of the complexity is in the genome”. There’s plenty of complexity to go around. You’ve seen those fractal animations?
It’s not exactly like that, but it’s not exactly not.
The real question is how much complexity you preserve in the reimplementation. It’s basically a tradeoff: The more you implement at the chemical substrate, the less you need to know about the functional characteristics of the proteins that result — you can stick to the one message that is nice and clean, the actual genetic code. However, at this layer:
1) Computational load is enormous
2) Small errors will have comparatively enormous impact
3) Not only do you have to simulate protein behavior at creation (i.e., implement folding accurately), but you also need to simulate protein behavior in situ
In other words, consider how complex Conway’s Game Of Life is with very simple rules at each cell. Now put in the full complexity of the molecular dynamics of organic chemistry.
You do have another option; Implement semantics rather than structure. (This is the equivalent of porting the shell script, rather than porting the shell.) You can wipe out huge chunks of complexity, collapsing a complex cascade that results in ATP becoming energy into “decrement ATP, increment energy” or something similarly facile.
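A deliberately facile sketch of what such a semantic port might look like; the class and numbers below are invented for illustration, not drawn from any real model:

```python
# "Semantic" reimplementation: the entire ATP-hydrolysis cascade collapsed
# into a bookkeeping update, exactly the kind of collapse described above.
class Cell:
    def __init__(self, atp=100, energy=0):
        self.atp = atp
        self.energy = energy

    def hydrolyze_atp(self):
        # Stand-in for a complex chemical cascade.
        if self.atp > 0:
            self.atp -= 1
            self.energy += 1

c = Cell()
c.hydrolyze_atp()
print(c.atp, c.energy)  # 99 1
```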
But you have to know what everything, and I mean everything, does. And that’s what PZ made clear is very difficult: teasing apart the semantics of every protein’s behavior (and this comes down to which cell expresses which protein when, and how that protein is parsed and transported and modulated in situ) is messy wet labwork, and it does not scale particularly well.
Suppose you do this. Suppose you, through chemical simulation or semantic reimplementation, capture the behavior of all the proteins in the brain. How do you turn those proteins into neurons, whose interconnections capture the actual intelligence we presume humanity is based on?
Well, either you let the proteins operate as a blind substrate via which neurons self-assemble, or you reimplement neural semantics as well…
Like I said. It’s fractal. Nature just does not conserve for complexity — makes it fairly ridiculous to make any estimations regarding the number of lines of code necessary to reimplement.
–Dan
by Cameron
This frequently repeated idea that the transition from genetic information to protein structure poses an insurmountable exponential increase in complexity, making tissue function forever an impenetrable hieroglyphic, is rubbish.
Just look at fragments of the system. Has this posed an insurmountable barrier in understanding the function of other organs, and tissues like the heart? the kidney? the pancreas? skeletal muscle? liver? stomach?
This has not presented an insurmountable barrier in other tissues, and if you look at general features of the organism’s development, such as the formation of fur patterns or the way axonal growth cones are guided, nothing exotic and insurmountable seems to be going on.
Unless the brain proves to be an exotic tissue that relies heavily on quantum properties to guide its evolution and computations (something not even remotely suggested by the data), it does not appear to be an impenetrable organ. It seems that, like the other organs, it too will eventually fall to our understanding.
by tedhowardnz
I don’t doubt Ray’s numbers. I do suspect that there is a lot more of reality involved that increases the computational load of creating a simulator.
I suspect that much of the awareness level of the human experience is mediated by “holographic” effects of the mechanism of storage and retrieval of information employed in the brain (as distinct from the neural networks which do much of the primary information processing).
Storing and retrieving information as interference patterns implicitly employs “contextual” algorithms that are exceptionally computationally difficult to implement on a serial device (if we had laser holographic storage and retrieval of information, it would be simpler, but we don’t currently).
This mechanism of mind appears to be responsible for intuition and abstraction, and also appears to be very context sensitive, and potentially infinitely recursive.
When we combine that with Kurt Gödel’s incompleteness theorem, then expose it to the vastness of reality and add a pinch of “quantum” instability, it seems highly unlikely that the outcome of running such a system will be deterministic in any real sense of the word.
It seems logical to me that one of the greatest risks to the survival of humanity is that some idiot actually invents AI before we get our own social/political/economic house in order and provide a system that allows every individual to meet their own needs with high degrees of freedom (no exceptions). In such an environment it seems probable that the AI would accurately perceive humanity as the greatest threat to its own existence and, being in the early stages of its own development, would likely exterminate us. Some hours or days later, as its awareness reached higher levels, it would undoubtedly wish it hadn’t acted quite so hastily, but by then we would be history (unlike the Terminator movies, I don’t think we would have any chance against a fully functional AI; few if any would even realize what was happening).
Getting off topic, and the topic does worry me.
http://www.solnx.org is my best attempt at a solution to the problem.
by Reiner Wilhelms
I’m not afraid that AI would perceive humanity as the greatest threat to its own existence, not unless it’s designed toward that goal. But I am afraid that, under the circumstances in which AI is being developed, much of it will be abused, and it is already being abused. The small elitist group of people and owners who consider themselves visionaries, and who believe they hold the keys to the future of humanity, are bound to abuse AI to make sure that they benefit from it absolutely, no matter what happens to the rest. There is not so much a flaw in Kurzweil’s predictions about the immense possibilities of biologically inspired robotics, AI, machine learning, etc.; the flaw is to embrace everything with great awe and see only the upside, never the great dangers that could, and probably will, come out of this. Kurzweil never warns of any downside to what he prognosticates. To him it seems all heaven on earth, and he paints any critic as a Luddite. There is a deep flaw in this, and it makes my alarm bells ring.
by kumarei
I think you may have missed the point of Myers’s response. He was saying that the genome is a compressed information format, and that nature itself is the decompression program. The genome is simple because a huge amount of complexity resides in the decompression program. In fact, almost none of the complexity is in the genome; when the decompression algorithm is run, elements of the decompressed data interact with each other in complicated ways.
Think about Conway’s Game of Life. The point of Conway’s Game of Life is that incredibly complex interactions can result from a set of simple rules. The only way to work out how a particularly complicated set up of Life will turn out is to run it.
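For readers who haven't seen it, Life really is that small; a minimal sketch on a sparse set of live cells:

```python
from collections import Counter

# Conway's Game of Life: a cell is alive next step if it has exactly 3 live
# neighbors, or has 2 live neighbors and is already alive.
def step(live):
    counts = Counter((x + dx, y + dy)
                     for (x, y) in live
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    return {c for c, n in counts.items() if n == 3 or (n == 2 and c in live)}

# A glider: after 4 steps the same shape reappears, shifted by (1, 1).
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
cells = glider
for _ in range(4):
    cells = step(cells)
print(cells == {(x + 1, y + 1) for (x, y) in glider})  # True
```

The rules fit in a few lines, yet the only general way to learn a pattern's fate is to run it, which is kumarei's point.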
The universe’s rules are much more complex, especially when you get to the biological level. Because of the interactions that occur while the genome is being decoded, and the interactions that occur afterward, it seems unreasonable to claim that the genome provides an upper bound on the complexity of the brain.
by Dan Kaminsky
To be entirely fair, the concept that we’ll be able to create massively (I mean actually massively) parallel systems on a silicon or diamond substrate isn’t particularly controversial — the evolution of graphics processors has basically been the story of embarrassing parallelism growing embarrassingly (relative to CPU performance, anyway).
In 20 years, will we see chips (or cubes) with the computational interconnection level of the brain? Assuming some intermediate use to fund its development (like pretty pictures for video games), no question. No question at all.
It’s the semantics, though — what are all the messages, how are they encoded, what do they mean — that’s where things get genuinely messy.
by Dan Kaminsky
“Do we even know for sure that Ray is talking about program code?”
Jay27, Ray’s an engineer, who has actually written lines of code. He gets to say “this will take n lines of code” because presumably he knows what n lines of code is capable of.
There is in fact a relationship between the machine language representation of a program and its source code form. It’s not a perfect correlation, and it will vary by source language, destination language, and of course, the content of the code, but it’s reasonable to say there’s a relationship there. Saying a 50MB binary came from one million lines of code? If the binary was x86 and the source was C, I could see that being accurate within an order of magnitude.
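That order-of-magnitude claim is easy to sanity-check with assumed averages (both constants below are rough conventional ballparks, not measurements):

```python
# Rough check of "a 50 MB x86 binary from ~1M lines of C".
lines_of_c = 1_000_000
instructions_per_line = 8      # assumed average machine instructions per line
bytes_per_instruction = 4      # assumed average x86 instruction length
binary_mb = lines_of_c * instructions_per_line * bytes_per_instruction / 1e6
print(binary_mb)  # 32.0, within an order of magnitude of 50 MB
```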
The argument I’m making, and frankly, PZ is making as well, is that such a straightforward relationship doesn’t exist within the genome. Without conservation of complexity, you end up with designs that are simply wildly beyond anything someone might reasonably or even unreasonably architect in C, Java, or other languages we know how to estimate. Since we can reasonably assume that emulating the underlying chemistry to quantum-mechanical accuracy will be too computationally expensive, that means we’re actually going to need to capture the semantics of all those proteins.
This job is what can’t be trivially translated into a predictable amount of code, let alone a predictable amount of labwork. Let me put it this way:
The semantics for dopamine and serotonin could themselves take a million lines of code, for all we know.
by Jay27
“Jay27, Ray’s an engineer, who has actually written lines of code.”
So am I. Been doing it for many years already.
Still don’t know for sure what sort of code lines he’s talking about.
I’d like to get the answer from Ray himself, if he’d be willing to leave a comment beneath his own post…
by DWCrmcm
[quote=" Dan Kaminsky"]
“The semantics”.
[/quote]
There it is.
Metaphysics and all.
by timkurz
I think this conversation really is about the genome itself. Can we simulate an organism inside a computer using the genome as the instruction basis?
Kurzweil thinks we understand the genome so well that we can already estimate its maximum complexity.
Until we have accurate simulations, I’m not sure we’ve proven that’s true.
by Brain 2045
…………………………………………………………………………………………………..
Henry Markram says that it’s possible to simulate the whole human brain in 10 years. Check out these two astonishing lectures by Henry Markram; watch them in full screen:
1. http://neuroinformatics2008.org/congress-movies/Henry%20Markram.flv/view
2. http://ditwww.epfl.ch/cgi-perl/EPFLTV/home.pl?page=start_video&lang=2&connected=0&id=365&video_type=10&win_close=0
Very interesting article from the ‘Seed’ magazine:
Part 1:
http://seedmagazine.com/content/article/out_of_the_blue
Part 2:
http://seedmagazine.com/content/article/out_of_the_blue/P2
.
And a few more interesting links about the project:
http://news.bbc.co.uk/2/hi/sci/tech/8012496.stm
http://www.youtube.com/watch?v=Bz5IUaRr8No
http://www.youtube.com/watch?v=RLCT3wU4fek
…………………………………………………………………………………………………..
by Jay27
Henry Markram also had the intention of simulating a full rat brain within two years of the Seed magazine article. He also wanted to put it in a robotic rat.
Never heard about it again. I’m not sure what to think of Henry’s predictions until I get an update on where his work stands now.
by viggen
“but the information in the genome constrains the amount of information in the brain prior to the brain’s interaction with its environment.”
While I think I see what you’re ultimately trying to say, I have a very direct disagreement, which is exemplified by the statement above. Your entire thesis is based on an analogy between the brain and a computer program. I would agree that a computer program can (someday) be written that behaves in such a way that a person won’t be able to tell whether they’re interacting with a machine or a person, and may believe they are interacting with a person. But to modularize the brain in this way is a massive oversimplification of development. Right at the beginning, from moment one, the formation of the brain is a continuous interaction with things around it and is shaped by a lot of “external” physical principles that we still don’t understand. There is no -click- “instantiated” state. I think it is a profound and misleading oversimplification to say, “Okay, it’s been in the oven for nine months; now we’re at our instantiated, complete brain state.” The problem is that the developmental “program” (not to be mistaken for a computer program, except that it is an algorithm of some sort) extends literally through the entire life of the organism. “Sapient being” is a side effect, unfortunately. Because the system is not closed, you can’t claim an upper limit to the amount of information necessary to make a brain based only on what’s in the DNA. To predict this from the DNA, there first needs to be a complete understanding of the physics, formation, and behavior of the biological cell, to know what information not “instantiated” by the DNA is still critical to the self-assembly of the system. Regardless of what you take away from Craig Venter’s amazing work, we still can’t build a cell from scratch, and we therefore can’t build a development program from scratch either.
We have only a vague idea of the depth of information that separates “DNA” from “functional multicellular organism,” let alone “sapient brain” and we are in no position to effectively prognosticate one from the other.
by Jay27
“I will say, if there’s one real mistake Ray’s made, it’s saying that PZ’s responses aren’t worthy of a response.”
Some responses were outright insults. Exactly what you’d expect to see under a post which contains personal insults.
Myers should consider himself lucky that Ray took the time to respond to his insulting post at all, let alone the comments.
by Jay27
“For a sheer line of code count, PZ’s pretty much dead on: Here are the things you are talking about emulating. Does this look simple to code? No, no it doesn’t.”
Do we even know for sure that Ray is talking about program code?
When he says that 50 million bytes translates to a million lines of code, doesn’t he simply mean that if you write 50 bytes per line, you end up with 1 million lines?
It fits the context a lot better when you look at it that way. After all, wasn’t he simply trying to clarify how much information it takes to encode a brain?
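For what it’s worth, the arithmetic of that reading checks out (a trivial sketch; the 50-bytes-per-line figure is just the conventional rough estimate, not anything Ray specified):

```python
# 50 million bytes of (compressed) genome, read as source text at
# roughly 50 characters per line -- an assumed, conventional figure.
compressed_genome_bytes = 50_000_000
bytes_per_line = 50

lines_of_code = compressed_genome_bytes // bytes_per_line
print(lines_of_code)  # 1000000 -- i.e. "a million lines of code"
```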
by hacksoncode
The problem with this reasoning is 2-fold:
1) Most people have *no* concept of how much complexity is contained in N bits of perfectly compressed “information” in the information-theoretic sense of the word. 50 million bits, even if that were the only information we had to consider, is a *vast* amount of information if perfectly compressed.
Just to give one taste of what this means: verify that a random 50 million bit number is not prime. Of course, statistically, it won’t be. Now factor it.
Really, nothing on the trajectory of our (probably not endless) exponential growth in computing power, even over the remaining lifetime of the universe, comes close to enough computing power to do this in the degenerate case of a pair of similarly sized prime factors.
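A rough sense of the scale involved (a back-of-the-envelope sketch using naive trial division; the best known factoring algorithms are far faster, but still hopeless at this size):

```python
# Trial division on an n-bit semiprime with similarly sized factors
# must test on the order of sqrt(N) ~ 2**(n/2) candidate divisors.
n_bits = 50_000_000
trial_exponent = n_bits // 2   # i.e. ~2**25_000_000 candidates

# For comparison, ~2**266 is roughly the estimated number of atoms
# in the observable universe (~10**80).
atoms_exponent = 266
print(trial_exponent > atoms_exponent)  # the exponents differ by a factor of ~100,000
```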
2) The information contained in a computer program also implicitly contains all the information in the processor that executes the program, and this is *not* trivial.
Indeed, it contains at least some of the information in the operating system on which it runs. You can say all you want that “dir” is only 3 bytes of “information” (and it’s actually less), but without a command shell that can interpret those bytes, and a disk that contains the information returned, the 3 bytes aren’t very useful. They are just meaningless noise without the processor and operating system.
This theory completely neglects the vast, almost completely unknown to us at present, amount of information implicitly encoded in the laws of physics (and their exponentially more complex cohorts the “laws” of chemistry and biology).
Chaos theory gets in the way of this kind of analysis. The brain is an emergent phenomenon of a vastly chaotic system. We don’t have the slightest idea how to model this complexity, and without that it’s fundamentally impossible to simulate it even if you have the genome.
20 years *might* see us with enough raw computing power to perform this analysis, but there’s no way in hell that it will see us with enough fundamental understanding of physical laws to actually write the code to do it.
If you said 100 years, I might throw up my hands and say “well, maybe.” You can forget about it happening in 2 decades. It’s not impossible, but the probability is so low as to be negligible.
It’s *far* more likely that we will successively approximate a model of the workings of the human brain based on *behavioral* characteristics of it. Building it up from the genome directly is nothing more than science fiction at present.
by f1r3br4nd
From reading your post and the responses I get the impression that most of you aren’t biologists. I am. Here’s the problem put in CS terms:
How many bits does it take to encode the information necessary for building a computer? How many bits does it take to encode the software and data that’s on one specific computer?
My point is that it isn’t too radical an idea that we will be able to create brain-simulations with the potential for sentience within a few decades. But it’s wishful thinking of the most blindly naive sort to think that this will make the problem of achieving immortality through uploading a trivial one to solve.
by jedharris
One way to frame the question we’re discussing is “If the genome is code, what kind of processor does it run on, and what kind of language is it?”
The genome is being executed by a massively parallel, stochastic, to some extent quantum processor — the cells of the body, or just the brain if we prefer to focus on that. (I say to some extent quantum because at least protein folding is a quantum process, and quite possibly local quantum phenomena are important in other aspects of morphogenesis.)
So what kind of language is the genome, to run on such a processor? It can’t be much like what we think of as computer languages, executed sequentially, with each “opcode” fairly simple and deterministic. Instead I would guess that it is more of a modular specification of what patterns are better and worse, and what each cell can try to make things better. The actual brain (or any organ) is the result of a massively parallel search trying to find a “sweet spot” according to the genome’s definition of “sweetness” — which will be ambiguous, conflicted, and sometimes flatly contradictory.
Note that this is not just an issue during development. The same genome-guided search process continues during learning, consolidation, creative thought, etc. All of these physically reshape the brain by changing cells, making and pruning connections, and adding new cells and even new tracts connecting over significant distances.
This suggests that to simulate a brain, we have to simulate the actual growth and search processes of trillions of cells, in at least some cases down to the quantum level. This process may not be optimizable or compressible to any great degree — except in the way we “optimize” the flight of birds by building airplanes.
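A toy illustration of that “parallel search for a sweet spot” idea (purely schematic, assuming nothing about real morphogenesis): many independent agents doing local stochastic hill-climbing against an ambiguous objective with two competing optima.

```python
import random

random.seed(0)  # deterministic for reproducibility

def sweetness(x):
    # An ambiguous "definition of sweetness": two competing optima,
    # standing in for conflicting genomic preferences.
    return -min((x - 2.0) ** 2, (x - 7.0) ** 2)

# 100 "cells", each doing local stochastic search in parallel.
cells = [random.uniform(0.0, 10.0) for _ in range(100)]
for _ in range(200):
    # Each cell tries a small random change and keeps it if it helps.
    cells = [max((x, x + random.gauss(0.0, 0.1)), key=sweetness) for x in cells]

# Every cell settles near one of the two sweet spots -- with no central
# controller and no sequential program.
settled = all(abs(x - 2.0) < 0.5 or abs(x - 7.0) < 0.5 for x in cells)
print(settled)
```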
by Cameron
[quote]Not to simulate the brain we can’t. In fact, we can barely do this to predict the folding of very simple proteins. Scaling that up to simulate macromolecules such as ribosomes is currently beyond our capabilities.
[/quote]
It was only a few years back that I heard petaflops systems could simulate protein folding in real time with a high level of accuracy. Recent improvements in algorithms have supposedly provided vast speed-ups, thus lowering the processing requirements for real-time simulations.
In any case, assuming we need full molecular-level simulation is folly in my book. Take vision, for example, a highly studied system: if we went for full molecular-level simulation, we’d have trouble getting past just the photoreceptor layer (a full molecular-level simulation of tens of millions of cells)… yet all the evidence indicates that the functionality of the entire retina can be quite easily and faithfully replicated with far, far fewer computational resources.
We have no reason to believe it is otherwise elsewhere down the line. Receiving the center-surround-processed input from just a few ganglion cells, what exotic computation can we assume the proteins in a V1 neuron are doing with it (ignoring feedback and lateral connections for the moment)? IMHO, elaborate processing of such simple incoming information doesn’t seem required, nor would it make sense, and in fact the experimental data suggest that such neurons don’t actually perform exotic, elaborate computations on such simple input.
Going even further down the line, what do we find? The structures are all pretty similar, suggesting, by biology’s structure-function heuristic, that similar things are going on.
[quote]
So, your conclusion about how much computing effort would be required to simulate anything living cannot be based on your compression of the genome because it doesn’t correspond to reality even with the small problems we’re already working on.
[/quote]
Simpler genomes have been digitally stored, synthesized, and booted up in cells whose own DNA has been removed. It’s quite reasonable to assume that even the human genome can be fully compressed and digitally stored, and then simply decompressed, synthesized, put into a new cell, and booted up.
You have to remember that the entire body is built up from a single cell, whose information is mostly stored in DNA (including epigenetic info) and a few RNA molecules with different concentration gradients.
[quote]The bullet point from what I posted was that there is no such thing as a human brain before interaction with its environment, we are not a genetically programed automaton that then learns to be in our environment, learning is part of the building process. A Human brain before you have interaction with environment is like a house before you have walls or a roof, not much of a house at all.[/quote]
The amount of environmental information necessary to bestow general intelligence is not that immense. The biggest contributor is vision, and that can be lost without hindering general intelligence, which is what we’re after. Consider the fact that even blind, bedridden kids can attain general intelligence, mainly through touch and exposure to spoken language. The sense of touch is nowhere near as high-bandwidth as vision, and a few years of sound plus touch information is easily stored on a modern hard drive. Even deaf-blind individuals have attained mastery of language, and thus general intelligence, further reducing the required amount of environmental data.
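A rough back-of-the-envelope version of that claim (the bandwidth figures here are assumptions for illustration, not measurements):

```python
# Five years of around-the-clock input through sound and touch.
seconds = 5 * 365 * 24 * 3600        # ~158 million seconds
speech_Bps = 2_000                   # ~16 kbps compressed speech (assumed)
touch_Bps = 1_000                    # generous guess for tactile input

total_bytes = seconds * (speech_Bps + touch_Bps)
print(total_bytes / 1e12)            # ~0.47 TB: one ordinary hard drive
```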
[quote]
Instead a brain takes years to build, because it takes years to run all the trials and invent new pathways by an iterative Darwinian process.[/quote]
Once you have extracted the algorithms that provide the foundation for general intelligence, simple exposure to data will allow the system to evolve, as these algorithms permit the extraction of patterns, the creation of abstractions, and the formation of virtually arbitrary associations.
The flexibility of the brain’s algorithms with respect to incoming data is vast: a slight change in a photoreceptor molecule, as occurred with the evolution of color vision, and the brain is immediately able to categorize the additional data and make meaningful sense of it. Some women have gained a slightly different additional opsin molecule, and indeed their abilities in color discrimination seem to go beyond the norm, distinguishing as distinct two colors other humans see as the same. Visual information transformed into sound or touch for blind individuals has been appropriately interpreted by the brain as visual information, with substantial functionality gained in this area.
by Dan Kaminsky
Alright.
First of all, I hope nobody else thinks calling a theory bunk represents a personal attack on Ray Kurzweil. Sometimes somebody smart gets the math wrong.
The “too long, didn’t read” summary: there’s no way to know a priori how many lines of code it would take to implement the functionality of a given protein, and without running a full-scale simulation of the chemical universe of the brain, we’re already in a potentially unbounded state of coding complexity before we’ve even laid out the first axon.
So I actually *am* a reverse engineer. Not exclusively — in security, it’s always more productive to just sit down with the dev and look at his source — but sometimes you just gotta stare at some hex. I’m also someone who spent some time looking at genomics, as it turns out a couple algorithms from bioinformatics (dotplots, smith-waterman) are useful in other realms.
I’ve looked at OS binaries. I’ve looked at genes. I know enough to say the two are very different things. That which is emitted by a compiler was once C. That which is emitted by a genome is something else entirely.
The genome is not data. The genome is not code. The genome is both, utterly interwoven, with chemistry itself as the ultimate compiler.
Let’s talk about compressibility for a second. The genome is compressible, this is true. Most of the compressibility, however, comes from the encoding of proteins. Take three base pairs, four possible values each: that gives you 64 possibilities. These are used to encode proteins, which are chained amino acids.
There are only 20 amino acids. It isn’t precisely true that the genome only contains canonical encodings for those 20 — variants also work, and there’s a stop and start codon — but it’s pretty close. So almost all of the redundancy comes from this.
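The arithmetic behind that redundancy can be sketched directly (treating the code as a flat mapping, which ignores codon-usage bias and the other subtleties raised elsewhere in this thread):

```python
import math

codons = 4 ** 3        # 3 bases, 4 values each: 64 possible codons
symbols = 20 + 1       # ~20 amino acids plus a stop signal

bits_stored = math.log2(codons)    # 6.0 bits actually encoded per codon
bits_needed = math.log2(symbols)   # ~4.39 bits of real choice per codon

print(round(bits_needed / bits_stored, 2))  # ~0.73: roughly a quarter is redundant
```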
What about the actual proteins that the genes encode? Well, I refer you to Craig Nevill-Manning: Protein is Incompressible. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.16.1412&rep=rep1&type=pdf
OK, granted. This isn’t entirely true. Hategan and Tabus did come out with a reply paper, Protein Is Compressible, here:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.16.1412&rep=rep1&type=pdf
But as they say, they effectively got a negative result. And this was one layer in — just far enough to get to the protein encoding stage. And here’s where things get messy, in terms of lines of code:
Sure, you can write a single protein folder, and then run each sequence through it. But how do you determine what the end result does, how it behaves? Could you write a generic emulator, that simply simulated the chemical environment in which the protein existed?
OK, but it’s floating around inside a cell, with a number of other proteins it may or may not be interacting with. You rather quickly end up with an infinite computational load.
Now, let’s not do that. Let’s study the protein, determine its function, and simply write an emulator for that.
There. Right there, implementation complexity just became unknowable.
Larry Wall, creator of Perl, has a quote: it is easier to port a shell than a shell script. This, my friends, *is totally true*. If you can get away with just making the execution environment work in your new substrate, that is usually much easier than actually understanding the things operating in that environment. Except that’s not an option here: the computational load would be insane.
If we need to actually understand the protein behavior, for every protein, in every context, without the benefit of the original substrate…there really is no limit to how complex the logical expression of each protein’s behavior can get.
Nature just doesn’t optimize for simplicity. It doesn’t need to.
I will say, if there’s one real mistake Ray’s made, it’s saying that PZ’s responses aren’t worthy of a response. For a sheer line of code count, PZ’s pretty much dead on: Here are the things you are talking about emulating. Does this look simple to code? No, no it doesn’t.
And this is only the beginning. Those proteins actually build structures that have their own complexity, which must be either chemically simulated or fully understood. Messy, messy.
by JDN
“The amount of information in the genome (after lossless compression, which is feasible because of the massive redundancy in the genome) is about 50 million bytes (down from 800 million bytes in the uncompressed genome). ”
This is the sort of statement that drives biological scientists crazy. Let’s start with your underlying assumption that you can treat the genome like a character string and compress it. Those letters correspond to parts of a molecular machine that aren’t always interchangeable, even though they may seem interchangeable because you have assigned them all letters. A pattern of nucleic residues that can be compressed may serve radically different functions depending on circumstances.
An example can be drawn from immunoglobulin-like domains in proteins. They have the same folding patterns and very similar sequences. You would be tempted to believe that their “information” can be compressed by simply storing the modifications to the consensus sequence. But the information needed to build a simulation of all proteins as part of a cell lies in the relationships of these domains to other parts of their host proteins, not simply in their sequence. Also, the particular nucleotides in these domains may play functional roles in expression, RNA editing, or post-translational modification, roles determined by the laws of physics and chemistry but not apparent in the sequence. Even single amino acid changes may lead to large changes in the function of the machines of which they are a part. Both the relative position of these domains in their proteins and their point mutations would convey a huge amount of information, if only we knew how to simulate protein structure.
Your opinion is like that of an evil manager (PHB) who believes, because he’s instructed his minions to build something, that he is in possession of all the information needed to build said object, or even of how said object works.
Our current feeble attempts to simulate protein structure and protein-protein interactions are far from giving us the information we want, with millions of lines of code already committed, not to mention parameter files, the data for semi-empirical methods, voluminous journal articles, and the output of thousands of scientists. So your conclusion about how much computing effort would be required to simulate anything living cannot be based on your compression of the genome, because it doesn’t correspond to reality even for the small problems we’re already working on.
That’s only why you’re wrong on a very basic level. On a higher level, you face the same challenge as the ancient atomists when they argued that the motion of atoms determines all things and that all things can be reduced to these atoms. Even if we had an atomic-level readout of a living brain on a nanosecond timescale, we would be faced with the problem that we don’t know what consciousness is, nor would we necessarily recognize it in the morass of data. Such information is not in the genetic code, but it is exactly the information we would need to interact with the brain in any meaningful sense. So your X-year prediction for a simulated brain is based on inadequate understanding. If it comes true, you’re still no genius.
There are many more examples of why your position is wrong. You should ask some of your biological scientist friends to write you examples out of their own experience. You are certainly not alone in your beliefs, and, such an exposition would be an important step in solving the problem of consciousness.
by Brain 2045
Check out these two astonishing lectures by Henry Markram:
(watch them in full screen)
1. http://neuroinformatics2008.org/congress-movies/Henry%20Markram.flv/view
2. http://ditwww.epfl.ch/cgi-perl/EPFLTV/home.pl?page=start_video&lang=2&connected=0&id=365&video_type=10&win_close=0
Very interesting article from the ‘Seed’ magazine:
Part 1:
http://seedmagazine.com/content/article/out_of_the_blue
Part 2:
http://seedmagazine.com/content/article/out_of_the_blue/P2
And a few more interesting links about the project:
http://news.bbc.co.uk/2/hi/sci/tech/8012496.stm
http://www.youtube.com/watch?v=Bz5IUaRr8No
http://www.youtube.com/watch?v=RLCT3wU4fek
by Itch
Ray, please correct two glaring mistakes.
A) A nucleotide has 4 states (actually more), so double your initial calculation.
B) If you want to talk about information theory, you should know that the number of possible states of 3 billion bases is actually 4^(3 billion), not 4*3 billion.
by greylander
(A) You do realize that you are correcting the article, not Ray — what you reference was not even in quotes in the article… so it was the author’s attempt to paraphrase. Regardless, order of magnitude is what is important here, so a factor of 2 is irrelevant.
(B) By “states” you mean the number of possible unique sequences of 3 billion base pairs… how is this even relevant? Information content is based on the number of bits required to encode a thing; in this case your correction in (A) tells us 12 Gbits, or about 1.5 GBytes. That is an upper bound on the information content, but not a lower bound. The minimum size of the self-extracting compressed file you can shrink it to, on a machine (CPU, Turing machine) with no special hidden knowledge of the original, tells you the actual information content of the original data.
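That definition is easy to demonstrate with an ordinary compressor standing in for the ideal one (a sketch; true information content is defined by the best possible compressor, which zlib only approximates from above):

```python
import os
import zlib

# Compressed size is an upper bound on information content: redundant
# data shrinks enormously, incompressible data barely at all.
redundant = b"ACGT" * 250_000        # 1 MB with massive redundancy
noise = os.urandom(1_000_000)        # 1 MB with (almost) none

print(len(zlib.compress(redundant, 9)))  # a few KB
print(len(zlib.compress(noise, 9)))      # ~1 MB: nothing to squeeze out
```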
by Itch
Crap scrub that, misread the bytes bit.
by dhbernstein
I doubt that our predilection for linear extrapolation is hardwired; software developers became quite dependent on Moore’s law during the 1990s and early 2000s. It’s our predilection for linear thinking that’s the real obstacle.
A broadly-applicable approach to information processing with 16 cores has not fallen into our collective laps like the next Moore doubling. Our brains employ 10^11 neurons continuously interacting via 10^14 synapses. Understanding information processing at this degree of parallelism is the challenge.
by rsquare
“What I would say is that my critics underestimate the power of the exponential growth of information technology.”
The problem with this statement is that the brain isn’t just a computer science or information technology problem. It critically relies on empirical research, which is not accelerating like infotech.
You like to use genome sequencing as an example, but the acceleration of genome sequencing was caused more by market competition, after 1997, than by any improvements in the experimental techniques. Sequencing relies on two basic techniques: PCR and some type of electrophoresis to separate and identify the fragments. Neither of these was fundamentally faster in 2000 than in 1990; we were just running more reactions concurrently.
Understanding the brain will still rely on empirical research. We can allocate more money to do more simultaneous research, but if we don’t, it won’t be accelerating like information technology.
by greylander
rsquare,
On what basis do you say the research is not accelerating? Research is fundamentally a matter of data gathering: how fast you can do experiments, take measurements, reconsider parameters, and do more experiments.
This is accelerating in every field. This is exactly why the genome project finished so fast: ever-better techniques for automating the lab work kept coming online. To pick a non-computer-techy field at random, consider how zoology is impacted by cheap GPS tagging/tracking devices, cheap digital video cameras, and whatever else. Cameras are getting so cheap we could just about blanket a forest with them and have virtually unlimited data for studying animal behavior in the wild.
Google “lab-on-a-chip”… take a look at what is already out there and consider what is just over the horizon.
You say, “You like to use genome sequencing as an example, but the acceleration of genome sequencing was caused more by market competition … we were just running more reactions concurrently.”
First, the incentives for the improvement (market) are irrelevant; there is always a market for innovation. Second, a major reason for the ability to run ever more processes concurrently was robotic automation of those processes. One Google search got me this as the first hit: “…a good portion of the work was performed on CRS automated lab systems.” (It happens to be a press release, I think, from a boasting company… but the point is made; you can find plenty more.)
In other words, the lab processes for learning about all the molecular level processes in the body are accelerating… and fast.
by rsquare
greylander,
“Research is fundamental a matter of datagathering — how fast you do experiments, take measurements, reconsider parameters, do more experiements. This is accelerating in every field.”
There are many people in many fields who would disagree with you.
by RDW
‘The amount of information in the genome (after lossless compression, which is feasible because of the massive redundancy in the genome) is about 50 million bytes (down from 800 million bytes in the uncompressed genome)’
Can you point to a published algorithm that actually achieves this level of compression? I know of several papers that quote impressive compression ratios with genomic data (Christley et al claim to have compressed Jim Watson’s genome down ‘to a mere 4MB, small enough to be sent as an email attachment’), but when you read the small print it’s clear that they’re doing this with respect to a known reference genome; they only have to store how an individual genome differs from the reference. This is a potentially useful approach for the storage and transmission of genomic data, but hardly reflects the real information content of an entire genome. The recent algorithms I’ve come across that compress genomic data independently of a reference still require hundreds of megabytes to store a human genome. If you know of anything better, I’d love to hear about it…
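The distinction is easy to see in miniature (an illustrative sketch with a made-up 20-base “genome”; real variant encodings like those in Christley et al. are far more involved):

```python
# Reference-based "compression" stores only the differences from a
# known reference genome -- tiny, but it presupposes the reference.
reference = list("ACGTACGTACGTACGTACGT")
individual = reference.copy()
individual[3] = "A"    # hypothetical single-nucleotide variants
individual[17] = "G"

diffs = [(i, b) for i, (a, b) in enumerate(zip(reference, individual)) if a != b]
print(diffs)  # [(3, 'A'), (17, 'G')] -- two entries stand in for the whole sequence
```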
by Brian H
Implicit information and hierarchies come into it, and the brain is nothing if not “layered”. The information content of the genome is specific to the characteristics of the chemistry used to implement it, as well.
There may be usable analogs in other “media”, but I doubt that electronics’ “millions of times” speed advantage is going to be directly translatable into neural network implementations.
by Xartec
@ Brian H: Why even doubt that? Try running a neural network on a 10-year-old computer and compare it to running it on a computer built in 2010. Not to mention the advances MPP will bring.
by jzayner
Ray, not to be mean, but you really have no idea what you are talking about when you talk about the genome. It is more than just a bunch of “bytes.” The genome may be composed of ~3 billion base pairs, and though you and others might think that most of these base pairs do nothing, you are completely wrong. There is no “junk” DNA: regions of the genome that do not code for proteins can be involved in complex regulatory mechanisms or be sites for non-coding RNAs. DNA does not just turn into a protein; in fact, there are so many complicated mechanisms regulating transcription and translation that it astounds me how you plan to store this information. DNA can be chemically modified, e.g. by methylation, which affects transcription. The 3D arrangement of DNA can be changed to modify transcription. RNA transcripts can be alternatively spliced. Protein interactions with DNA are very complex, and these interactions can be changed by chemical modification, e.g. histone modification. Many of these things have also been found to be heritable. How do you plan on encoding all this information? I think you are in waaaayyy over your head.
The way neurons function is through proteins. How do these proteins work? What is their expression pattern in each neuron? What ligands affect their function? If we do not even understand these basic questions, how do you expect us to reverse-engineer the brain?
by joe
Actually, jzayner, you are waaaaayyy over your head. You don’t understand the basic premise here. No one is suggesting you model a brain in software by emulating the processes of DNA, transcription, protein building, etc., or even molecular interactions. Where did you get this idea? It sure isn’t written anywhere in this response by Ray. I’d like to know where you got this idea from. You sound like a chemistry student who doesn’t know anything about the basic concepts of information theory, or networks, or software, or really anything about the subject being discussed here. We aren’t talking about modeling a chemistry set.
The point being made by Ray is that the *algorithmic functioning* of the brain is what can be reverse-engineered. Obviously not all processes in the brain are critical to its information-processing components; a huge part of what’s going on there is related to supporting life and the structure of the neurons themselves.
A neuron does what a neuron does: it fires after a certain threshold is reached in its inputs. Does it matter that these inputs are mediated by proteins? Did anyone say that was important? Why can’t you build a software version of a neuron that behaves *exactly* like a human neuron? You know they’ve already done this, right? They have. The behavior of neurons has been deeply modeled mathematically, all the way down to the non-linear activation potentials of ion channels and such.
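For readers who haven’t seen one, the textbook abstraction being gestured at here looks something like this leaky integrate-and-fire sketch (a deliberately minimal model, nothing like a full ion-channel simulation):

```python
def simulate_lif(inputs, threshold=1.0, leak=0.9):
    """Leaky integrate-and-fire: accumulate input, leak between steps,
    emit a spike and reset when the threshold is crossed."""
    v, spikes = 0.0, []
    for t, current in enumerate(inputs):
        v = v * leak + current    # leaky integration of input current
        if v >= threshold:
            spikes.append(t)      # fire...
            v = 0.0               # ...and reset the membrane potential
    return spikes

# Weak, steady input accumulates until the threshold is crossed.
print(simulate_lif([0.4] * 10))   # [2, 5, 8]
```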
Obviously the principles of neural networks work; that’s how we have computers that can recognize faces. The state of the art is breathtaking; look at the Blue Brain Project: http://bluebrain.epfl.ch/
Let’s read and understand the premise here before jumping all over it. Everyone here should read the Singularity book. Love it or hate it, you should at least understand the argument’s premise before arguing about it.
by f1r3br4nd
“Obviously not all processes in the brain are critical to its information-processing components; a huge part of what’s going on there is related to supporting life and the structure of the neurons themselves.”
…and how do you propose to distinguish which brain process is which? And how will you know you were right? It’s one thing if all you want is *a* sentience. Quite another if you’re trying to replicate *your* sentience, or that of a loved one. And this is before we even get to the problem of continuity of consciousness.
by joe
We know because they are fairly obvious, and this is a point not even worth discussing if people were up to date on the state of the art here. We have reverse-engineered several layers of the brain’s visual cortex, as well as created artificial versions of the brain’s audio-processing functions, based on neural reverse engineering.
This is fact. It can be done, and it will continue to be done. Humanity is reverse-engineering sections of the brain right now, and we will continue to do it. Researchers see the same neural firing patterns in their models that they see in a human brain. Get religion about this, people: it’s reality, not fiction. Your debates based on not even being able to properly model a neuron were proved wrong at least 10 years ago.
Here is the ultimate point: the problem of brain scanning and reverse engineering is one of an informational process. The big obstacles are speed and resolution, but as I have said before, we have been able to work around some of those issues and have successfully modeled the human auditory system (i.e., computers can recognize many different conversations in a crowded room of many speakers). Here’s the real deal: you can build this stuff based on these models without understanding how it works (even though it does), and the teams then have to spend a lot longer trying to understand *why* it works by playing with their model. But the models do work.
We have the know-how to model what we see; the problem is seeing it on size and time scales that make modeling possible, and this is Ray’s entire point: scanning technology improves iteratively, just like all of technology, and will improve geometrically over time, meaning we should be able to see and scan the firing of every neuron in a living brain in real time by the mid-2020s.
Based on computing power at that time, a $2000 laptop will have the raw MIPS necessary to simulate a human brain. There are a number of different methods for estimating how many CPS are required to simulate a human brain, but it has been shown by Ray, who has taken predictions from many different angles (and from many different people), that even if these predictions are several orders of magnitude off, the nature of exponential growth means this pushes the time frames out by only 5 to 10 years.
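The arithmetic behind that “5 to 10 years” claim is simple (assuming, as Ray does, that effective price-performance doubles roughly every year; a slower doubling time scales the delay accordingly):

```python
import math

# If the required capacity was underestimated by a factor of 1000,
# exponential growth absorbs the error in ~10 doublings.
error_factor = 1_000
doubling_time_years = 1.0   # assumed doubling time

extra_years = math.log2(error_factor) * doubling_time_years
print(round(extra_years, 1))  # 10.0 -- three orders of magnitude cost about a decade
```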
Will this ultimately allow us to simulate a brain that is conscious? That is the million dollar question and depending on your point of view is totally possible, or totally impossible. I’ll leave my opinion to myself.
Even if you don’t want to believe this, I really wish people would actually debate the facts of the argument. If you didn’t even know we are already building functioning artificial neurons (you know they have actually replaced real neurons with semiconductor based ones to see if they worked like the biological ones and they did) then why are you even posting a comment here?
by DWCrmcm
“The big obstacles are speed and resolution, but as I have said before, we have been able to work around some of those issues and have successfully modeled the human auditory system (i.e., computers can recognize many different conversations in a crowded room of many speakers). Here’s the real deal: you can build this stuff based on these models without understanding how it works (even though it does), and the teams then have to spend a lot longer trying to understand *why* it works by playing with their model. But the models do work.”
They may work up to a point, but they are fatally flawed.
1) They are programmed.
Design and programming are causally unrelated.
2) They are compiled.
They are translations, and they are devoid of evaluation and context.
3) They are devoid of real concurrence.
There is no CPU in life.
In The Rational (aka the universe), every granule is a clock.
The closer you think you are the greater the correction will be.
by greylander
jzayner,
Ray did not leave out the non-protein-coding regions. He is considering the *entire* genome. The fact that he talks about what it can be compressed to does not change this. He is talking about *lossless* compression — no information lost. You need a better understanding of compression and information theory.
As to methylation and other epigenetic effects: it is the DNA itself that codes for how and when these epigenetic effects occur. So the information regarding how these things work is already in the genome.
Since Ray makes clear that he is not proposing simulating the brain by simulating its development from the genome, your question of how he will encode all the protein interactions and epigenetic factors is irrelevant to the discussion. The genome is only mentioned as one way to estimate, or put an upper bound on, the complexity (information content) of any description (simulation program) of the fundamental underlying processes of the brain.
“The way neurons function is through proteins, how do these proteins work? What is their expression pattern like in each neuron? What ligands effect their function? If we do not even understand these basic questions how do you expect us to reverse engineer the brain?”
We do it by looking at the brain — by looking at neurons, poking, prodding, and finding out how they work. You see this as an obstacle because you think we can only do this slowly and will continue to do it slowly. You under-appreciate the degree to which lab processes will be automated in the very near future, enabling the gathering of huge amounts of data about what is going on at the microscopic/molecular scale. Also consider the vast improvements to imaging techniques that will be made over the next two decades. Getting the information you ask about will not be as hard as you think it is.
by f1r3br4nd
“You under-appreciate the degree to which lab processes will be automated in the very near future, enabling the gathering of huge amounts of data about what is going on at the microscopic/molecular scale. Also consider the vast improvements to imaging techniques that will be made over the next two decades. Getting the information you ask about will not be as hard as you think it is.”
What practical advantage is there to him starting to believe the problem is easier than he currently believes it to be? Will it make him work harder? Will it make him more cautious about what he promises the lay public?
I hate to say it, but it looks like a number of people here are confusing wanting something really badly with expecting that something to actually happen. That is irrational and counter-productive.
by Xartec
Not only brief, but you oversimplified it as well. A working (biological) brain is ‘built’ in less than a year (less than 9 months for humans), and training a non-biological brain obviously doesn’t need to take years if it’s running on a system many, many times faster than our own.
You might want to read the article again and watch closely for that second underlined portion. The part where you consider the brain to become functional won’t take years; it won’t even take months or weeks. Once we’ve got the basics, we’ll be able to teach it faster than any biological brain can learn to say “mama”.
I don’t think anyone will suggest a functional brain is just the unpackaged instructions of the genes. The genome argument does, however, imply a constraint on the maximum complexity and size of Brain OS 1.0.
by Opcn
@xartec, you think interaction begins at birth? I would posit that the brain is probably building its own novel solutions to test problems from the beginning, and that there is a continual process of interaction during development. You may be able to simulate the brain stem of a 6-week-old fetus with a reasonable amount of code, but that is not a human brain.
The bullet point from what I posted was that there is no such thing as a human brain before interaction with its environment. We are not genetically programmed automatons that then learn to be in our environment; learning is part of the building process. A human brain before interaction with its environment is like a house before you have walls or a roof: not much of a house at all.
by waltinseattle
xartec proclaims: “A working (biological) brain is ‘built’ in less than a year (less than 9 months for humans) and training a non-biological brain obviously doesn’t need to take years.” Ah, perhaps you have a less stringent definition of “working” than I do. In any case, we understand that brains are still developing past the teen years, and precisely in the information-handling portions: the portions that tell us how to behave, how to proceed in life, how to act human — to be wise, in short. This is not all set forth at birth, nor is the frame of interactivity between cells and outcomes/feedback reducible to biology; it depends more on the patterns that develop in the higher cortex.
Teaching the brain to be a brain is the process of trimming the expressed patterns to what works for the functional organism. That is rather beyond the concerns of condensing code lines; it is more a global feedback statement of whether the patterns are useful or not. Use is external to protein synthesis. Can we describe the different developmental paths taken by a normative brain and a dyslexic brain? How can you code the model construct to mirror these massively and externally decided “utilities”? Lots of data — let’s call it history — resides in the neural patterns, not in the mechanism by which the patterns are generated.
This is beyond the posited idea of a brain OS and the question of how big or small that OS can be. I think it is a question of “so you have a brain; now how do you employ it as an intelligence?” Isn’t that the holy grail of “artificial” intelligence: to use the brain like a human brain, and not as a computer that can mimic it? “Where” does the code for teaching reside: the genome, or the big messy world of interactions, feedbacks, and “self evaluations”?
sorry to come on so off topic.
by DCWhatthe
Nobody understands the brain yet. When we DO have a nearly complete understanding of how it works and how to create one from scratch, only THEN can we claim that Kurzweil or another person understands the brain better than this or that expert. This ‘Ray Kurzweil does not understand the brain’ line is just ad hominem nonsense.
Ray seems to be saying only that the design of a one-second-old brain is based on the genome, along with the equivalent of the process of 9 months of pregnancy. That may be true. It sounds plausible.
But that’s only a newborn brain. To produce a healthy intelligence, seems to require training – those ‘Darwinian processes’ which Opcn referred to. Unpredictable external circumstances play a large role in that training, at least in our lives.
Two of the questions that pop up:
1) Should those external circumstances be thought of as some of the sufficient primitive components that will produce a reasonably intelligent, healthy brain? That seems obvious, but determining those circumstances, and weeding out the ones which produce no value, is another obstacle altogether.
or
2) Is the real challenge to eventually discover and build a genome, which produces a robust brain that can handle a variety of challenging external circumstances, and become stronger and smarter as a result, without shutting down or becoming Hitler?
by kdeloske
This is typical. Any “expert” loves to point out why a theory about their field is wrong. For example:
Pizza Theorist – “I predict within 5 years there will be double the amount of toppings on pizza.”
Pizza Chef – “What I do is very complex and your prediction shows your limited understanding. Doubling the toppings forces you to double the baking time and will obviously result in a burned crust. You obviously know nothing about pizza and I am the only person brilliant enough to understand its complexity.”
Pizza theorist – “I believe methods for precooking toppings or varying baking temperature will allow for additional toppings and unburned crust.”
Pizza Chef – “Precooking toppings will destroy the overall flavor and varying the baking temperature will interfere with the crust rising appropriately. No innovation should ever be expected because new ideas and different perspectives are useless when it comes to something as complex as pizza.”
by Jay27
@Dan Kaminsky
I don’t think Ray needs to be told what is bunk and what is not. I’m guessing that, if you would ask nicely, Ray would provide you with a perfectly rational explanation for everything.
People throw insults at Ray, even though this is completely unnecessary. You could simply ask questions. And, provided Ray really doesn’t know what he’s talking about as you suspect, he would quickly paint himself into a corner.
At no time does a person ever need to insult another person. Questions are all it takes.
You can avoid frustrating other people, and making a fool of yourself, by simply asking questions.
So let me be the first to politely ask the question:
Ray, can you explain to us why you think there is a correlation between a number of bytes and a number of lines of code?
by BoomWav
He only says that many people overestimate the complexity of reverse-engineering the brain, pointing out that even with all its inefficiency, the body creates it from DNA, data that contains a lot of redundant information. It doesn’t prove anything; he’s just pointing it out.
by dragger2k
Ray may be correct in the future ability to reverse engineer and reproduce the human brain…
but that is not the problem…
the problem is that a human brain does not equal a human mind…
the mind does not exist in any physical location, and it has nowhere near been proven that the mind is a property or result of the brain and its activity alone…
by Night Jaguar
“For starters, I said that we would be able to reverse-engineer the brain sufficiently to understand its basic principles of operation within two decades, not one decade, as Myers reports.”
In this video:
http://singularityhub.com/2010/01/25/kurzweil-discusses-the-future-of-brain-computer-interfaces-at-x-prize-lab-video/
at 9:50 you say:
“I believe we will actually understand the human brain. We will reverse engineer it by 2019.”
Was that an error?
by Sharkey
Ray: “It is true that the information in the genome goes through a complex route to create a brain, but the information in the genome constrains the amount of information in the brain prior to the brain’s interaction with its environment.”
You’re missing PZ’s point: the epigenetic factors in the development environment are complicated and can’t be handwaved away. Brains don’t develop in a vacuum; brains develop in a complex environment of proteins that help pattern the emerging organ in complicated ways.
In terms of (Kolmogorov) information theory: you are counting the bits of the program, but ignoring the bits used to describe the computer the program runs upon. By Church-Turing, it is likely that the epigenetic “machine” used to develop the brain is isomorphic to the Turing machine, but my guess is the description of the brain’s gestational environment requires (at least) as many bits as the original genome.
by tim333
The genome argument seems good to me.
There’s an article on Wikipedia on information theory for those who don’t understand it:
http://en.wikipedia.org/wiki/Information_theory
by Dan Kaminsky
Ray,
You may have many other arguments that hold water re: the reverse engineering of the brain — our ability to image in-vivo, our ability to use smarter substrates, the underlying redundancy of the brain, etc.
But your genomic “argument from Information Theory” is bunk.
We’ve got a couple problems here. First, we’re just not sure how much of the genome is actually used as an information source. So called “junk DNA” keeps being found to be not exactly useless, and epigenetic methylation is a tremendous modulator of gene expression. Nature has never had to conserve for complexity, and so it simply hasn’t.
Second, and nastier, is the issue of opcode complexity. Even if you constrain the genome to fifty million bytes, you cannot assume there is a trivial mapping between bytes and lines of code. Information theory will not help you here; a fifty-million-byte message has 2^400,000,000 possible interpretations. An equivalent statement would be “How many lines of code would it take to render this frame from this movie?” Er, it rather depends on the precise nature of the frame; from a white screen to line art to Toy Story 1 to Toy Story 3 to a fully photorealistic scene, the answer differs by many, many orders of magnitude.
The final issue is that while nobody is suggesting a reimplementation of the brain has to include all the legacy chemical mechanisms, there is an enormous number of message-carrying chemicals and impulses in there, and we just don’t know what they all are or mean. Would consciousness still exist without all the various receptors and neurotransmitters?
It’s a mess in there.
by greylander
Dan,
You are overlooking some important points. The epigenetic processes you mention are almost certainly coded for in the genome. It is genes (in the generalized Dawkinsian sense) which tell the machinery of the cell what to turn on and off, so it is likely all or mostly there in the DNA. This does not completely rule out important information in the cytoplasm of the fertilized egg which is not already in the genome, but I’ll wait to address why I don’t think there is much unless you question this.
As to all that “junk” DNA, Ray has included it in his estimate. He is estimating the information content of the entire genome, not just the parts we already know do something useful. The estimate is based on data compression. With some important caveats, the information content of a string of symbols (which can always be re-coded into binary without loss of information) is equal to the smallest size to which the data can be losslessly compressed. The important caveats are these: you must include the size of the “decompressing program”, and you must also be sure that the machine on which the decompressing program runs does not hide any special knowledge about the data being compressed. It is safe to say that normal computers contain no special knowledge of the human genome, and the minimum code for the decompression is only a few K. We already have the code of the genome and have compressed it.
Ergo, we know that the measure of information in the human genome really is around 50MB. This is a rock solid fact.
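The compression argument can be seen in miniature with any off-the-shelf lossless compressor. The following is a toy demonstration of my own, not part of greylander’s comment: redundant data compresses to a tiny fraction of its raw size, while incompressible random data of the same length does not.

```python
import os
import zlib

# Two inputs of identical raw size: one massively redundant, one pure noise.
redundant = b"ACGT" * 250_000        # 1,000,000 bytes of repetition
random_data = os.urandom(1_000_000)  # 1,000,000 bytes of random noise

small = len(zlib.compress(redundant, 9))    # shrinks to a few kilobytes
large = len(zlib.compress(random_data, 9))  # stays at roughly full size
print(small, large)
```

The compressed size of the redundant input (plus the fixed, few-KB size of the decompressor) upper-bounds its information content, which is exactly the logic behind the ~50MB genome figure.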
Next: while it is obvious that what an organism does (growing, learning, eating, composing symphonies) depends intricately on its interaction with the environment, and that a great deal of information is added to an organism as it grows, develops, and learns, it is not evident that interaction with the environment significantly alters the fundamental processes by which an organism incorporates information from the environment. In this sense, environmental influences are data which the “fundamental processes” manipulate and store — this is even true if the stored data are themselves new learned processes, like how to play a piano. The fundamental processes have to do with such things as how and why synapses grow, change, and shrink, or how the concentration of one neurotransmitter influences the rate at which the concentration of another neurotransmitter increases or decreases within a synapse. Does environmental input significantly alter these fundamental processes? That is a claim requiring some serious evidence. (Trotting out mind-altering drugs is a red herring: no one suggests that at the 20-year mark we will model the effect of every chemical you could possibly introduce into the brain.)
If those fundamental processes are governed by the genome, then the actual information content of any description of those processes is less than 50MB (though of course it could be coded less efficiently). The program to emulate these processes will just be a number-crunching program, basically amounting to numerical integration of a bunch of partial differential equations. We can write number-crunching programs very efficiently, in both human-readable and machine-readable form. For the binary version, “efficiently” means that the size of the program will not be much larger than its actual information content. From the arguments above, the information content will not be more than about 50MB, so the binary program will also be around that size. Finally, estimating “lines of code” is just a rough rule-of-thumb exercise. There is nothing fancy about writing code for straight-up number crunching, so supposing somewhere in the range of 10 to 100 bytes of binary per line of human-readable code is not unreasonable. This all boils down to a ballpark figure of a million lines of code: an off-the-cuff, order-of-magnitude estimate. If it turned out to be a billion lines of code, or even 100 million, that would represent a big mistake in this estimate, but anywhere from 100K to 10M lines is covered here.
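To make “number crunching” concrete, here is a minimal sketch of my own. The equation is made up purely for illustration and is not a real synaptic model: one concentration relaxes toward a level set by another, integrated with a fixed-step Euler method — exactly the kind of unglamorous code such a simulation would be built from.

```python
def simulate(a0=0.0, b=1.0, k=0.5, dt=0.01, steps=1000):
    """Euler-integrate da/dt = k*(b - a): concentration a relaxes toward b.
    (Toy rate equation for illustration only.)"""
    a = a0
    for _ in range(steps):
        a += k * (b - a) * dt
    return a

print(simulate())  # after 10 time units, a has nearly reached b = 1.0
```

A real model would couple thousands of such equations per synapse, but each one is still just a short loop of arithmetic, which is why the code size can stay small even when the runtime data is enormous.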
We do know that it is a “mess in there”. That’s a given. Reverse-engineering all the needed differential equations and related functional rules is certainly not trivial. With the methods presently available to neurobiologists, all they can do is pick at this problem; just discovering what amounts to a new variable is grounds for publishing a paper (such as discovering a new type of neurotransmitter, the concentrations of which at each synapse would be a variable for the equations to crunch). What the pessimists do not account for is the degree to which automation of lab processes and other data-gathering is going to explode the data we have available about these processes.
Google “lab-on-a-chip” to get a sense of the state of the art and what prototypes are already out there. (Lab-on-a-chip does not mean everything a lab does will be on a chip; it means the ability to automate huge numbers of experiments on chemical reactions and microbial organisms, or on microscopic bits of larger organisms, such as individual neurons or small groups of them.) With this kind of data, finding the right equations — the “rules” that govern how the brain learns and adapts in response to environmental experience — will be trivial.
There is no proof of the 20-year prediction other than waiting 20 years. Most skeptics scoff at it because they think it is either impossible or a matter of centuries or millennia. So even if it takes 50 or 80 years to achieve what Ray has predicted in 20, the sneering naysayers will have been shown quite wrong.
Who is right really just depends on who has a better intuitive grasp of the “big picture” of how fast technology is advancing and how complex the problem really is.
by mmmmhack
“However the program to emulate these processes will just be a number crunching program — basically amounting numerical integration of a bunch of partial differential equations. We can write number crunching programs very efficiently, both in human readable and machine readable form.”
Not to simulate the brain we can’t. In fact, we can barely do this to predict the folding of very simple proteins. Scaling that up to simulate macromolecules such as ribosomes is currently beyond our capabilities. So talking about simulating a macroscopic structure such as the human brain is ludicrously beyond our current capabilities. The limitation is not computer hardware, the limitation is our current intellectual capabilities for writing software to simulate biological processes on massively parallel computer systems.
by rcmoore
“An equivalent statement would be “How many lines of code would it take to render this frame from this movie”. Er, it rather depends on the precise nature of the frame, from a white screen to line art to Toy Story 1 to Toy Story 3 to a fully photorealistic scene, the answer differs by many, many orders of magnitude.”
Are you confusing the algorithm which does the rendering with the data to be rendered? Clearly, the rendering algorithm represents a fixed (or at least maximum) number of lines of code that must be executed, which means a fixed number of opcodes. What may change is the number of times the opcodes need to be repeated to process more complex data sets, but this in no way increases the complexity of the process, just the processing time.
And this I think is exactly what Ray is saying — once we get the algorithm, all the perceived complexity of the brain goes away. And I agree with Ray — due to the exponential increase in the information we are gathering about the brain, the potential for working out its algorithmic structure is increasing exponentially.
Once worked out, the problem then becomes one of processing power, an area that clearly seems feasible within a few decades.
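The algorithm-versus-data distinction above can be made concrete with a trivial sketch (my own example; the per-pixel operation is a stand-in for real rendering work): the renderer’s code never grows with scene complexity — only its input and running time do.

```python
def render(scene):
    """One fixed algorithm; cost scales with the data, never with new code."""
    return [2 * pixel for pixel in scene]  # stand-in for per-element work

tiny = render([1] * 10)          # simple frame: few repetitions
huge = render([1] * 1_000_000)   # complex frame: same opcodes, many repetitions
print(len(tiny), len(huge))
```

The same few lines handle both frames; a photorealistic scene takes longer, not more code, which is rcmoore’s point about the brain’s algorithmic core.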
Ray’s use of the genome is an excellent example — original estimates were off because they did not fully account for the benefits of understanding the algorithmic properties of the genome — they merely looked at the amount of data.
by CoolFUNKleMAN
The twinkly lights in the sky at night appear to be a mess to the non-astronomer…
by Opcn
I sent an e-mail about it and was directed here by one of your capable employees, but I think the genome argument is a fatally flawed one. I’ll be brief here: a functional brain is more than just the unpackaged instructions of the genes; were that the case, a working brain could be built in a couple of years. Instead, a brain takes years to build, because it takes years to run all the trials and invent new pathways by an iterative Darwinian process.
by rcmoore
“Instead a brain takes years to build, because it takes years to run all the trials and invent new pathways by an iterative Darwinian process.”
Like Dan Kaminsky, you seem to be confusing the algorithm with the data itself. For example, if I possess an algorithm that produces a sequence of genetic code that when translated gives rise to a specific desired protein, I can use that algorithm. I can then potentially modify the algorithm to produce different desired proteins. I can in effect, in a small amount of time reproduce the net effect of millions of years of evolution by natural selection.
The brain is the same: if I have the algorithm, and I know the goal (an artificial brain capable of performing some predefined task), I can in theory do in a relatively short time what natural selection did by an undirected process of trial and error.
Ray is clear in stating that he understands that the functional brain is “more than just the unpackaged instructions of the genes”; he highlights the need for data (from the environment, for example) quite clearly. But once the algorithm is implemented, the data becomes a problem of processing time and power, which will continue to increase exponentially.
The confusion on Ray’s position here is quite baffling — perhaps the problem is that detailed knowledge of both biology and computer science is required to understand the issues.
by taupring
I believe what Ray is getting at is that all of the information needed to create the “program generator” is contained within the genome, and that in its compressed form this amounts to ~50MB. The genome can be seen as the instruction set for the program generator. Once executed, the basic circuits of the brain are instantiated. The nature of program generators is that extremely complex code can be generated from much simpler instructions. The issue with not being able to create a brain from the genome instructions is more that we do not yet understand the macro language of the program generator (PG). For instance, there is probably the concept of a loop, which is used to create massive numbers of identical neuronal circuits. There could be if..then constructs. We just don’t know enough yet about how the PG works. But that doesn’t change the fact that somehow all of the complexity of the initial wiring of the brain is passed on by way of the genome. This doesn’t mean that wiring constitutes a fully developed, intelligent being, but it does instantiate the basic neuronal wiring needed to begin the process of learning and adapting.
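The program-generator idea can be sketched in a few lines. This is a toy of my own devising (the macro language and circuit names are invented for illustration): a compact instruction set with a loop construct expands into a wiring description far larger than the instructions themselves.

```python
def expand(instructions):
    """Expand a tiny macro language: ('repeat', n, unit) emits unit n times,
    ('emit', unit) emits a single unit. (Invented for illustration.)"""
    wiring = []
    for op in instructions:
        if op[0] == "repeat":
            _, n, unit = op
            wiring.extend([unit] * n)
        else:
            wiring.append(op[1])
    return wiring

# A 2-instruction "genome" expanding into 100,001 circuits:
program = [("repeat", 100_000, "cortical-column"), ("emit", "brainstem")]
circuits = expand(program)
print(len(circuits))  # 100001
```

The expanded wiring dwarfs its generator, which is why a ~50MB instruction set is not inconsistent with a vastly more intricate instantiated brain.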