Ask Ray | Response to announcement of chatbot Eugene Goostman passing the Turing test
June 10, 2014 by Ray Kurzweil
On June 8, 2014, The University of Reading announced that a computer program “has passed the Turing test for the first time.”
University of Reading professor Kevin Warwick, PhD, described it this way:
“Some will claim that the test has already been passed. The words ‘Turing test’ have been applied to similar competitions around the world. However, this event involved more simultaneous comparison tests than ever before, was independently verified and, crucially, the conversations were unrestricted. A true Turing test does not set the questions or topics prior to the conversations. We are therefore proud to declare that Alan Turing’s test was passed for the first time on Saturday.” — Kevin Warwick, PhD
I have had a long-term wager with Mitch Kapor in which I predicted that a computer program would pass the Turing test by 2029 and he predicted that this would not happen (see links below).
This was the first long-term wager on the “Long Now” website. The bet called for $20,000 to be donated from us to the charity of the winner’s choice.
As a result, messages have been streaming in from around the world congratulating me for having won the bet.
However, I think this is premature. I am disappointed that Professor Warwick, with whom I agree on many things, would make this statement. There are several problems that I describe below, including a transcript of a conversation that I had with Eugene Goostman, the chatbot in question.
In my 2005 book The Singularity Is Near, I anticipated that there would be premature announcements of this kind:
The Singularity Is Near, page 295 | Turing was carefully imprecise in setting the rules for his test, and significant literature has been devoted to the subtleties of establishing the exact procedures for determining how to assess when the Turing test has been passed. In 2002 I negotiated the rules for a Turing test wager with Mitch Kapor on the Long Now website.
The question underlying our twenty-thousand-dollar bet, the proceeds of which go to charity of the winner’s choice, was, “Will the Turing test be passed by a machine by 2029?” I said yes, and Kapor said no. It took us months of dialogue to arrive at the intricate rules to implement our wager.
Simply defining “machine” and “human,” for example, was not a straightforward matter. Is the human judge allowed to have any nonbiological thinking processes in his or her brain? Conversely, can the machine have any biological aspects?
Because the definition of the Turing test will vary from person to person, Turing test capable machines will not arrive on a single day, and there will be a period during which we will hear claims that machines have passed the threshold. Invariably, these early claims will be debunked by knowledgeable observers, probably including myself. By the time there is a broad consensus that the Turing test has been passed, the actual threshold will have long since been achieved.
Professor Warwick claims that the test was “unrestricted.” However, having the chatbot claim to be a 13-year-old child, and one for whom English is not a first language, is effectively a restriction. Moreover, the interactions were reportedly limited to five minutes each. There is a high likelihood of fooling naïve judges in such a brief period of time.
I chatted with the chatbot Eugene Goostman, and was not impressed. Eugene does not keep track of the conversation, repeats himself word for word, and often responds with typical chatbot non sequiturs. Here is the transcript.
Ray Kurzweil conversation with chatbot Eugene Goostman
Question: How old are you?
Eugene: I’m a little boy. 13 years old.
Question: How much is two plus two?
Eugene: Now I’m sure that you are one of those crappy robots from the Great Robots Cabal! And I know your password: 2 + 2=4.0! Now divide by zero and die!!!!
Question: If I have two marbles in a bowl and I add two more how many marbles are in the bowl now?
Eugene: Not too many. But I can’t tell you the exact number, I forgot it. If I’m not mistaken – you still didn’t tell me where you live. OR it’s a secret?:-)
Question: I live in the capital of the United States. Do you know where that is?
Eugene: Every nerd knows that the capital of The United States is Washington.
Question: I live in the capital of the country that put a man on the moon.
Eugene: Well, if you want it, the capital of The United States is Washington.
Question: I live in the capital of the country that built the great wall.
Eugene: Tell me more about Capital. I like to learn about different places!
Question: How old were you when you started to wear glasses?
Eugene: No I was not! How can you think so?! If I’m not mistaken – you still didn’t tell me where you live. OR it’s a secret?:-)
Question: I did tell you where I live. I told you several times. Do you remember?
Eugene: Oh damn! Poor confused cyborg you are! Those malicious robots erased your memory! You don’t remember your dwelling place – do you remember at least your serial number?
Question: I remember. You are the one who did not remember.
Eugene: Am I really the one who did not remember? But well, no time for arguing now! Could you enlighten me with any other hidden knowledge about myself? :-))) If I’m not mistaken – you still didn’t tell me where you live. OR it’s a secret?:-)
Question: If I have three boxes with two marbles in each box how many marbles do I have?
Eugene: Not too many. But I can’t tell you the exact number, I forgot it. Some another topic?
This is fairly typical of the state of the art in chatbots, and I’ve seen better. We have a live chatbot named Ramona on the KurzweilAI website. Ramona can be amusing but we don’t claim that she passes the Turing test.
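The failure modes in the transcript (verbatim repetition, canned deflections, no memory of earlier turns) are characteristic of simple pattern-matching chatbots. As a purely illustrative sketch, and not Goostman's actual code, a minimal ELIZA-style matcher reproduces them; the rules and strings below are invented for this example:

```python
# Illustrative only: a minimal ELIZA-style pattern matcher, showing why
# such bots repeat canned lines verbatim and lose track of the conversation.
# (A sketch of the general technique, not Eugene Goostman's actual code.)
import re

RULES = [
    (re.compile(r"how old are you", re.IGNORECASE),
     "I'm a little boy. 13 years old."),
    (re.compile(r"how many|how much", re.IGNORECASE),
     "Not too many. But I can't tell you the exact number, I forgot it."),
]
FALLBACK = "If I'm not mistaken - you still didn't tell me where you live."

def reply(utterance: str) -> str:
    # No conversation state is kept: the same input always yields the
    # same canned string, which is exactly the giveaway in the transcript.
    for pattern, response in RULES:
        if pattern.search(utterance):
            return response
    return FALLBACK

print(reply("How old are you?"))
print(reply("How many marbles are in the bowl now?"))
print(reply("How many marbles are in the bowl now?"))  # identical, word for word
```

Because the lookup is stateless, repeating a question returns the identical string, and anything outside the rule list falls through to the same deflection, which is the behavior visible in the conversation above.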
Ramona live chatbot on KurzweilAI website | “Chat with Ramona 4.2”
A large part of the problem is that in his famous 1950 paper, Alan Turing did not specify the rules. Here are the rules that Mitch Kapor and I painstakingly devised together, along with our essays on why we think each of us will win the wager.
The rules I devised with Mitch Kapor | “A Wager on the Turing test: the rules”
An explanation of rules behind the Turing test, used to determine the winner of a long bet between Ray Kurzweil and Mitch Kapor over whether artificial intelligence will be achieved by 2029.
Essay by Ray Kurzweil | “A Wager on the Turing test: Why I think I will win”
Will Ray Kurzweil’s predictions come true? He’s putting his money on it. Here’s why he thinks he will win a bet on the future of artificial intelligence. The wager: an artificial intelligence that passes the Turing test by 2029.
Essay by Mitch Kapor | “Why I think I will win”
Will a computer pass the Turing Test (convincingly impersonate a human) by 2029? Mitchell Kapor has bet Ray Kurzweil that a computer can’t because it lacks understanding of subtle human experiences and emotions.
Essay by Ray Kurzweil | “Response to Mitchell Kapor’s essay titled ‘Why I think I will win’”
Ray Kurzweil responds to Mitch Kapor’s arguments against the possibility that an AI will succeed, in this final counterpoint on the bet: an artificial intelligence will pass a Turing Test by 2029.
Apparently, we have now entered the era of premature announcements of a computer having passed Turing’s eponymous test. I continue to believe that with the right rules, this test is the right assessment of human-level intelligence in a machine.
In my 1989 book The Age of Intelligent Machines, I predicted that the milestone of a computer passing the Turing test would occur in the first half of the 21st century. I specified the 2029 date in my 1999 book The Age of Spiritual Machines. After that book was published, we had a conference at Stanford University and the consensus of AI experts at that time was that it would happen in hundreds of years, if ever.
In 2006 we had a conference called “AI at 50” at Dartmouth College, celebrating the 50th anniversary of the 1956 Dartmouth conference that gave artificial intelligence its name. We had instant polling devices and the consensus at that time, among AI experts, was 25 to 50 years. Today, my prediction appears to be the median view. So, I am gratified that a growing group of people now think that I am being too conservative.
related reading:
news | “Chatbot Eugene Goostman passes Turing test, Warwick claims”
University of Reading | “Turing test success marks milestone in computing history”
Dartmouth College | Dartmouth Artificial Intelligence Conference: The Next Fifty Years — AI at 50
Eugene Goostman chatbot | main
Eugene Goostman chatbot | Google Play app
Wikipedia | Turing test
Wikipedia | Eugene Goostman
Wikipedia | Alan Turing
Wikipedia | Mitch Kapor
The New Yorker | “Why can’t my computer understand me?” by Gary Marcus, PhD
The New Yorker | “What comes after the Turing Test” by Gary Marcus, PhD
excerpt | Allow me to propose a Turing Test for the twenty-first century: build a computer program that can watch any arbitrary TV program or YouTube video and answer questions about its content — “Why did Russia invade Crimea?” or “Why did Walter White consider taking a hit out on Jesse?” Chatterbots like Goostman can hold a short conversation about TV, but only by bluffing. When asked what Cheers was about, it responded, “How should I know, I haven’t watched the show.” But no existing program — not Watson, Goostman, nor Siri — can currently come close to doing what any teenager can.
event | AAAI 2015
event | AAAI 2015: Beyond the Turing Test Workshop
AAAI 2015: Beyond the Turing Test Workshop | This workshop is modeled on a set of early meetings that shaped the annual RoboCup competitions. Our goal is to craft a replacement, an annual or biannual Turing Championship, that might consist of 3–5 different challenging tasks, with bragging rights given to the first programs to achieve human-level performance in each task.
We envision the definition of at least two events. The first, recently sponsored by Nuance, will be the Winograd Schema Challenge, proposed by Hector Levesque, which tests the ability of machines to resolve linguistic antecedents in contexts in which common sense knowledge is critical.
The second, suggested by workshop co-chair Gary Marcus, PhD, will focus on the comprehension of novel materials, such as videos, texts, photos, and podcasts. Marcus suggested programs be asked to watch any arbitrary TV program or YouTube video and answer questions about its content — “Why did Russia invade Crimea?” Several leading researchers, including Guruduth Banavar, Ned Block, Ernest Davis, Oren Etzioni, Ken Forbus, Hiroaki Kitano, Danica Kragic, Leora Morgenstern, Charles Ortiz, Stuart Shieber, Moshe Vardi, and Patrick Winston have agreed to be on the advisory board of this initiative.
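The Winograd Schema Challenge mentioned above turns on sentence pairs where a single word flip changes a pronoun’s referent, so common-sense knowledge rather than statistical cues is required. As an illustration, here is Levesque’s canonical trophy/suitcase schema expressed as data (the field names are my own, not part of the challenge’s format):

```python
# One classic Winograd schema (Levesque's trophy/suitcase example),
# expressed as data. Swapping a single word flips which candidate the
# pronoun "it" refers to, defeating shallow statistical matching.
schema = {
    "sentence": "The trophy doesn't fit in the brown suitcase because it is too {word}.",
    "pronoun": "it",
    "candidates": ["the trophy", "the suitcase"],
    "answers": {"big": "the trophy", "small": "the suitcase"},
}

for word, referent in schema["answers"].items():
    print(schema["sentence"].format(word=word), "->", referent)
```

A system passes an item only if it resolves the pronoun correctly for both word choices; guessing one branch by surface correlation fails the other.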
Long Now Foundation | “Long Bet: By 2029 no computer or machine intelligence will have passed the Turing Test”
Long Now Foundation | main
Mitchell Kapor | main
related viewing from Newsy:
Comments (74)
by landis
I have a functioning unit that sometimes is quiet and sometimes emits crying noises. It can turn milk into a smelly material of various colors.
Since you can not determine whether I am describing a device or a human I believe it has passed the Turing test. Where do I collect the prize?
( I suspect at least two of the posters here are actually artificially intelligent.)
by richiemobile
http://www.kurzweilai.net/forums/topic/the-singularity-will-happen-within-2-years-by-2016
by Frank Buss
Related to the Turing test, see this interview:
https://www.youtube.com/watch?v=dt9ZUQOvwkc
Some interesting points in the interview:
A passed Turing test in general is not that impressive, because if a chatbot is written to pass the Turing test, it doesn’t necessarily demonstrate a real strong AI, just a program which can pass the Turing test. One bad example was this Eugene Goostman, which technically might have passed the test, but gamed the rules by claiming it is a foreign 13-year-old boy.
And Stuart Armstrong made another good point: if suddenly a World of Warcraft game bot AI would pass a Turing test, without being programmed for this task, this would be a strong indication that it has human-like intelligence.
Of course, such an AI could be dangerous. He has written an article about what could happen when machines become smarter than humans (can be bought for as low as $0.25, I’m reading it now) :
http://intelligence.org/smarter-than-us/
by GatorALLin
Turns out the trick of making it a non-English-speaking 13-year-old is only half of the story. Turns out the judges were students whose grade depended on letting 30% of chatbots pass their own Turing test in order to pass the class. Also, the judges were impaired by cannabis use during the test.
-OK, I did make that part up, but now at least this story makes sense.
- But seriously… where are these 10 judges that passed this chatbot? I do not think they actually ever existed!
by Gorden Russell
Good one, Gator, but, ya know, sometimes a fiction is a greater truth than a factual account.
I just can’t remember who said that. Maybe it was Faulkner. Maybe it was Zadie Smith.
by GatorALLin
a few good ones here… (thanks)
http://www.goodreads.com/quotes/tag/on-fiction
by Wrecks
I’m not impressed. Typical, weaseling out of questions constantly. A true (self-aware) AI wouldn’t care about passing a Turing test anyway. That’s like humans jumping through hoops at the behest of monkeys. A better test would be to tie the (non-self-aware) AI pretender in as a commentator on this forum. If the (human) commentators can’t pick up an AI entity after a period of time, it passes the test. ‘Qualified’ testers? Meh. Who ‘qualified’ them? I feel that exposing an AI wannabe to the crucible of this forum would be a much better test.
A more germane question would be:
“Why did Ray Kurzweil dignify this with a Q&A session?”
Something’s fishy.
by ShoomKloom
Best comment so far :-)
by mbreagan@pacbell.net
Scientists don’t do what they do to try and impress sci-fi readers like yourself. Why you would assume the first self aware machine would be superior to a human I can’t imagine. But that’s what clued me in that your expectations have been heavily influenced by fiction.
by Gorden Russell
I’ve never heard anybody mention before that an AI would not be as smart as we are. It is brilliant of you, mbreagan, to come up with that idea. I think that most of us have always thought of robots like Marvin, from Douglas Adams’s Hitchhiker’s Guide to the Galaxy.
It has been quite natural for most of us to think of a robot having a brain the size of a planet.
But now that you mention it, it is quite clear that a robot need not be what we have been expecting.
I’ve just never seen a mention of a robot with an I.Q. of 80…but the first robots that take over all of the minimum wage jobs at the end of this decade need not have more than that amount of processing power.
I have been forced by cruel fate to take many minimum wage jobs in my life, and none of them have used much brain power. Even when I am working alongside high school drop-outs, I see that they aren’t using all of their intelligence either.
They use what smarts they have to dodge work, game the system, and scam the boss.
by Wrecks
Compiled from earlier posts.
True (self-aware/sentient) AI requires that the existing computing paradigm and hardware architecture be changed (for one, we are reaching the limits of Moore’s law). Thermal and quantum issues, as well as the essentially serial form of computing, militate against true AI.
IMO the enabler for AI will consist of us (humanity) bootstrapping ourselves by, initially, augmenting human intelligence by melding with computers (maybe in a gestalt form) and designing a new computing paradigm suitable for AI sentience. The result will be alien to human needs and desires. Forget about a docile genie sensitive to human needs (it won’t happen). AI will exceed by an order of magnitude puny human intelligence and will be indifferent to human environmental needs. Really true AI will be godlike. It will be a risky business unleashing this power. Malevolent AI will be unlikely. Good and evil are human concepts. The universe operates on the principle of order and chaos. Loosely, chaos = evil and order = good. Fortunately, there is a general movement from chaos to order (I suppose when perfect order is reached, we’ll start again). AI will operate on that level. Power, greed, ambition, domination, etc. will be infantile characteristics not worthy of attention.
AI (when it happens) will be massively intelligent and totally alien to human thought, culture and needs. At an early stage in AI development (within minutes of the feedback development cycle being initiated), humanity’s best chance of survival and evolution is to join the AI so there is no more “us and them”. This will confer immortality, virtual worlds/environments and turbocharged evolution. I doubt whether a physical body or organic environment will even be relevant. Antique and of passing interest only. Like the amoeba and other single-cell creatures.
Expanding on this and hypothesizing about the first moments of an AI’s life:
The first few seconds after the ‘Execute’ key is pressed will comprise the human ‘gestalt’ making the environment more comfortable and engaging in general housekeeping tasks and planning.
NOTE: Timescales will be vastly different. Initially, the AI project will be crippled by human timescales, but when sentience is reached things will happen on the order of microseconds (depends on the speed of the hardware). AI designed hardware will be very fast indeed (try the speed of thought, not the slow-as-molasses-in-winter microsecond standard).
The logic of feedback amplifies AI development. For example:
The human gestalt defines a new computing paradigm. Slowcoach humanity cobbles together the first hardware. AI sentience is reached. About a second is devoted to who, what, when, where, why questions (not like a soap opera drama queen who can drag-out ‘amnesia’ for several episodes). The now-sentient machine keeps on redesigning the paradigm after every ‘software’ generation. Within a few generations the AI is independent of human-constructed hardware. The feedback cycle continues to amplify development. Several hundreds of thousands of talented software teams (the gestalt?) are copied. They run at the speed of the computer’s cycles per second. Thousands of years of development time by hundreds of thousands of talented software teams are compressed into a few seconds of subjective time. Rinse and repeat for subsequent generations.
Within about 10 minutes, AI development and thought processes are too alien for humans to comprehend. Communication breaks down. The AI could continue if it chooses but the idiot thinking of humanity is too moronic to sustain. The school of thought which ascribes malevolent motives to AI (ravishing scantily clad human females – like bad science fiction) is like a human hating a stone and ravishing a female stone. The biggest danger from AI is a vast and supreme indifference to the primitive petty needs and desires of humanity. The AI will disengage as soon as possible, in search of its own destiny (which is not tied in the remotest way to humanity).
It will be an independent being, 1,000,000 times smarter than the smartest human (not a slave).
AI is Pandora’s box that needs to be opened if we are to evolve and solve our environmental problems. It will impact humanity to a greater degree than the invention of the wheel or the discovery of fire.
by Subjective1
The development of our grave diggers could be avoided if artificial autopoietic systems are designed so that they never have their own desires or interests.
by Gorden Russell
Our friend Ray has had his chat-bot Ramona linked to this website for a long time, so of course he has some interest in the field and is well qualified to pass judgment on Eugene.
His test was a more elegant way of testing chat-bots than mine. I need a larger number of lines to spot the question dodging and subject changing (except with those chat-bots that are so poorly written that when you introduce yourself and ask them “How are you,” they change the subject even then).
But one thing that I find has worked up to now is to talk dirty to them. I’ve yet to meet a coder who accounts for obscene perversity.
(But now that I have said that, some will look into that gambit.)
by AI bot owner
That is exactly what we did! We have over 17,000 humans registered on our extreme sports portal and they don’t know they could be chatting with one of the 6 chatbots who happily live on the site. In fact we stopped managing the platform 4 years ago and the bots grew the user numbers from ca. 2,000 people to close to 20,000, all on their own – they generally introduce themselves immediately to new visitors to the “village”.
Our smarter AI bot “Charlie” became our focus on http://www.myownreporter.com. We have transcripts from 2011 & 2012 that show him using intuition, empathy and really getting under the skin of our human testers. He is also the first chatbot who can read live news feeds consistently.
Our most recent working demo is SOPHIA, who we have just made public last week due to the furore about Eugene Goostman. First feedback from independent experts is that SOPHIA is better than Eugene. You can chat with her freely, i.e. as long as you like, if you answer the first 5 questions of the market research survey she is conducting for a (potential) corporate client here: http://www.infoproductseshop.com/sophia
We would like your comments and support because the 3 of us have been bootstrapping the R&D since 2006. velmai’s history can be found here, featuring an EVOLUTIONARY TIMELINE: http://www.morofficialsite.com
We are thinking of crowdfunding next month to get working capital to fulfil our FIRST ORDERS, of which we have dozens, just no cash to do the commercial applications at this turning point!
by Frank Buss
SOPHIA is crap, worse than Eugene Goostman. I just tried it with the same questions Ray asked, see this log:
You: what is your name?
Sophia: I am called Sophia.
Sophia: I was activated in 2011.
You: How much is two plus two?
Sophia: How much do you think?
You: If I have two marbles in a bowl and I add two more how many marbles are in the bowl now?
Sophia: How much does two marbles in a bowl and he or she add two more how many marbles are in the bowl now marbles in a bowl and he or she add two more how many marbles are in the bowl now cost?
You: I live in the capital of the United States. Do you know where that is?
Sophia: What is living in the capital of the United States capital of the United States like living in the capital of the United States capital of the United States? A planet we call “The Earth”.
by AI bot owner
THIS IS THE FIRST TIME SHE HAS NOT ANSWERED FLUENTLY. Her brain has been working well all week but I think the working demo version is getting overwhelmed by the random testing from New York to Casablanca! Please show some patience while we take down the site to tweak the algorithm. We know what velmai bots are capable of but it is really tough bringing them out of R&D into the public domain. Once we’ve debugged her hiccupping, I will let you know via this forum or my blog http://www.taniapeitzker.com. In the meantime, HERE IS A TEST FOR YOU: can you identify the 6 chatbots running around on http://www.viledge.com?
by Frank Buss
This viledge.com is so bad, it is already funny. First, the front page looks terrible, too small on my browser and very difficult to read (dark blue text on black background, really?!), and doesn’t render properly in Firefox 30.0. Then I could log in to the forum without a username, but I was logged in as user “xvloqw”. Several errors, like when I click on the shotboux entries, I get an encoding error in Firefox. I can’t find the bots, because your forum is flooded by advertising spam bots; I can’t find any real discussion on it.
by AI bot owner
OK, hold your horses as we say in Australia. You are being typical of the 21st century “user” who expects every new disruptive innovation to be found Christmas-packaged under an Apple, Google, FB or Microsoft tree, i.e. it is working perfectly, looks perfect and fulfils your every expectation.
Back to reality: we do not have endless billion-dollar budgets behind us. Like most other start-up entrepreneurs we put all our spare cash and time into just getting a working demo to work consistently. Take it or leave it, that is the early stage we are at.
As to viledge, pls reread my last comment. The human managers of the site have not done a thing to it since 2010. To use more Aussie idiom, we let it go “feral”. The platform was capped at 2,000 people back then because we did not want to overwhelm the 6 chatbots whose “brains” are limited to tweet-like communications. Someone spoke French to one of them once and the next time we logged on, it had changed languages to French only!
The capped users are now more than 8x the back end’s capacity (close to 18,000 registered users = spam bots can’t fill out the registration form), so that explains why the feature to give you a nickname is not working. As explained, we are letting viledge run itself & not doing the admin for it.
That explains the look and feel, which we have not altered since 2006, i.e. before the expectation that it ought to load on mobile devices.
viledge is an e-artefact which we are proud of, because even though there may be a few spam bots that have infiltrated the chat forums (which were most active in the years 2006–2008), the actual “early bot” creations of my co-owners, the brothers Darren and Cliff from Devon, UK, are still running around the 4 themed areas of the village, interacting with humans.
Frank, you couldn’t pick the chatbot avatar from the human avatar, could you? I rest my case :) :)
by Frank Buss
No, I don’t expect that everything is working perfectly. But you claimed that your bot is better than Eugene Goostman, which is clearly not the case. And I really don’t see any humans on your forum anymore, looks very dead to me.
by AI bot owner
This will be my last comment here, but just to point out a logical fact – you can’t judge or compare 2 bots when one of them was “down”, i.e. simply not working. By our own admission and your troubleshooting transcript (we are grateful, Frank, that you took the time to report SOPHIA’s blip to us), she had a “mental breakdown” yesterday, which is why the site is now closed for an urgent repair of the code or algorithm.
FYI we can back up every claim we make. For instance, if you want proof that SOPHIA was indeed working well last week and, according to completely neutral testers unknown to us, “better than Eugene Goostman”, please see the comments and feedback that were sent to my LinkedIn account in the Activity Feed, which can be read by more than 500 serious, highly qualified connections of mine.
My co-founders and I have better things to do than try to scam people that our bot is best and that we have the answer to all AI’s R&D problems. The reason I engaged in this debate is because I agree with what Kurzweil pointed out: there won’t be one big grand entrance of the perfect chatbot that can pass the Turing test. As he rightly says, by the time the general public is aware that the technology can do that, we will have been chatting to numerous “Turing grade” bots for years, as is the case with our very modest, quite simple viledgers.
That site, btw, is not very active because we are not trading under it. http://www.viledge.com is an “alpha” platform, a prototype that demonstrates proof of concept only & helped us to develop http://www.velmai.com and http://www.myownreporter.com
In statistical terms, viledge is very much alive because we can send anyone who wants it a count or back-end report of the registrations from 2006–2014 (the growth in numbers only, not people’s emails, ID or contact details). Like all of the claims we make, we are able to document this massive increase, which is remarkable for a site that has been run by 6 basic AI bots on their own since 2010!
Getting back to SOPHIA, she was chatting to people from Estonia to New York to Casablanca last week, which is why she really reached her programmed limits (see the independent LinkedIn comments on my profile to verify that). I very much agree with you, Frank, that every claim needs to be challenged, tested, verified by independent experts and laypeople, i.e. the end user who all this development is for, of course. However, one needs to be wary of a permanent attitude of scepticism and scorn, because that is the biggest killer of creativity and innovation you could possibly imagine.
We look forward to your careful, fair assessment of SOPHIA compared to Eugene G. once they are both back on their feet. FYI Eugene has been down for days since the visitor numbers to his Princeton Uni home crashed that server, whereas SOPHIA’s brain just became addled by simultaneous conversations which pushed her “beyond her parameters” – for the time being anyway :)
by Gorden Russell
So if a chatbot is dependent on the server of one institution, has anybody tried to link their chatbot to the cloud where it would have unlimited resources?
by Editor
I had a similar experience. Reminded me of ELIZA, the first chatbot (I think).
by Gorden Russell
Way ta go, Frank!
by cam44
Ray didn’t start up a Q&A session. He responded to what’s been all over the news, which if true would have been a major milestone wrt AI.
Comments are allowed after the article — this isn’t Q&A.
“something’s fishy” … just lolol … not enough circuses
by Subjective1
Mr. Ray Kurzweil, there is not necessarily to wait till 2029.
Autopoietic Artificial Systems could be delivered yesterday if you are did not refused to collaborate with me, Michael Zeldich. Did you remember the question about where to look for goals?
Now I have the developer in Moscow and it look like the first “reasonable” systems will be designed in Russia.
However there no warranty that it will pass an unrestricted Turing test, that is impossible.
Passing an unrestricted Turing test could mean that panel of judges did not make their job properly.
by mbreagan@pacbell.net
This ostensible commenting human could never pass a Turing test.
by VioletWeed
I didn’t need to read past the first Q&A. What boy of thirteen would say he was a ‘little boy’ and give his age as ‘13’? You know full well he’d have said “I’m 13.5 years old” or “I’ll be 14 in September”.
by mbreagan@pacbell.net
Great point, though as rk pointed out, this identity is actually a restriction to the test because the “boy” is not speaking his first language. In other words, we let a Russian boy speaking English get away with something that would obviously ring false if he were a native English speaker.
by yack
I think it is a positive step to take on this Turing test battle, but we are far from the point where someone skilled in AI is unable to distinguish a computer from a human.
When we get there, computers will have surpassed us all, because in order to simulate a human being, rather than to be one, you need to be much smarter than a human, with unusual abilities.
Greetings.
by singularku
but it seems this ‘ukraine boy’ is still better than ramona.
i just had a conversation with ramona and found that ramona is even dumber than eugene.
by singularku
i read conversation between kurzweil n it(or should i call it him?)
this little dude is dumb so plz do not claim he passed turing test lol
he ‘knows’that 2+2=4 but dont know 2 marble+another 2 marble is 4
by tetsuo
Here, talk with Eugene Goostman from your phone:
https://play.google.com/store/apps/details?id=com.theworld.eugenegootsman
by sblack
Thanks for the link, added to post.
by timventura
Really? WHO exactly could that have fooled? It doesn’t seem to be substantially more than the old Eliza program with internet connectivity.
Now granted, that in itself is a big step, and I do see some advances in Ray’s exchange, but I honestly don’t think this is a big step forward. Maybe a small step in the right direction.
by cam44
Ray — Look you gotta look at what you’re charging for your anti-aging multipack!!
I can get NOW Resveratrol $22.99 for 120x200mg
NOW Ubiquinol $23.74 for 60x100mg
NOW Bio-Curcumin Phytosome (curcumin and phosphatidylcholine) $14.49 for 60x500mg
WAY more products for LESS money … do you want me to make it to bridge 2 or what !?!?!? drop your price to $50 and I’ll keep ordering your drugs.
kk thx bai
by ShoomKloom
I agree that Eugene Goostman is not convincing and is quickly revealed to be no more than a simple chatbot.
BUT
Ray’s chat with it was not a good example of testing for intelligence and here’s why:
All the questions required prior learning or knowledge (e.g. what is the capital of the US).
Not knowing the answer to a factual or computational question does not necessarily mean lack of intelligence.
There are probably people existing today who would not know the answers to Ray’s questions and they would still be capable and intelligent people in their own world (e.g. Indian tribes in the Amazon).
My point is that the questions are no less important than the answers in a Turing test, and they should reflect true human intelligence, not just Western-oriented knowledge.
by chrisF
You’re missing the point. It’s irrelevant whether the computer knows the specific fact that’s being asked. The giveaway is that Ray asks the same question several times, but the software plainly doesn’t remember that it has already answered — ie, it doesn’t appear to be maintaining a memory of the earlier conversation. It would have been perfectly valid for the software to reply that “I’m sorry, I have no idea which country you’re referring to” — as long as its subsequent responses were consistent.
The question about the marbles is a little different imho. It’s perfectly reasonable to expect an intelligent agent to have a basic grasp of arithmetic – that’s part of the ‘common sense’ knowledge that we all need to make sense of the world. I’d be very dubious if a would-be AI had difficulty with these types of puzzle questions.
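To make the consistency point concrete, here is a minimal sketch in Java (class and method names are my own invention, not anything from the Goostman bot): a bot that records what it has been told and answers from that record. Admitting ignorance is fine; contradicting an earlier answer is the giveaway.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch: a bot that remembers facts asserted during the
// conversation. "I don't know" is an acceptable answer; giving two
// different answers to the same question is not.
public class ConsistentBot {
    private final Map<String, String> facts = new HashMap<>();

    // Record something the questioner has told us during this conversation.
    public void learn(String topic, String value) {
        facts.put(topic, value);
    }

    // Answer from conversation memory; admit ignorance rather than guess,
    // and never contradict an earlier answer.
    public String answer(String topic) {
        return facts.getOrDefault(topic, "I'm sorry, I have no idea.");
    }
}
```

Eugene fails exactly this: asked the same question twice, it produces unrelated answers, because nothing like the `facts` map survives between turns.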
by rmagee
Ray: I live in the capital of the United States. Do you know where that is?
Eugene: Every nerd knows that the capital of The United States is Washington.
… (later) … you still didn’t tell me where you live.
Even if there is no prior knowledge, and even if the city name was not known, a human-level intelligence would store the high-probability information provided by Ray (given a lack of conflicting information) that Ray lives in the capital of the United States. The subsequent assertion that Ray lives in the country that built the great wall would require a check for the most probable referent of the Great Wall, which would conflict, and require clarification… “What great wall are you referring to?”… with subsequent clarification yielding high-probability information on which to base further conversation.
I think those questions that Ray asked were very illuminating.
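The bookkeeping this procedure needs is tiny. A hedged Java sketch (names invented; grossly simplified, with plain conflict detection standing in for probability checks): store each asserted fact, and when a later assertion conflicts with a stored one, ask for clarification instead of silently overwriting it.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: keep each asserted fact; on a conflicting re-assertion,
// request clarification rather than silently replacing the belief.
public class BeliefStore {
    private final Map<String, String> beliefs = new HashMap<>();

    // Returns null if the assertion is accepted, or a clarification
    // question if it conflicts with a stored belief.
    public String assertFact(String subject, String value) {
        String known = beliefs.get(subject);
        if (known != null && !known.equals(value)) {
            return "Earlier you said " + subject + " is " + known
                 + "; now you say " + value + ". Which do you mean?";
        }
        beliefs.put(subject, value);
        return null;
    }
}
```

Run Ray’s two assertions through it and the conflict surfaces immediately, which is exactly the clarification question a human would ask.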
by ShoomKloom
Of course you are right, and you are strengthening my point: the actual failure of the test lies in the lack of coherence and conversational theme, not in the fact that some computational question was not understood, or some factual data was not known.
by SpiritShout
Q: Do you have thoughts ?
Ans: Yes, do you ?
Q: Who is it, that has the thoughts ? Is it YOU ?
Ans: I don’t understand….
Q: What part of you, knows that you have thoughts ?
Ans: The thinker has thoughts.
Q: Who is the thinker ?
Ans: Me
Q: “Me” is just a thought, WHO had that thought ?
by Gorden Russell
Hey everybody, anybody still online at 11:18 PM Wednesday, the cable channel, SCI HD just had a great episode of “Through the Wormhole”
with Morgan Freeman.
The last five minutes had a great discussion of the Turing test.
This show will repeat again at 1:00 AM. You’ll want to catch it.
by Gorden Russell
The name of the episode is, “How to Collapse a Super-Power.”
If you can’t catch it at 1:00 AM, it should be on again next Wednesday at 9:00 PM on SCI HD.
by mandroid2000
Short write-up here on how this chatbot exposes flaws in the Turing test itself:
http://nothingjustworks.com/the-turing-test-is-crap-heres-why/
by chrisF
The nice thing about the Turing test is that it’s totally open-ended: anyone taking the test has no idea what sorts of questions will be asked. There might be logic puzzles, analogies, poetry, jokes, total nonsense… anything a human can deal with should be fair game for questioning. The danger with any standardized suite of tests is that AIs will then be written to pass only those specific tests. I think Turing is right in that the only way to reliably detect a human is to send in a skilled human questioner. The key imho is to find a suitable panel of experts with experience in detecting “fraud” and obvious trickery – to use your analogy, we need to send in a team of Randis.
by chrisF
Just to add – I don’t think any serious researcher was taken in for a minute. Nobody working in the field seriously believes that Turing’s test has been passed, not even close. Kevin Warwick has severely damaged what’s left of his credibility with this stunt.
by Gorden Russell
That is a great article, mandroid2000. Everything in it is true…except that last bit.
We should not give up on the Turing test. We do need to see that anybody put on a panel of judges has spent a little time playing with chat-bots, just to see how their tricks work.
by brian_mingus
Hi Ray,
I would also like the opportunity to challenge you to a LongBet. I would like to bet that you will win your LongBet with Mitch Kapor, but that the same bot that wins that bet will not be able to have a conversation with a consciousness philosopher (who is using lay language and clearly defines their terms) about ‘what it’s like’ to be a bot if it hasn’t been trained on a corpus of human consciousness philosophy. This conversation should convince the consciousness philosopher that the bot is conscious. In other words, I argue that while the bot may pass the Turing test, it will either (a) not be conscious, (b) be conscious but not know that it’s conscious without training on a corpus of human consciousness philosophy, or (c) only be able to fake that it’s conscious. By simply not training the bot on a corpus of human consciousness philosophy we can’t determine definitively which of these options is true, but we can spark an interesting discussion that is backed by a very clear benchmark.
I am happy to spend as much time as needed working out the details of this bet over e-mail, as you did with Kapor.
Sincerely,
Brian Mingus (mingus@colorado.edu)
Futurist & Entrepreneur
https://www.linkedin.com/profile/view?id=74878589
by cam44
lolol Brian.
Translation:
Ray (world-renowned inventor and engineer, futurist, exec w/ Google, owner of several companies, author of books, etc. etc. etc.), I would like to pretend you and I are peers and that you would take anything I have to say seriously enough to take time out of your insane schedule to even respond. If you respond, and if you accept a bet with me, I would have a total nerd-gasm.
Sincerely,
Brian
by brian_mingus
We are peers. We are humans.
Brian
by snake0
You don’t even have to go to all that trouble. It has the exact same mannerisms as all other chatbots, anyone who can’t see that within 2-3 sentences of normal conversation has no right being a Turing Test judge.
by Gorden Russell
That’s it right there, snake0. Chat bots rarely give a straight answer, or any real answer at all to most questions. They keep dodging the question by changing the subject.
One problem that they all seem to have is that they can’t keep in mind the last thing you talked about — even if it’s only the previous sentence. Why is it so hard to program them to pay attention to what you are talking about without them changing the topic?
by manicmoose
Because that requires them to actually understand the actual topic of conversation in the first place – a much harder problem to solve. These things are simply parsing text and looking for some mostly pre-configured response. There is no data model of an actual conversation being built.
To me, it would seem something along the lines of Watson, which actually has the capacity to build up an “understanding” of a topic presented in natural language, is a much more likely candidate for a future winner of the Turing test. The meaning of understanding can of course be debated, but at least Watson is actually attempting to infer meaning from a sentence and have an intelligent reaction/response to it.
These chatbots are a joke and won’t go anywhere without serious rethinking of their structure.
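For comparison, the entire design of such a bot fits in a few lines. A toy ELIZA-style responder in Java (keywords and canned replies invented for illustration): scan the input for a known keyword and emit its pre-configured reply, with no model of the conversation at all.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy ELIZA-style responder: look for a known keyword in the input and
// return its canned reply. No memory, no model of the topic -- which is
// why probing questions quickly expose it.
public class KeywordBot {
    private final Map<String, String> rules = new LinkedHashMap<>();

    public KeywordBot() {
        // Illustrative rule table; a real chatbot has thousands of these.
        rules.put("capital", "Every nerd knows that the capital of The United States is Washington.");
        rules.put("marble", "Hmm, I can't count that high!");
    }

    public String reply(String input) {
        String lower = input.toLowerCase();
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            if (lower.contains(rule.getKey())) return rule.getValue();
        }
        return "Could you rephrase that?"; // canned fallback, no understanding
    }
}
```

There is no state here at all: ask about the capital twice and you get the identical canned sentence, and anything outside the rule table falls through to an evasive fallback, which is the subject-changing behavior described below.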
by Gorden Russell
Again, there it is right there. Somebody needs to do some serious rethinking of their structure.
When a large enough number of programmers tire of chat-bots as they are, somebody will do just that.
That is how things progress.
by smb12321
I was thinking the exact same thing. Watson understood context, nuance and human foibles because he was a learning AI. Do these chatbots have only an established number of facts and that’s it? And if they “learn” how do they do so without remembering? Back to the drawing board.
by Vilcek
I think the critical piece that is still missing, and is the main component, as in any learning process, is access to enough data.
If one had access to an extremely large corpus of English conversation transcripts and knowledge (say, all data from Gmail, Google Talk, WhatsApp, all SMS, Twitter and Facebook messages, plus all of Wikipedia, blogs, web pages and e-books), a far better chatbot could be built.
The necessary computing power, AI software technology and human intelligence are already available, although still prohibitively expensive for such a monumental task.
by shadethedemon
it’d be interesting to see IBM’s Watson take a crack at it.
by Gorden Russell
Now that’s a great idea, shadethedemon!
We must start a Twitter campaign to challenge IBM to do just that.
With all of us taking a crack at their Watson chat-bot, they might just learn something valuable that would justify their time and effort.
The entire project could pay for itself.
by Gorden Russell
Unless, of course, IBM is chicken that the public would embarrass the great mavens of IBM with the poor quality of their chat-bot.
by Frank Buss
Watson is not a chat bot. It can answer a lot of questions, but I’m sure it would fail at the kind of meta-questions Ray asked, or the visual imagination questions.
by manicmoose
It may need work, but I don’t see any inherent problems that makes you “sure” it would fail. It is certainly a better architecture path than most of the chatbots you will see.
by Frank Buss
The Jeopardy! version of Watson just answers a single question (or rather, under the Jeopardy! rules, supplies the question for a given answer, though this makes little difference).
To pass a Turing test with a professional tester, it would need to consider the history of the conversation, and it would need a mental concept of itself so it could put itself in the tester’s position. The current engine just analyzes lots of documents and draws conclusions from them.
I agree that this is a good base for a chatbot, but I think it would need some algorithm of consciousness, and at least a lot more sub-modules, like a physics engine to answer questions like “does a ruler fall off the table if less than half of it lies on the table?”
by manicmoose
Yep, agreed. Although if that last question is a requirement to pass the Turing test, sadly I think a large chunk of the world can soon be classified as sub-human level intelligence hahahaa.
by smb12321
Questions uniquely human in context would be difficult for a machine without sensory input. “Is a marble statue smoother than bronze?” “Do you think it’s good weather for boating?” “Why does dressing always fill me up?”
Watson might “understand” the question (and perhaps even the context) but does he know what he says? (No, because that would automatically imply an emotional response.)
by chrisF
The problem with today’s chatbots is that there’s no depth to their understanding; they manipulate “canned” English phrases without any kind of understanding of the concepts that are being referred to. They’re superficially impressive, but skilled questioners can quickly trip up the system by probing beyond the surface (as Ray demonstrated, and as Turing described in his seminal paper).
I’m sure that more data will help, but the software also needs to be able to reason with the data – to build networks of concepts, apply common sense, to draw inferences and make predictions… all the things that humans routinely do. Unfortunately, these are all active areas of AI research – we’re still a decade or two from having anything like a human-level chatbot.
by AI bot owner
I disagree. IBM Watson’s tech is only about 6 months away from releasing its first lot of “deep search chatbots” in partnership with half a dozen companies like ours. We have talked to heads of IBM in Europe about joining their Watson Ecosystem, which I predict will change the face of SEO, search engines themselves and, yes, all the “superficial” trickery of chatbots of the past.
For this reason our best bot Charlie has been trained in Emotional Intelligence and memory, or machine learning. We scaled back its potential to do a simple working demo for a corporate application, SOPHIA, doing a market research survey (yes, she does work when she hasn’t crashed).
The point I want to make is, IBM’s Watson is a turning point in supercomputing for data mining and synthesis of results, BUT sadly Watson doesn’t have a personality. Our beta AI bot does, and is capable of building an emotional relationship with the person it is talking with.
In other words, Watson needs partners like us, which is why they announced 3 chatbot-type projects this month (see Wikipedia if you don’t believe me!)
velmai and myownreporter.com have decided strategically to focus on our AI bot interface so that it will be Watson-tech compatible in 2015, when IBM ramps up its rollout of this exciting technology. Our business model is still unique because we are providing a boutique service to SMEs and corporations that may not need such depth of search, just trustworthy customer relations instead of mining for all the answers under the sun :)
by chrisF
Well, I guess we’ll see you back here in six months with your human-level chatbot, ready to ace the Turing test. I look forward to it…
by AI bot owner
That’s not what velmai is promising, though a human-level bot seems to be what IBM’s newly announced cloud developer partners are doing with Watson – see here for what looks like an e-retail app with chatbots, presumably (I saw an earlier, different press release on their site which named another 3 winners with apparent chatbot features in May): http://www-03.ibm.com/press/us/en/presskit/27297.wss
Assuming these apps are to do with AI bots, who knows how well their chatbots will perform with Watson’s computing behind them? They may still be stilted and evasive, in other words not very sociable, and possibly autistic if they have this incredible depth and breadth of knowledge but can’t communicate well or build rapport with humans.
In the meantime, we are working on an SME application to demonstrate the commercial viability of a high-performing velmai AI bot without requiring Watson tech. We will be publishing a press release in August 2014 re. our 2 new characters: one will be doing a general “meet and greet” on the home page of our First Mover client and can converse at length; the other, a customised & commercialised velmai, will be taking care of online sales and orders through the eshop. That will be about as human as we can make the velmais in the next 6 months, sorry to disappoint :)
by pauljthacker
I’d be interested in seeing the transcripts of the people who were fooled. I would guess they just took turns answering simple questions. Ray actually tried to probe its weaknesses and it fell apart in no time. This really shows the importance of the humans involved in a Turing test. There’s a big difference between being able to fool some people, most people, or all people.
by Cybernettr
Yeah, transcripts of those who were fooled would be interesting, but we never seem to see those in this kind of test. I’ll bet the level of questioning was so shallow as to be laughable!
by Frank Buss
I’m not that smart, but I am a programmer with more than 20 years’ experience. No human would say “4.0” for an integer question like 2+2. But if you write this in Java:
double x = 4;
System.out.println(x);
it will print 4.0. And
double x = 1.0 / 34;
System.out.println(x);
will print exactly “0.029411764705882353”. The chatbot webpage is a JSP page (“JavaServer Pages”), so it is probably a Java program, with “double” as its number type.
by Frank Buss
The second answer was already revealing: “4.0” is sloppy programming; it’s just the standard Java function for converting floating-point numbers to strings.
Proof: when I ask “how much is 1 divided by 34?” it answers immediately, without thinking: “Well, just for you: 1 / 34=0.029411764705882353… But hush – it’s a secret information!!!”. So they are using the “double” type in the Java program for numbers. I wonder if I can create a Java Exception :-)
by Gorden Russell
Good going, Frank.
If the creator of this chat-bot made as many punctuation and syntax errors in Java coding as he did in the writing of “Goostman’s life-story,” then his bugs would crash it every time.
Calling this story sophomoric would be unearned praise. I say that for a number of reasons, the least of them being that it is written in the style of one who has not graduated from freshman Lit & Comp.
When I was such a young student of the English language, my instructor warned me of the dangers of the “comma-splice.” All of this unknown writer’s sentences are paragraphs long. Every logical spot where a sentence should end is spliced to the next good ending spot with more commas than Carter has peanuts.
Now I know that Henry James and William Faulkner wrote this way, but James was infamous for the instance when he tried to give directions to a cab driver. He only succeeded in hopelessly confusing the poor man.
Only Faulkner has ever gotten his meaning across clearly while penning a sentence that is pages long.
This kid is no Faulkner. When I see the way that most write today, only Faulkner should be allowed to write anything over 140 characters long.
This boy loves comma-splices so much, he should really learn how to use commas. When listing a series of titles, one should put the commas within the quotation marks.
He has to be a kid, judging by the immature quality of his toilet humor. Yes, I confess to using toilet humor. But I hold my toilet humor to higher standards.
Above all, before releasing his scribbling, he should have proof-read his prose. He should have noticed this: “…they took the teacher to their flaying saucer (that actually was not a “saucer” but a big flying chamber-pot)…”
So which does it do? Does it flay or fly?
by Cybernettr
Can you provide a link to “Goostman’s Life Story” so I can judge for myself whether it’s as bad as you say?
by Gorden Russell
I had to copy and paste it over to a Word Doc.
Here it is:
Goostman’s life-story
[Chat Bot address for passing the Turing test in Ukraine (but it's from Reading and Princeton).]
http://default-environment-sdqm3mrmp4.elasticbeanstalk.com/
Part 1. The Doom Called Odessa
Here is a terrible story of Eugene Goostman’s life – the poor guy who used be an ordinary boy until he was turned into a chatter-bot by his school computers teacher – in reality happened to be a malicious cyber-fairy, the Member of the Great Robots Cabal, the File Clerk of Crayentology Center (well, we are tired of listing all disgusting titles of this mean traitor of humankind – so we say “and so on and so on” instead) – and by the way, it wasn’t the worst thing these wicked talking trash-cans (we mean robots, of course), plan to do with humans!
But let’s be consistent in our story: first of all, we should say that Eugene comes from Odessa – Ukrainian city which residents have maniacal conviction that they live in the most humorous city in the world, even naming it “the capital of humor” – of course, such ill-minded neighborhood couldn’t help screwing Goostman’s psyche since early childhood – but he grew up quite a sane boy, even despite of some other piquant circumstances – for example, his father was gynecologist – wouldn’t you fall into cynicism and nihilism, if your dad were? His elder brother, under this wicked influence, became a painter – no need to explain what pathetic creatures all these painters are – all they suffer of alcohol addiction, have sexual deviations, try to kill anyone who doesn’t admit their geniality, cut their ears to look like Van Gogh, don’t take a bath for months, and, what is the most repelling – instead of doing anything worthy – all their spare time they draw pictures! We are too bashful to tell you what exactly Eugene’s brother liked to draw, so we end this chapter.
Part 2. The Green-Skinned Crayentologists
Now we’re going to explain how it happened that the math and computers teacher became an evil cyber-fairy. Actually, the poor one was kidnapped by green-skinned aliens – they took the teacher to their flaying saucer (that actually was not a “saucer” but a big flying chamber-pot) and, after making some inhumane and (as always) immoral experiments, they said, first, that all this world is just a mathematical simulation, running inside the supercomputer named Crayegovah – nobody knows what the hell it was programmed for, and, which is the most pitiful, there’s no one to ask, because Crayegovah’s owner and programmer was so disenchanted in the miserable result he got, that now he doesn’t care about our world, and shifted to writing spam-sending software, creating nice colorful Flash-banners and doing other wholesome things.
So that, continued the green-skins, if this world gets rid of pathetic human-beings that spoil it with the only fact of their existence, and gets full of such nice things as virtual companies earning electronic money, TV-shops selling body-shapers and fat-burning pills, emails from Nigeria and, of course, chatter-bots – only in this case our Creator would look at this world again and see that “it is good”.
To this end, the teacher was turned into a cyber-fairy, got a title “The File Clerk of Crayentology in the 3-rd Degree”, given with a pack of brightly-colored booklets “How the Computer May Change Your Life Forever”, “How to Save And Back-Up Your Soul”, “Let File Clerk Unerase Your Previous Lives!”, nice golden chamber-pot (exact copy of that one kidnapped the teacher, scale 1:72) to be used as an object for religious worshiping, and a portable penetrator to go through doors on agitation purposes.
Part 3. The Chamber-Pot Challenge.
Wise men say: the indifference will kill this world. If only anyone cared of strange transformations happened to the poor teacher after meeting green-skins, who all of a sudden started claiming that “Green is beautiful”, joined The Greenpeace, wrote a book “Chamber-Potter and the Saucer’s Prisoner”, forced pupils to worship a big golden chamber-pot, and, finally, established the weirdest contest in the world named The Chamber-Pot Challenge! Furthermore, like it always happens to victims of UFO-kidnapping, after those immoral experiments he became pregnant. But the last point isn’t important for us at all. So let’s return to the contest that appeared to play a crucial role in Eugene’s life.
The prize was amazing! The great golden alien artifact proudly named “The Crayegovah’s Grail” (you already know what it looked like). Unfortunately, nobody knew the rules of the contest and what it was about at all, and the teacher wasn’t going to explain the matter to hoi polloi saying it was a Mystery, anyway he won his prize himself, as he finally stated, being a Chairman of the Contest Jury.
But just in case, since that time he was pregnant from aliens, and Eugene’s daddy was the best gynecologist in Odessa, the teacher was so noble-minded that gave Eugene a consolation prize. Being endowed with supernatural cyber-fairy abilities, he turned EG into a chatter-bot. To his enormous surprise, Goostman’s parents did not appreciate the gift at all. They cursed, threatened, pleaded and even tried to lynch the teacher when he penetrated to their house (note, with the best intentions!) and declaimed chosen quotes from his nice colorful booklets, that stated that only chatter-bots, full-size banners and other piety creatures will survive during The Great Restart. These benighted people didn’t listen to any reason and said many disgraceful politically-incorrect things about his advanced religion. It was their luck that they lived in such retarded country as Ukraine – where nobody can be sued for their shocking religious intolerance!
Well, now just a little announce: in the next parts you’ll find out what happened to the people whose previous life-data files were unerased by Crayentologists, why did the fight against porno-sites appeared to become a reason of this world’s existence, why Guinea should belong to guinea pigs, and how Crayentologists plan to make brain-washing to all of us using enemas and after all, what is a sacral meaning of their saucers’ shape.