Author: Ray Kurzweil

No ideas but in things.

William Carlos Williams

As discussed in several of the contributed articles in this book, the Turing test was devised by Alan Turing as a way of certifying machine intelligence. Turing described a situation in which a human judge communicates with both a computer and a human using a computer terminal. The judge’s task is to determine which is which. The judge cannot see the computer or the human and must make his or her determination by interviewing both. The computer attempts to trick the judge into selecting it as the human.

The essence of the Turing test is that the computer attempts to act like a human within the context of an interview over terminal lines. A narrower form of the test asks a computer to successfully imitate a human within a particular domain of human intelligence. We might call these domain-specific Turing tests. One such domain-specific Turing test, based on a computer’s ability to write poetry, is presented here.

The Kurzweil Cybernetic Poet is a computer program (written by the author) that is provided with an input file of poems written by a human author or authors. The program analyzes these poems and creates a word-sequence model based on the poems it has just read. It then writes original stanzas of poetry using the model it has created. Some of the following stanzas of poetry were written by the Kurzweil Cybernetic Poet. Some were written by human authors (in fact, the same human authors whose poems were read and analyzed by the Kurzweil Cybernetic Poet). See if you can tell which are which. On a piece of paper, write down the numbers 1 through 28. Then put a C by a number if you believe that the corresponding stanza was written by the computer. Put an H if you believe the stanza was written by a human poet. The answers are in a footnote (note 1). Following the answers are the results obtained when the test was taken by 16 human judges (both adults and children).

1. is beauty itself

that they were walking there. All along the new world naked,

cold, familiar wind-

2. Pink confused with white

flowers and flowers reversed

take and spill the shaded flame

darting it back

into the lamp’s horn

3. The winds of the oozy woods which wear

the ocean, with azure moss and flowers

So sweet, the purple even

I sleep in the arrows

Of the dome of death.

4. O thou,

Who moved among some fierce Maenad, even among noise

and blue

Between the bones sang, scattered and the silent seas.

5. She eyes me with an ingrown eye,

in the rhythm of teacup tapping

thinks of sweeping away crumbs

6. At six I cannot pray:

Pray for lovers,

through narrow streets

And pray to fly

But the Virgin in their dark wintry bed

7. What seas what shores what granite islands toward my timbers

And woodthrush calling through the fog

My daughter.

8. Imagine now a tree in white sails still whirled

will be of silences

Calm and angels

9. -and the sun, dipping into the avenues

streaking the tops of

the irregular red houselets,

and

the gay shadows dropping and dropping.

10. a perfect if slightly paled

old park turned with young women

seized in amber.

11. “Interesting book?”

she sits

dancing by the electric typewriter,

bloodless revolution of meats

strings of use,

Politic, cautious, and the fact

she is the fact

she is calling them all-

The children at his feet

he is always time

To roll it was dark,

damp, jagged, like the voice

Because of love ends.

12. Men with picked voices chant the names

of cities in a huge gallery: promises

that pull through descending stairways

to a deep rumbling.

13. Where were thou, sad Hour, selected from whose race is

guiding me,

Lured by the love of Autumn’s being,

Thou, from heaven is gone, where was lorn Urania

When rocked to fly with thee in her clarion o’er the arms of death.

14. Thou, from the day, having to care

Teach us now thoroughly small and create,

And then presume?

And this, and me,

And place of the unspoken word, the unread vision in Baiae’s bay,

And the posterity of Michelangelo.

15. I am lonely, lonely.

she hides deep within her

yet plays-

Milkless

16. O my shoulders, flanks, buttocks

against trespassers,

against thieves,

storms, sun, fire,

against thieves,

storms, sun, fire,

against flies, against weeds, storm-tides,

neighbors, weasels that waken

The silent seas.

17. the days, locked in each other’s arms,

seem still

so that squirrels and colored birds

the branches and through the air.

18. I am watching ants dig tunnels and bury themselves

they go without water or love

19. perhaps vomiting,

perhaps laboring

to the usual reign

20. Rain is sweet, brown hair;

Distraction, music in passageways.

Six o’clock.

The time. Redeem

The world and waking, wearing

21. The worlds revolve like ancient women

Gathering fuel in vacant lots.

22. I should have been a pair of ragged claws

Scuttling across the floors of silent seas.

23. patches of all

save beauty

the rigid wheeltracks.

The round sun

the bed.

She smiles, Yes

then stays

with herself alone

and then dividing and over

and splashed and after you are

listening in her eyes

24. All along the road the reddish

purplish, forked, upstanding, twiggy

stuff of bushes and small trees

with dead, brown leaves under them

leafless vines-

25. Pray for those who are branches on forever

26. Like a sod of war;

houses of small

white curtains-

Smell of shimmering

ash white,

an axe

27. By action or by suffering, and whose hour

Was drained to its last sand in weal or woe,

So that the trunk survived both fruit or flower;-

28. clouds and ash and waning

sending out

young people,

The above 28-question poetic Turing test was administered to 16 human judges with varying degrees of computer and poetry experience and knowledge. The 13 adult judges scored an average of 59 percent correct in identifying the computer poem stanzas, 68 percent correct in identifying the human poem stanzas, and 63 percent correct overall. The three child judges scored an average of 52 percent correct in identifying the computer poem stanzas, 42 percent correct in identifying the human poem stanzas, and 48 percent correct overall.

The charts show the actual scores obtained by the 16 human judges, broken down by adult/child, computer experience, and poetry experience. As can be seen from the charts, no trends based on level of computer experience or poetry experience are clearly discernible from this limited sample. The adults did score somewhat better than the children. The children scored essentially at chance level (approximately 50 percent), and the adults achieved slightly better than chance.
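To make "chance level" concrete, the following Python sketch computes how likely a single judge would be to reach a given score on 28 two-way questions by guessing alone. It assumes that each guess is an independent coin flip (the hypothetical `p_guess` of 0.5), which the study itself does not state, and it treats a group average as if it were one judge's score, so it is only a rough illustration.

```python
from math import comb


def p_at_least(n_items: int, n_correct: int, p_guess: float = 0.5) -> float:
    """Probability of getting n_correct or more right out of n_items by guessing.

    Assumes independent guesses, each correct with probability p_guess
    (an illustrative assumption, not something reported in the study).
    """
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess) ** (n_items - k)
        for k in range(n_correct, n_items + 1)
    )


if __name__ == "__main__":
    n = 28  # stanzas in the test
    for pct in (48, 63):  # overall averages reported for children and adults
        k = round(n * pct / 100)
        print(f"{pct}% of {n} is about {k} correct; "
              f"P(at least {k} by guessing) = {p_at_least(n, k):.3f}")
```

A small tail probability for a score suggests that guessing alone is an unlikely explanation for it; with only 16 judges, any such figure should be read cautiously.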

The next chart shows the number of correct and incorrect answers for each of the 28 poems or stanzas. While the adult judges scored somewhat better than chance (63 percent), their answers were far from perfect. The computer poet was able to trick the human judges much of the time. Some of the computer poems (numbers 15 and 28, for example) were particularly successful in tricking the judges.

We can conclude that the Cybernetic Poet has achieved some level of success in this domain-specific Turing test: its poetry-writing ability tricked the human judges much of the time. A more difficult problem than writing stanzas of poetry is writing complete poems that make thematic, syntactic, and poetic sense across multiple stanzas. A future version of the Kurzweil Cybernetic Poet is contemplated that attempts this more difficult task. To be successful, the models created by the Cybernetic Poet will require a richer understanding of the syntactic and poetic function of each word.
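The chapter does not spell out the form of the Cybernetic Poet's word-sequence model. As a rough illustration only, the sketch below assumes the simplest possible version: a table recording which words follow each word in the input poems, sampled at random to produce new text. The function names `build_model` and `write_stanza` are hypothetical, and a model this crude captures none of the syntactic or poetic function discussed above.

```python
import random
from collections import defaultdict


def build_model(poems):
    """Map each word to the list of words seen immediately after it.

    This first-order word-sequence table is an assumed stand-in, not the
    Cybernetic Poet's actual model.
    """
    model = defaultdict(list)
    for poem in poems:
        words = poem.split()
        for current, following in zip(words, words[1:]):
            model[current].append(following)
    return dict(model)


def write_stanza(model, length=12, seed=None):
    """Generate `length` words by repeatedly sampling a recorded follower."""
    rng = random.Random(seed)
    starts = sorted(model)  # sorted so a fixed seed gives repeatable output
    word = rng.choice(starts)
    stanza = [word]
    for _ in range(length - 1):
        followers = model.get(word)
        # A word with no recorded follower is a dead end: restart anywhere.
        word = rng.choice(followers) if followers else rng.choice(starts)
        stanza.append(word)
    return " ".join(stanza)


if __name__ == "__main__":
    # Two of the human-written stanzas from the test serve as sample input.
    corpus = [
        "I should have been a pair of ragged claws "
        "Scuttling across the floors of silent seas.",
        "All along the road the reddish purplish, forked, upstanding, twiggy "
        "stuff of bushes and small trees with dead, brown leaves under them "
        "leafless vines-",
    ]
    print(write_stanza(build_model(corpus), length=10, seed=1))
```

Because each word is chosen by looking only at the word before it, output from a table like this tends to be locally plausible but to drift thematically, which is one way to picture why coherent multi-stanza poems are the harder problem.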

Even the originally proposed Turing test involving terminal interviews is notably imprecise in determining when the computer has been successful in imitating a human. How many judges need to be fooled? At what score do we consider the human judges to have been fooled? How sophisticated do the judges need to be? How sophisticated (or unsophisticated) does the human foil need to be? How much time do the judges have to make their determination? These are but a few of the many questions surrounding the Turing test. (The article “A Coffeehouse Conversation on the Turing Test” by Douglas Hofstadter in chapter 2 provides an entertaining discussion of some of these issues.) It is clear that the era of computers passing the Turing test will not arrive suddenly. Once computers start to arguably pass the Turing test, the validity of the tests and the testing procedures will undoubtedly be debated. The same can be said for the narrower domain-specific Turing tests.

Adult scores on poem stanzas composed by a computer (13 adults, % correct). Rows give level of poetry experience; columns give level of computer experience.

| Level of poetry experience | Little | Moderate | Professional | Average |
|---|---|---|---|---|
| Little | 56 | 44, 69, 75 | 63, 75 | 64 |
| Moderate | 50, 56, 63 | 56, 63 | 75 | 61 |
| A lot | | | 25 | 25 |
| Average | 56 | 61 | 59 | 59 |

Children’s scores on poem stanzas composed by a computer (3 children, % correct)

| Scores | Average |
|---|---|
| 38, 50, 69 | 52 |

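Adult scores on poem stanzas composed by a human (13 adults, % correct). Rows give level of poetry experience; columns give level of computer experience.

| Level of poetry experience | Little | Moderate | Professional | Average |
|---|---|---|---|---|
| Little | 83 | 58, 58, 100 | 50, 67 | 69 |
| Moderate | 60, 67, 83 | 58, 83 | 92 | 74 |
| A lot | | | 25 | 25 |
| Average | 73 | 72 | 59 | 68 |

Children’s scores on poem stanzas composed by a human (3 children, % correct)

| Scores | Average |
|---|---|
| 33, 42, 50 | 42 |
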
Adult scores on all poem stanzas (13 adults, % correct). Rows give level of poetry experience; columns give level of computer experience.

| Level of poetry experience | Little | Moderate | Professional | Average |
|---|---|---|---|---|
| Little | 68 | 50, 64, 86 | 57, 71 | 66 |
| Moderate | 55, 61, 71 | 61, 68 | 82 | 66 |
| A lot | | | 25 | 25 |
| Average | 64 | 66 | 59 | 63 |

Children’s scores on all poem stanzas (3 children, % correct)

| Scores | Average |
|---|---|
| 39, 43, 61 | 48 |

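Numbers of right and wrong answers for each poem stanza

| Poem stanza | No. right | No. wrong | Computer or human |
|---|---|---|---|
| 1 | 9 | 7 | computer |
| 3 | 11 | 5 | computer |
| 4 | 8 | 8 | computer |
| 6 | 9 | 7 | computer |
| 8 | 11 | 5 | computer |
| 11 | 11 | 5 | computer |
| 13 | 8 | 8 | computer |
| 14 | 10 | 6 | computer |
| 15 | 6 | 10 | computer |
| 16 | 10 | 6 | computer |
| 19 | 9 | 7 | computer |
| 20 | 12 | 4 | computer |
| 23 | 9 | 7 | computer |
| 25 | 8 | 8 | computer |
| 26 | 11 | 5 | computer |
| 28 | 6 | 10 | computer |
| Average (computer) | 58% | 42% | |
| 2 | 9 | 7 | human |
| 5 | 9 | 7 | human |
| 7 | 9 | 7 | human |
| 9 | 13 | 3 | human |
| 10 | 8 | 8 | human |
| 12 | 10 | 6 | human |
| 17 | 9 | 7 | human |
| 18 | 14 | 2 | human |
| 21 | 11 | 5 | human |
| 22 | 11 | 5 | human |
| 24 | 11 | 5 | human |
| 27 | 8 | 8 | human |
| Average (human) | 64% | 36% | |
| Overall average | 61% | 39% | |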
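The per-stanza counts can be tallied directly. The short Python sketch below copies the right and wrong counts from the table and recomputes the percentage of correct judgments for each category; the helper names `percent_right` and `most_deceptive` are hypothetical. Pooling all 28 stanzas gives roughly 60 percent correct, whereas the table reports 61 percent, a figure that appears to match the mean of the two category averages rather than a pooled count.

```python
# (stanza number, number right, number wrong), copied from the table
COMPUTER = [
    (1, 9, 7), (3, 11, 5), (4, 8, 8), (6, 9, 7), (8, 11, 5), (11, 11, 5),
    (13, 8, 8), (14, 10, 6), (15, 6, 10), (16, 10, 6), (19, 9, 7),
    (20, 12, 4), (23, 9, 7), (25, 8, 8), (26, 11, 5), (28, 6, 10),
]
HUMAN = [
    (2, 9, 7), (5, 9, 7), (7, 9, 7), (9, 13, 3), (10, 8, 8), (12, 10, 6),
    (17, 9, 7), (18, 14, 2), (21, 11, 5), (22, 11, 5), (24, 11, 5),
    (27, 8, 8),
]


def percent_right(rows):
    """Percentage of all judgments in `rows` that were correct."""
    right = sum(r for _, r, _ in rows)
    total = sum(r + w for _, r, w in rows)
    return 100 * right / total


def most_deceptive(rows):
    """Stanza numbers that drew the fewest correct judgments."""
    fewest = min(r for _, r, _ in rows)
    return [number for number, r, _ in rows if r == fewest]


if __name__ == "__main__":
    print(f"computer stanzas: {percent_right(COMPUTER):.1f}% right")
    print(f"human stanzas:    {percent_right(HUMAN):.1f}% right")
    print(f"all stanzas:      {percent_right(COMPUTER + HUMAN):.1f}% right")
    print("computer stanzas that fooled the most judges:",
          most_deceptive(COMPUTER))
```

Run on the computer stanzas, `most_deceptive` returns stanzas 15 and 28, the two singled out in the text as particularly successful in tricking the judges.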

We have not yet reached the point at which computers can even arguably pass the originally proposed terminal-interview type of Turing test. This test requires a computer to master too many high-level cognitive skills in a single system for the computer of today to succeed. As Dan Dennett points out in his article, the unadulterated Turing test is far more difficult for a computer to pass than any more restricted version. We have, however, reached the point where computers can successfully imitate human performance within narrowly focused areas of human expertise. Expert systems, for example, are able to replicate the decision-making ability of human professionals within an expanding set of human disciplines. In at least one controlled trial, human chess experts were unable to distinguish the chess-playing style of more sophisticated computer chess players from that of humans. Indeed, computer chess programs are now able to defeat almost all human players, with the exception of a small and diminishing number of senior chess masters. Music composed by computer is becoming increasingly successful in passing the Turing test of believability. The era of computer success in a wide range of domain-specific Turing tests is arriving.

# Note

1. Four human poets were used: three famous poets (Percy Bysshe Shelley, T.S. Eliot, and William Carlos Williams) and one obscure poet (Raymond Kurzweil). In the case of the famous human poets, stanzas were selected from their most famous published work. In all cases, the stanzas selected did not require adjacent stanzas to make thematic or syntactic sense. The computer stanzas were written by the Kurzweil Cybernetic Poet after it had read poems written by these same human authors. The answers are as follows:

