The End of Handicaps, Part 2
August 6, 2001 by Ray Kurzweil
How technology has assisted and will continue to assist the disabled, written for “The Futurecast,” a monthly column in the Library Journal.
Originally published May 1992. Published on KurzweilAI.net August 6, 2001.
Last month’s Futurecast (LJ, April 15, p. 68, 70) discussed how advancing pattern recognition technology, aided by the ongoing semiconductor revolution, is rapidly ameliorating the handicaps associated with the disability of blindness. It also pointed out that in certain ways the communication barriers presented by hearing impairment have been more difficult to overcome than those for visual impairment. Again, the primary handicap has to do with language–in this case, the lack of access to spoken language.
There have been three approaches to overcoming this difficulty, each of which has made an important contribution but suffers from severe limitations. Lipreading–understanding verbal language by observing the movement of someone’s lips–is very difficult, and few deaf persons are able to accomplish this effectively. Moreover, it is not feasible from a distance, rendering it useless at most meetings and certainly impossible on the phone. Sign language is a rich language that can communicate verbal thoughts with the same speed and efficacy as spoken language. The disadvantage is obvious: very few people outside of the deaf community (which includes professionals who work with deaf persons) can “speak” or understand it.
Finally, there is TDD (telecommunications device for the deaf), which enables deaf people to “talk” on the phone by typing on a typewriter-layout keyboard and reading messages from the other party on a small screen. The limitations here also are significant. The speed of communication is limited to the slow speed of typing, and a deaf person can only hold such interchanges with others who have a compatible device. The result of all of these problems is that hearing-impaired persons are often cut off from basic communication.
An effective approach to this problem will be the listening machine, a sensory aid that will convert spoken language in real-time to a visual display of what someone is saying. As it turns out, early versions of this technology are not far off. What is required is the Holy Grail of speech recognition, combining the ability to recognize a large vocabulary (of tens of thousands of words), the ability to recognize speakers that the system has not been trained on (a capability called speaker independence), and the ability to handle continuous speech (as opposed to discrete speech, which consists of spoken words separated by brief pauses). Today, commercial speech-recognition systems can provide two of these three capabilities: large vocabularies and speaker independence, but only on discrete speech. There are continuous speech systems, but most are not fully speaker independent, and there are limitations on vocabulary size.
Fortunately, the accuracy of a speech-recognition system for this application does not need to be perfect. Users will be able to compensate for occasional errors from context. In fact, humans do this all the time. Our ability to understand human speech, as measured by standardized listening tests, decreases substantially if we listen to words in a random nonsense order as opposed to hearing words in a meaningful context. The reading machine also makes errors, the rate of which has, of course, greatly improved in the 16 years since the machine’s introduction.
Early versions of such a listening machine could leave in the discrete speech limitation. While this would not be useful to understand what someone is saying at most meetings, it could enable a deaf person to hold a conversation with a friend or loved one on the phone. The person at the other end would need to speak with brief pauses between words, but this is easily learned and is a much more rapid way of communicating than the slow typing speed of today’s TDDs. Also, the deaf person would be able to communicate with anyone who has a phone. No special equipment would be required other than the deaf person having his or her own listening machine.
Of far greater significance, however, will be the listening machine that can handle ordinary continuous speech. Algorithms for understanding large vocabularies of continuous speech exist, but they require several hundred MIPS (millions of instructions per second) of computer power, which is an order of magnitude greater than today’s PCs. On the other hand, with yet faster microprocessors on the way (the new “Alpha” chip recently announced by Digital Equipment Corporation provides 400 MIPS), the requisite computational resource will be available at reasonable cost within a couple of years. By that time, the necessary algorithmic refinement is also expected to be available. Early versions of a listening machine for the deaf that can handle continuous speech should be expected within several years, with the technology beginning to see active deployment later in this decade.
Seeing the words
Persons deaf from birth or a very early age also have difficulty speaking because of the lack of the vital auditory feedback that allows hearing persons to refine their articulation. Systems are being developed that will provide an effective visual display to show a meaningful picture of speech that can be used to teach deaf children how to speak. Early versions of such systems have already been used, but the visual feedback has not provided the most useful model possible. Meaningful information in human speech exists in the frequency domain, and translating human speech into a model that can be appreciated by a human being visually in real-time requires digital signal processing capabilities of substantial power.
It is only recently that such technology has become affordable, and a second generation of such visual feedback systems available over the next several years should be effective in substantially improving the development of these skills. Similar technology could also be integrated into the listening machines to provide a visual display of nonspeech sounds in addition to the transcription of speech.
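To make the idea of a frequency-domain picture concrete, here is a minimal sketch in Python of how a sound can be sliced into short frames and each frame converted into a row of frequency magnitudes, the raw material of a spectrogram-style display. It is an illustration only, not the method used by any of the systems mentioned above. The function names, the 8,000-samples-per-second rate, the 64-sample frame, and the 1,000 Hz test tone are all assumptions chosen for the example, and the transform is deliberately naive rather than fast.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Return the magnitudes of the first N/2 frequency bins of a real frame."""
    n = len(frame)
    mags = []
    for k in range(n // 2):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc) / n)
    return mags

def spectrogram(samples, frame_size=64, hop=32):
    """Cut the signal into overlapping Hann-windowed frames and transform each."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_size - 1))
              for i in range(frame_size)]
    rows = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_size)]
        rows.append(dft_magnitudes(frame))
    return rows

def dominant_frequency(mags, sample_rate, frame_size):
    """Convert the strongest bin in one row back to a frequency in Hz."""
    peak_bin = max(range(len(mags)), key=lambda k: mags[k])
    return peak_bin * sample_rate / frame_size

# Hypothetical test signal: a pure 1,000 Hz tone sampled at 8,000 Hz.
SAMPLE_RATE = 8000
FRAME_SIZE = 64
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(256)]

rows = spectrogram(tone, frame_size=FRAME_SIZE, hop=32)
peaks = [dominant_frequency(row, SAMPLE_RATE, FRAME_SIZE) for row in rows]
```

Each entry in `rows` is one time slice of the display, and `peaks` holds the strongest frequency in each slice. A practical system would use a fast Fourier transform and run continuously on microphone input, which is where the demand for substantial signal processing power comes from.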
If deaf persons can readily understand human speech in real-time and improve their ability to develop speech skills, the most salient handicaps of the hearing disability will be greatly ameliorated. The social, educational, and vocational challenges that now confront most deaf persons would be largely overcome.
The body electric
A principal physical handicap is paraplegia–the loss of control over the legs. The most common prosthetic aid for this disability is the wheelchair, which has changed only in subtle ways over the past two decades. It continues to suffer from its principal drawback, which is the inability to negotiate doorways and stairs. Although federal law (the Americans with Disabilities Act) now requires public buildings to accommodate wheelchair access, the reality is that access for persons in wheelchairs is still severely restricted.
By the end of this decade, we will see the first generation of effective exoskeletal robotic devices, sometimes called powered orthotic devices, which will restore the ability of paraplegic (and in some cases quadriplegic) persons to walk and climb stairs. These devices will be as easy to put on as a pair of tights and will be controlled by finger motion, head motion, speech, and perhaps eventually thoughts. Another option, one that has shown promise in experiments at MIT and a number of other research institutions, is direct electrical stimulation of muscles. This technique effectively reconnects the control link that was broken by spinal cord damage.
Other disabilities will benefit as well. Already, people without use of their hands can control their environment, create written documents, and interact with computers using voice recognition. Artificial hand prostheses controlled by voice, head movement, and perhaps eventually by direct mental connection will restore manual functionality. Progress is being made in intelligent courseware to treat dyslexia and learning disabilities and to provide richer learning experiences for the retarded.
Overcoming the handicaps associated with disabilities is an ideal application of artificial intelligence technology. In the development of intelligent computers, the threshold that we are now on is not the creation of cybernetic geniuses. Instead, we are providing computers with narrowly focused intelligent skills, e.g., the ability to make decisions in such areas as finance and medicine and the ability to recognize patterns such as printed letters, blood cells, and land terrain maps.
Most computers today are still idiot savants, capable of processing enormous amounts of information with relatively little intelligence. When one considers the enormous impact that these idiot savants have had on society, the addition of even sharply focused intelligence will be a formidable combination. It will be particularly beneficial for the disabled population. A disabled person is typically missing a specific skill or capability but is otherwise a normally intelligent and capable human being. There is a fortuitous matching of the narrowly focused intelligence of today’s intelligent machines and the narrowly focused deficit of most disabled persons. Our primary strategy in developing intelligent computer-based technology for sensory and physical aids is for the focused intelligence of the machine to work in concert with the much more flexible intelligence of the disabled person.
There are an estimated 20 million disabled Americans. Many are unable to learn or work up to their capacity because of technology that is not yet available, or technology that is available but not yet affordable, or pervasive and negative public attitudes toward disabled persons. As the reality changes, the perceptions will also change, particularly as formerly handicapped persons learn and work successfully alongside their nondisabled peers. By the end of the first decade of the next century, we will come to see “The End of Handicaps” as only a slight exaggeration.
Reprinted with permission from Library Journal, May 1992. Copyright © 1992, Reed Elsevier, USA