Automated ‘coach’ could help with social interactions
June 18, 2013
A software system called MACH (My Automated Conversation coacH) uses a computer-generated onscreen face, along with facial, speech, and behavior analysis and synthesis software, to simulate face-to-face conversations. It then provides users with feedback on their interactions.
Social phobias affect about 15 million adults in the United States, according to the National Institute of Mental Health, and surveys show that public speaking is high on the list of such phobias. For some people, these fears of social situations can be especially acute: For example, individuals with Asperger’s syndrome often have difficulty making eye contact and reacting appropriately to social cues. But with appropriate training, such difficulties can often be overcome.
The research was led by MIT Media Lab doctoral student M. Ehsan Hoque, who says the work could be helpful to a wide range of people. “Interpersonal skills are the key to being successful at work and at home,” Hoque says. “How we appear and how we convey our feelings to others define us. But there isn’t much help out there to improve on that segment of interaction.”
Many people with social phobias, Hoque says, want “the possibility of having some kind of automated system so that they can practice social interactions in their own environment. … They desire to control the pace of the interaction, practice as many times as they wish, and own their data.”
The MACH software offers all those features, Hoque says. In fact, in randomized tests with 90 MIT juniors who volunteered for the research, the software showed its value.
First, the test subjects — all of whom were native speakers of English — were randomly divided into three groups. Each group participated in two simulated job interviews, a week apart, with MIT career counselors.
But between the two interviews, unbeknownst to the counselors, the students received help: One group watched videos of interview advice, while a second group had a practice session with the MACH simulated interviewer, but received no feedback other than a video of their own performance. Finally, a third group used MACH and then saw videos of themselves accompanied by an analysis of such measures as how much they smiled, how well they maintained eye contact, how well they modulated their voices, and how often they used filler words such as “like,” “basically” and “umm.”
Evaluations by another group of career counselors showed statistically significant improvement by members of the third group on measures including “appears excited about the job,” “overall performance,” and “would you recommend hiring this person?” In all of these categories, by comparison, there was no significant change for the other two groups.
The software behind these improvements was developed over two years as part of Hoque’s doctoral thesis work with help from his advisor, professor of media arts and sciences Rosalind Picard, as well as Matthieu Courgeon and Jean-Claude Martin from LIMSI-CNRS in France, Bilge Mutlu from the University of Wisconsin, and MIT undergraduate Sumit Gogia.
Designed to run on an ordinary laptop, the system uses the computer’s webcam to monitor a user’s facial expressions and movements, and its microphone to capture the subject’s speech. The MACH system then analyzes the user’s smiles, head gestures, speech volume and speed, and use of filler words, among other things. The automated interviewer — a life-size, three-dimensional simulated face — can smile and nod in response to the subject’s speech and motions, ask questions and give responses.
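As a rough illustration of the speech side of that analysis, the sketch below counts filler words and estimates speaking rate from a transcript. It is not MACH's actual implementation: it assumes a text transcript and a recording duration are already available (for example, from a speech recognizer), the function name `speech_summary` is hypothetical, and the filler list contains only the three words named in this article.

```python
import re
from collections import Counter

# Filler words named in the article. A real system would need a longer list
# and some way to tell filler "like" apart from the verb or preposition.
FILLER_WORDS = {"like", "basically", "umm"}


def speech_summary(transcript: str, duration_seconds: float) -> dict:
    """Hypothetical helper: summarize filler-word use and speaking rate."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Lowercase the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", transcript.lower())
    fillers = Counter(w for w in words if w in FILLER_WORDS)
    return {
        "word_count": len(words),
        "filler_counts": dict(fillers),
        "filler_total": sum(fillers.values()),
        "words_per_minute": len(words) * 60.0 / duration_seconds,
    }


if __name__ == "__main__":
    sample = "Umm, I basically like working on teams, and, like, I learn fast."
    print(speech_summary(sample, duration_seconds=6.0))
```

Because this naive count treats every "like" as a filler, it overstates filler use in sentences such as the sample one, where the first "like" is a verb. Measures such as smiles, head gestures, and voice modulation would require video and audio processing that this sketch does not attempt.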
“While it may seem odd to use computers to teach us how to better talk to people, such software plays an important [role] in more comprehensive programs for teaching social skills [and] may eventually play an essential step in developing key interpersonal skills,” says Jonathan Gratch, a research associate professor of computer science and psychology at the University of Southern California who was not involved in this research. “Such programs also offer important advantages over the human role-players often used to teach such skills. They can faithfully embody a specific theory of pedagogy, and thus can be more consistent than human role-players.”
One reason the automated system’s feedback is effective, Hoque believes, is precisely that it’s not human: “It’s easier to tell the brutal truth through the [software],” he says, “because it’s objective.”
While this initial implementation was focused on helping job candidates, Hoque says training with the software could be helpful in many kinds of social interactions.
After finishing his doctorate in media arts and sciences this summer, Hoque will become an assistant professor of computer science at the University of Rochester in the fall.