Bilingual, Bicultural ‘Roboceptionist’

October 19, 2010

The roboceptionist Hala exhibits facial gestures, and the monitor turns side-to-side like a head. Hala also has a backstory, a history and a personality to encourage people to converse with her. (College of Social and Behavioral Sciences)

Researchers at the University of Arizona and Carnegie Mellon University are working to create a robot receptionist. What makes the effort novel is that the “roboceptionist” is a bilingual and bicultural computer with a face and a natural language interface.

A three-year, $1 million grant from the Qatar National Research Foundation is funding basic advances in human-computer interaction. Majd Sakr, a professor at Carnegie Mellon University in Qatar, is the principal investigator on the grant.

Reid Simmons, a professor at Carnegie Mellon’s Robotics Institute in Pittsburgh, and Sandiway Fong, a UA associate professor of linguistics and computer science, are the co-PIs. Carnegie Mellon University will provide the robotics innovations, while the UA will supply the language technology.

Fong spent part of his sabbatical leave in 2009-10 at Carnegie Mellon’s Qatar campus, where an existing version of the roboceptionist, called Hala, greets visitors in English and Arabic.

“They invited me to join the team because they saw opportunities to generalize and improve the limited language capabilities of Hala,” said Fong. “They needed Hala to be more flexible in dealing with language input from users. And that is where the University of Arizona comes in. We will provide both the language-specific and inter-language-related cultural capabilities so that this robot can be not just bilingual, but bicultural.”

Fong said a bicultural robot is not one that merely switches between English and Arabic, Hala’s current format, but also has both modes simultaneously active in order to spot and deal with potential cultural ambiguities and misunderstandings.

“You may speak Arabic, but you may choose to converse with the robot in English,” said Fong. “You may be conversing with the sensibility and the cultural background and the idioms from the Arabic world. This robot needs to understand both.”

The phrase “week after week,” as in “I’m looking for the group that meets week after week,” means “every week” in English. But in some Arabic dialects it can mean “every other week.” Only a robot that is simultaneously facile with both lexicons can detect that this phrase is subject to cultural variation and ask the user for clarification.
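The check described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual method: the two lexicons, their single entry, and the function names are hypothetical, and a real system would work on parsed input rather than substring matches.

```python
# Hypothetical toy lexicons: phrase -> reading under each cultural background.
ENGLISH_READINGS = {"week after week": "every week"}
ARABIC_DIALECT_READINGS = {"week after week": "every other week"}


def find_ambiguous_phrases(utterance):
    """Return (phrase, english_reading, arabic_reading) where the readings differ."""
    text = utterance.lower()
    conflicts = []
    for phrase, english_reading in ENGLISH_READINGS.items():
        arabic_reading = ARABIC_DIALECT_READINGS.get(phrase)
        if (
            phrase in text
            and arabic_reading is not None
            and arabic_reading != english_reading
        ):
            conflicts.append((phrase, english_reading, arabic_reading))
    return conflicts


def clarification_question(utterance):
    """Build a follow-up question for the first ambiguous phrase, or None."""
    conflicts = find_ambiguous_phrases(utterance)
    if not conflicts:
        return None
    phrase, first, second = conflicts[0]
    return f'By "{phrase}", do you mean {first} or {second}?'


print(clarification_question("I'm looking for the group that meets week after week"))
```

Both lexicons are consulted on every utterance, regardless of the language the visitor is speaking, which is the "simultaneously active" behavior Fong describes.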

Culture affects not only the syntax and semantics of an interaction, but also its structure, from the way greetings and closings are performed, to the form of the language used, to politeness strategies.

“In American culture, we quickly greet someone and then we tend to get down to business,” said Fong. “In Arab cultures, it is rude to actually get down to business right away. There is much more turn-taking in greetings. Hala will know this.”

Hala, who has a built-in backstory and personality, will adjust responses based on cues from the visitor, essentially building a model of the user throughout the interaction: Is this person high-status? Is this person American? Is this person in a hurry?
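A user model of this kind might look something like the following sketch. The three fields come from the questions above; the cue names, the update rules, and the number of greeting turns are invented for illustration and say nothing about how Hala is actually implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserModel:
    """What the robot currently believes about the visitor (None = unknown)."""
    high_status: Optional[bool] = None
    american: Optional[bool] = None
    in_hurry: Optional[bool] = None

    def update(self, cue: str) -> None:
        # Hypothetical cue names; a real system would derive cues from
        # speech, language choice, and other sensor input.
        if cue == "formal_title":
            self.high_status = True
        elif cue == "quick_greeting":
            self.american = True
        elif cue == "extended_greeting":
            self.american = False
        elif cue == "seems_rushed":
            self.in_hurry = True


def greeting_turns(user: UserModel) -> int:
    """Pick how many greeting exchanges to offer before getting to business."""
    if user.in_hurry or user.american:
        return 1  # get down to business quickly
    if user.american is False:
        return 3  # allow more turn-taking in the greeting
    return 2      # nothing known yet: take a middle course


visitor = UserModel()
visitor.update("extended_greeting")
print(greeting_turns(visitor))
```

The model starts with every field unknown and is revised as cues arrive, so the robot's behavior can change partway through a conversation.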

About two-thirds of the grant will support research activity in Qatar. Part of that money may be available for UA students to travel to Qatar to work on the project as interns. The remaining one-third will directly support work in Pittsburgh and Tucson.

“This is wonderful recognition of Sandiway’s work in this area and a great opportunity to showcase our growing strengths in computational linguistics and human language technology,” said Michael Hammond, head of the UA linguistics department. “We’re hopeful that this is only the beginning of a productive relationship with Carnegie Mellon and Qatar.”

Fong has hired an incoming linguistics graduate student, Samantha Wray, who has lived in Yemen and speaks Arabic, to help. Wray is working on creating a database of questions to train the robot.

Most natural language parsers, programs that analyze the syntactic structure of sentences, are trained on what is known as a treebank, a corpus of parsed sentences, generally drawn from newswire text. News stories, however, do not usually contain questions, greetings or dialogue. Wray has found transcripts of interviews from the Arabic-language television network Al Jazeera that could be mined for additional corpus material. The researchers will diagram, parse and generalize the data to help “teach” the robot how to understand human input in both Arabic and English.
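For readers unfamiliar with treebanks, the sketch below shows what one parsed sentence can look like and how a program might read it. The bracketed question is an illustrative example in the style of the Penn Treebank, not data from the project, and the helper functions are hypothetical.

```python
def parse_bracketed(text):
    """Read a bracketed parse into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "(", "each constituent starts with '('"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())       # nested constituent
            else:
                children.append(tokens[pos])  # a word of the sentence
                pos += 1
        pos += 1  # consume ')'
        return (label, children)

    return read()


def leaves(tree):
    """Collect the words of the sentence in order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


# Illustrative parse of a question, the kind of sentence newswire text rarely contains.
QUESTION = "(SBARQ (WHADVP (WRB Where)) (SQ (VBZ is) (NP (DT the) (NN library))) (. ?))"

tree = parse_bracketed(QUESTION)
print(tree[0])       # top-level label of the parse
print(leaves(tree))  # the words of the original question
```

A parser trained only on declarative news sentences would see few structures like this one, which is why transcripts containing questions and greetings are useful additional training material.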

“I’m very excited to be part of the project because I think that in developing these sorts of programs and in working with these sorts of robots, we eventually learn more about how we as humans function, how we learn and how we interact on a communicative level,” said Wray.

Fong has plans to hire another student and to provide internships for students in the master’s program in Human Language Technology that he directs.

The possible applications of this work for other human-computer interaction systems are also driving the research. Fong thinks the technology could be applicable to computer help-agents, multicultural information kiosks, tour guides and automated international call centers.

Said Fong: “We foresee a future in which robots will help bridge gaps between people of different cultures, acting as intermediaries, to enable them to communicate more naturally and effectively.”

Adapted from materials from the University of Arizona