Virtual robots teach each other Pac-Man and StarCraft video games

Teaching physical robots and humans planned
April 4, 2014

Pac-Man training in progress (credit: Washington State University)

Researchers in Washington State University’s School of Electrical Engineering and Computer Science have developed a method to allow a computer to give advice and teach skills to another computer in a way that mimics how a real teacher and student might interact.

The paper by Matthew E. Taylor, WSU’s Allred Distinguished Professor in Artificial Intelligence, was published online in the journal Connection Science.

The researchers had the “agents” (virtual robots) act like true student and teacher pairs: student agents struggled to learn Pac-Man and a version of the StarCraft video game. The researchers were able to show that the student agent learned the games and, in fact, surpassed the teacher.

In their study, the researchers programmed their teaching agent to focus on telling a student when to act. The trick is in knowing when the teacher should give advice. If it gives no advice, it is not teaching. But if it always gives advice, the student gets annoyed and doesn't learn to outperform the teacher. "We designed algorithms for advice giving, and we are trying to figure out when our advice makes the biggest difference," Taylor says.
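To make the budgeting idea concrete, here is a minimal Python sketch of one plausible advice-giving rule. It is not the paper's exact algorithm: the BudgetedTeacher class, its threshold parameter, and the importance heuristic (advise only in states where the teacher's own action values diverge sharply) are illustrative assumptions.

import numpy as np

class BudgetedTeacher:
    """Advise a student only in 'important' states, until a fixed
    advice budget runs out. Illustrative sketch, not the paper's code."""

    def __init__(self, q_table, budget=100, threshold=1.0):
        self.q = q_table            # teacher's own state -> action-value estimates
        self.budget = budget        # total pieces of advice the teacher may give
        self.threshold = threshold  # how much the action choice must matter

    def advise(self, state):
        """Return the teacher's preferred action, or None to stay silent."""
        if self.budget <= 0:
            return None             # budget exhausted: the student is on its own
        values = self.q[state]
        # Importance heuristic: spread between best and worst action values.
        if values.max() - values.min() < self.threshold:
            return None             # action choice barely matters here; save the budget
        self.budget -= 1
        return int(values.argmax())

# Example: a toy two-state, three-action teacher.
teacher = BudgetedTeacher({0: np.array([0.1, 0.2, 0.15]),
                           1: np.array([0.0, 5.0, -1.0])}, budget=10)
assert teacher.advise(0) is None    # low-importance state: stay silent
assert teacher.advise(1) == 1       # high-importance state: spend one piece of advice

Under a rule like this, a small budget gets spent only in the states where the choice of action matters most, which is exactly the "when to advise" question Taylor describes.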

He aims to develop a curriculum for the agents that starts with simple tasks and builds to more complex ones.

Teaching physical robots

“While we currently focus on teaching in video games, our methods can be extended to physical robots as well,” Taylor told KurzweilAI. “This ability is critical as robots become more common. Once one robot has learned about its environment, humans’ preferences, the jobs it needs to perform, etc., we don’t want to lose this knowledge if we upgrade to a new robot. Furthermore, if additional robots are added to the same environment, we don’t want the new robots to have to learn from the beginning — they should be able to learn faster by leveraging the knowledgeable robot.”

Other researchers have allowed one robot to teach another, but they typically make strong assumptions about robot similarity, said Taylor. “If the robots are similar enough, the knowledge can be copied directly from one robot to another, or some type of transfer learning algorithm can be used to modify the data.

“Our goal is to allow the teacher and student robot to be very dissimilar and make no assumptions about the internal knowledge representation. Because of this flexibility, we are currently extending our methods so that an agent teacher can help a human student. The long-term goal is to develop a single framework to allow teaching between: 1) robot-robot, 2) robot-human, 3) human-robot.”
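Since the teacher communicates only through suggested actions, the interaction loop can stay representation-agnostic. The sketch below, reusing the hypothetical BudgetedTeacher above, shows one way this might look; the env, student, and their method names are assumptions for illustration, not the researchers' API.

def run_episode(env, student, teacher):
    """One learning episode in which a teacher may inject advice.
    Only (observation, suggested action) crosses between teacher and
    student; no Q-tables, weights, or internal state are shared."""
    obs = env.reset()
    done = False
    total_reward = 0.0
    while not done:
        advice = teacher.advise(obs)          # None once the budget is spent
        action = advice if advice is not None else student.choose_action(obs)
        next_obs, reward, done = env.step(action)   # assumed 3-tuple environment API
        student.update(obs, action, reward, next_obs)  # student learns in its own representation
        total_reward += reward
        obs = next_obs
    return total_reward

Because the student only ever receives actions it could have chosen itself, the teacher's internals can differ arbitrarily from the student's, which is what allows very dissimilar agents, or eventually a human, on either side of the exchange.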

Taylor expects to complete this framework in 5 to 10 years. "It depends in large part on how quickly robots with the ability to learn are deployed commercially."

The work was funded in part by the National Science Foundation (NSF). Taylor also recently received an NSF grant to use ideas from dog training to train robotic agents. Eventually, he hopes to develop a better way for people to teach their robotic agents, as well as for robots to teach people.

The code for the teaching method is freely available online.


Abstract of Connection Science paper

This article introduces a teacher–student framework for reinforcement learning, synthesising and extending material that appeared in conference proceedings [Torrey, L., & Taylor, M. E. (2013). Teaching on a budget: Agents advising agents in reinforcement learning. Proceedings of the International Conference on Autonomous Agents and Multiagent Systems] and in a non-archival workshop paper [Carboni, N., & Taylor, M. E. (2013, May). Preliminary results for 1 vs. 1 tactics in StarCraft. Proceedings of the Adaptive and Learning Agents Workshop (at AAMAS-13)]. In this framework, a teacher agent instructs a student agent by suggesting actions the student should take as it learns. However, the teacher may only give such advice a limited number of times. We present several novel algorithms that teachers can use to budget their advice effectively, and we evaluate them in two complex video games: StarCraft and Pac-Man. Our results show that the same amount of advice, given at different moments, can have different effects on student learning, and that teachers can significantly affect student learning even when students use different learning methods and state representations.