How to ‘talk’ to your computer or car with hand or body poses

July 10, 2017

Researchers at Carnegie Mellon University’s Robotics Institute have developed a system that can detect and understand body poses and movements of multiple people from a video in real time — including, for the first time, the pose of each individual’s fingers.

The ability to recognize finger or hand poses, for instance, will make it possible for people to interact with computers in new and more natural ways, such as simply pointing at things.
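To give a flavor of what such an interface involves, here is a minimal sketch (not the CMU researchers' code) of estimating where a user is pointing from two index-finger keypoints. The 21-point hand layout and the keypoint indices are assumptions based on common hand-pose conventions:

```python
import numpy as np

# Assumed indices in a common 21-point hand layout (illustrative only).
INDEX_BASE = 5   # index-finger base joint
INDEX_TIP = 8    # index fingertip

def pointing_ray(hand_keypoints: np.ndarray):
    """Return (origin, unit direction) of the ray along the index finger."""
    base = hand_keypoints[INDEX_BASE]
    tip = hand_keypoints[INDEX_TIP]
    direction = tip - base
    norm = np.linalg.norm(direction)
    if norm < 1e-6:
        return None  # keypoints too close together to define a direction
    return tip, direction / norm

def pointed_target(hand_keypoints, targets):
    """Pick the on-screen target closest to the finger's pointing ray."""
    ray = pointing_ray(hand_keypoints)
    if ray is None:
        return None
    origin, d = ray
    best, best_dist = None, float("inf")
    for name, pos in targets.items():
        v = np.asarray(pos, dtype=float) - origin
        t = v @ d
        if t <= 0:
            continue  # target lies behind the fingertip
        dist = np.linalg.norm(v - t * d)  # perpendicular distance to the ray
        if dist < best_dist:
            best, best_dist = name, dist
    return best
```

For example, with `targets = {"play": (900, 120), "stop": (900, 300)}` and a detected hand, `pointed_target` would return whichever button the index finger is aimed at.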

That will also allow robots to perceive what you're doing, what mood you're in, and whether you can be interrupted, for example. A self-driving car could get an early warning that a pedestrian is about to step into the street by monitoring body language. The technology could also be used for behavioral diagnosis and rehabilitation for conditions such as autism, dyslexia, and depression, the researchers say.

This new method was developed at CMU’s NSF-funded Panoptic Studio, a two-story dome embedded with 500 video cameras, but the researchers can now do the same thing with a single camera and laptop computer.

The researchers have released their computer code. It’s already being widely used by research groups, and more than 20 commercial groups, including automotive companies, have expressed interest in licensing the technology, according to Yaser Sheikh, associate professor of robotics.
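Tools in this space typically emit per-frame keypoint files that downstream applications parse. As a hedged sketch only (the JSON layout, field names, and file name below are assumptions modeled on a common pose-output format, not a documented interface of the released code), reading one frame might look like this:

```python
import json
import numpy as np

def load_people(json_path: str):
    """Parse one frame of pose output into per-person (N, 3) keypoint arrays.

    Assumes a "people" array whose entries hold a flat
    [x1, y1, c1, x2, y2, c2, ...] list of (x, y, confidence) triples.
    """
    with open(json_path) as f:
        frame = json.load(f)
    people = []
    for person in frame.get("people", []):
        flat = person.get("pose_keypoints_2d", [])
        people.append(np.asarray(flat, dtype=float).reshape(-1, 3))
    return people

if __name__ == "__main__":
    # Hypothetical file name; counts confidently detected keypoints per person.
    for i, kp in enumerate(load_people("frame_000000_keypoints.json")):
        visible = (kp[:, 2] > 0.3).sum()  # keypoints above a confidence cutoff
        print(f"person {i}: {visible} confident keypoints")
```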

Tracking multiple people in real time, particularly in social situations where they may be in contact with each other, presents a number of challenges. Sheikh and his colleagues took a bottom-up approach, which first localizes all the body parts in a scene — arms, legs, faces, etc. — and then associates those parts with particular individuals.
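A toy sketch can make that detect-then-associate order concrete. The real method scores part pairs with learned association measures; the version below (entirely illustrative) substitutes raw pixel distance as the compatibility score and greedily attaches detected wrists to detected necks:

```python
import numpy as np

def greedy_associate(necks, wrists, max_dist=120.0):
    """Toy bottom-up grouping: attach each detected wrist to the nearest
    unclaimed neck, skipping pairs that are implausibly far apart.

    Real systems use learned pairwise affinities rather than raw distance;
    this only illustrates detecting parts first, then grouping them.
    """
    necks = [np.asarray(n, dtype=float) for n in necks]
    people = [{"neck": n, "wrists": []} for n in necks]
    # Score every (wrist, neck) pair and consider the best matches first.
    pairs = sorted(
        (np.linalg.norm(np.asarray(w, dtype=float) - n), wi, ni)
        for wi, w in enumerate(wrists)
        for ni, n in enumerate(necks)
    )
    claimed = set()
    counts = [0] * len(necks)
    for dist, wi, ni in pairs:
        if dist > max_dist or wi in claimed or counts[ni] >= 2:
            continue  # too far, wrist already assigned, or person already has 2
        people[ni]["wrists"].append(wrists[wi])
        claimed.add(wi)
        counts[ni] += 1
    return people
```

The appeal of the bottom-up order is that detection cost does not grow with the number of people in the scene: all parts are found in one pass, and only the grouping step reasons about individuals.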

Sheikh and his colleagues will present reports on their multiperson and hand-pose detection methods at CVPR 2017, the Conference on Computer Vision and Pattern Recognition, July 21–26 in Honolulu.