MIT researchers have developed a form of artificial intelligence (AI) capable of anticipating human interactions by training it to learn social cues from YouTube videos. It is hoped that the development will further progress in the field of computer vision, specifically towards creating intelligent systems capable of predicting events before they occur.
Researchers from MIT's Computer Science and Artificial Intelligence Laboratory created a deep-learning algorithm that they trained with 600 hours of television shows downloaded from YouTube, including The Office, Desperate Housewives and The Big Bang Theory. That equates to 25 straight days of binge-watching.
The TV shows were selected based on their ranking on Google and "generally consisted of people performing a large variety of everyday actions" and interacting with other people.
The AI was then tasked with predicting human interactions, including kissing, hugging and shaking hands, a second before they happened on-screen. While it guessed correctly only 43% of the time, compared to a 71% accuracy rate for humans, it still managed to beat the 36% success rate set by previous attempts.
Creating machines that can anticipate the needs of humans before they arise presents a significant challenge in the field of computer vision. It is believed that doing so would make it possible to create smart home assistants that could better predict the intentions of their owners.
Carl Vondrick, lead researcher for the MIT paper, told the Associated Press: "It could help a robot move more fluidly through your living space. The robot won't want to start pouring milk if it thinks you're about to pull the glass away."
The same technology could be used in surveillance, theoretically making it possible to create security systems that anticipate when an accident or crime is about to occur and contact emergency services beforehand, as envisioned by Philip K Dick in his 1956 short story The Minority Report.
"If you can predict that someone's about to fall down or start a fire or hurt themselves, it might give you a few seconds' advance notice to intervene," said Vondrick.
The researchers will continue to train the artificial intelligence with video in the hope that it will become more accurate over time.