OpenAI, the $1bn non-profit backed by Tesla CEO Elon Musk, has unveiled a new program that trains robots to complete a specific task after watching a person demonstrate it just once. The artificial intelligence research company developed a new algorithm called one-shot imitation learning that lets researchers communicate a task to an AI by first performing it in virtual reality, teaching it to replicate the physical action.
The proof-of-concept system, revealed on Tuesday (16 May), uses two neural networks, a vision network and an imitation network, to train a robot to mimic a task performed in VR.
A person wearing a VR headset and holding a controller first demonstrates the action - in this case stacking blocks.
The vision network then uses an image from a camera to interpret the environment and determine the positions of the objects. This network is trained on hundreds of thousands of simulated images with varied lighting, textures and objects, and has never been trained on a real-world image or video.
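The idea of training only on simulated images whose appearance is deliberately varied can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not OpenAI's code: the names `render_scene` and `make_dataset`, the 16x16 greyscale grid and the square blocks are all hypothetical stand-ins for a real simulator.

```python
import random


def render_scene(rng, size=16, n_blocks=3, block=3):
    """Render one toy top-down scene with randomised appearance.

    Returns (image, positions): a size x size grid of pixel values in
    [0, 1] and the top-left (row, col) of each block, which would be
    the training label for a vision network.
    """
    brightness = rng.uniform(0.5, 1.5)   # random global lighting
    background = rng.random()            # random table shade
    image = [
        [min(1.0, max(0.0, (background + rng.gauss(0, 0.05)) * brightness))
         for _ in range(size)]
        for _ in range(size)
    ]
    positions = []
    for _ in range(n_blocks):
        row = rng.randrange(size - block + 1)
        col = rng.randrange(size - block + 1)
        shade = rng.random()             # random block texture
        for r in range(row, row + block):
            for c in range(col, col + block):
                image[r][c] = min(1.0, shade * brightness)
        positions.append((row, col))
    return image, positions


def make_dataset(n_images, seed=0):
    """Build a list of (image, positions) pairs from simulated scenes only."""
    rng = random.Random(seed)
    return [render_scene(rng) for _ in range(n_images)]


dataset = make_dataset(100)
print(len(dataset), "simulated scenes, none from a real camera")
```

Because every scene differs in lighting, shade and layout, a network trained on such data cannot rely on any one appearance, which is the intuition behind expecting it to cope with real camera images it has never seen.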
"After training, the network can find the blocks in the physical world even though it's never seen images from a real camera before," OpenAI researcher Josh Tobin said in a video demonstrating the imitation algorithm in action.
The second network, the imitation network, then takes over: it observes the demonstrated task and infers both the intent of the action and the steps a human would have taken to complete it in the same situation. After learning from the demonstration, the robot (as seen in the video) stacks a real set of blocks even though they are positioned differently from those in the demo.
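The difference between copying movements and inferring intent can be shown with a small Python sketch. It is a hand-written stand-in, not the learned imitation network the article describes: the functions `infer_intent` and `replay`, the block names and the coordinates are all hypothetical, and it assumes every block starts alone on the table.

```python
def infer_intent(start, moves):
    """Turn a demonstration into 'put X on Y' goals.

    start: dict mapping block name -> (x, y) table position.
    moves: list of (block, target_xy) steps from the demonstration.
    Assumes each moved block starts alone on the table.
    """
    position = dict(start)
    top = {xy: name for name, xy in start.items()}  # topmost block per spot
    intent = []
    for block, target_xy in moves:
        intent.append((block, top[target_xy]))      # block goes on whatever is there
        top.pop(position[block], None)              # block leaves its old spot
        position[block] = target_xy
        top[target_xy] = block
    return intent


def replay(intent, new_start):
    """Re-plan the same goals for a scene with different block positions."""
    position = dict(new_start)
    moves = []
    for block, support in intent:
        target_xy = position[support]               # wherever the support is now
        moves.append((block, target_xy))
        position[block] = target_xy
    return moves


demo_start = {"A": (0, 0), "B": (2, 0), "C": (4, 0)}
demo_moves = [("B", (0, 0)), ("C", (0, 0))]         # stack B on A, then C on B
intent = infer_intent(demo_start, demo_moves)

new_start = {"A": (5, 1), "B": (1, 3), "C": (3, 3)}
print(replay(intent, new_start))
```

The replayed moves use different coordinates from the demonstration but build the same tower, which mirrors the claim that one demonstration can be reproduced from different starting positions.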
"Our robot has now learned to perform the task even though its movements have to be different from the ones in the demonstration," Tobin said. "With a single demonstration of a task, we can replicate it in a number of different initial conditions. Teaching the robot how to build a different block arrangement requires only a single additional demonstration."
"Nothing in our technique is specific to blocks," Tobin said. "This system is an early prototype, that will form the backbone of the general-purpose robotics systems we're developing here at OpenAI."
The long-term goal is to teach AI systems new tasks and behaviours quickly through imitation, and to explore how they cope with more general tasks such as household chores.
"Infants are born with the ability to imitate what other people do," Tobin said. "Imitation allows humans to learn new behaviours rapidly. We would like our robots to be able to learn this way too."