Researchers have shed light on a worrying loophole that could be used by hackers to take control of your smartphone using its built-in voice recognition software. By burying mangled voice commands within YouTube videos, a team of university professors in the US found that they could instruct a nearby device to carry out potentially harmful actions.
Voice interfaces have become a common feature on modern smartphones, allowing us an extra level of convenience in diary planning, hands-free messaging or just settling pub debates on pointless trivia. Yet while the likes of Siri, Cortana and Google Now have made our smartphones smarter, they have also unwittingly added a whole new level of vulnerability.
The risk of smartphones being hijacked through voice commands has been highlighted before, most notably by French infosec agency ANSSI in 2015. It demonstrated how Siri voice commands could be silently triggered from up to 16 feet away by sending radio waves to an iPhone with a headset plugged into it.
The latest research shows it does not even have to be that complicated, and that a smartphone can be instructed to carry out actions through a rudimentary YouTube video.
In a video demonstration, researchers from Georgetown University and the University of California, Berkeley, were able to activate an Android smartphone using the "OK Google" trigger. The audio command had been mangled to make it less intelligible to human ears, yet could be understood easily by the software of the nearby phone. From there, the researchers were able to make the device open web pages and turn settings on and off.
This method could be used to carry out attacks of varying severity, from a simple denial of service by activating a device's aeroplane mode, to posting a user's location information on social media, or directing the phone to a harmful web page infected with malware (which could, of course, result in a more complete take-over of the target device). While the voice commands are fairly intelligible in the video demonstration, the researchers add that anyone with an in-depth knowledge of speech recognition systems could construct hidden voice commands that humans could not decipher at all.
What makes this vulnerability even more dangerous is the fact that it requires very little technical know-how to exploit. "Hidden voice commands can be constructed even with very little knowledge about the speech recognition system," the research paper explained. "We provide a general attack procedure for generating commands that are likely to work with any modern voice recognition system."
The paper concludes that better speaker recognition tools, or voice authentication, should be included within smartphone voice interfaces so that they can only be controlled by the device's owner. It also proposes the implementation of "filters" that ensure only clear voice commands are executed by the phone.
The team's findings are due to be presented in August at the USENIX Security Symposium in Austin, Texas.