Windows Vista Speech Recognition Remote Execution Flaw

George Ou of Real World IT on the ZDNet blogs posted about a remote execution flaw in the speech recognition software that comes with Windows Vista. If you have a microphone and speakers hooked up, a sound file could execute commands on your computer, because the software sits there waiting for commands. A remote computer or website could therefore host sound files containing delete commands, etc., which would run on your computer because they play through the speakers and are then picked up by the microphone.

I recorded a sound file that would engage speech command on Vista, then engaged the start button, and then I asked for the command prompt. When I played back the sound file with the speakers turned up loud, it actually engaged the speech command system and fired up the start menu. I had to try a few more times to get the audio recording quality high enough to get the exact commands I wanted but the shocking thing is that it worked! Anyone that’s ever visited MySpace knows how many annoying web pages out there that will start blasting loud MP3 music as soon as they enter the page.

Source: Vista Speech Command exposes remote exploit

Microsoft eventually responded, saying that an attack would still be limited by the user’s access rights and that User Account Control would prevent it from executing any administrative-level commands.

In order for an attack to be successful, the user would have to have a microphone and speakers connected to their system. In addition, the user would have had to configure the speech recognition feature. The attacker’s audio file would then issue verbal commands via the systems speakers that could potentially be carried out by the speech recognition feature. Based on the initial investigation, Microsoft recommends customers take the following action to protect themselves from potential exploitation of the reported vulnerability:

A user can turn off their computer speakers and/or microphone.
If a user does run an audio file that attempts to execute commands on their system, they should close the Windows Media Player, turn off speech recognition and restart their computer.

Source: Microsoft confirms Vista Speech Recognition remote execution flaw

George has since done more testing to see just what he could do without administrative privileges, and he was able to delete files and empty the Recycle Bin. Talk about self-service.

I’ve also done some further experimentation that this exploit can be very nasty even if it can’t execute with administrative privileges or bypass UAC. I have verified that I can create a sound file that can wake Vista speech recognition, open Windows Explorer, delete the documents folder, and then empty the trash. Then we have to consider the fact that people do leave many web pages open over night and some of those may have rotating flash ads that can play sounds. If that’s not a serious exploit, I don’t know what is. One can always rebuild system files by reinstalling the Operating System, data files can’t be recovered since the vast majority of people don’t backup.

While he did prove that this is possible, he did it on his own computer, so if he has been using the software, it is trained to his voice. The real test will be whether someone can do it on someone else’s computer, and even then only if the target is using speakers rather than a headset, as most people probably use a headset. This could be an issue if the speech recognition software will recognize anyone’s voice and execute the commands, but somehow I doubt that it will. Now, if a hacker uses one of those voice-over guys, we could be in trouble…

Added: Microsoft has chimed in on this flaw on the Microsoft Security Response Center Blog, saying that there is little reason to worry about the effects this could have on Windows Vista.

In order for the attack to be successful, the targeted system would need to have the speech recognition feature previously activated and configured. Additionally the system would need to have speakers and a microphone installed and turned on. The exploit scenario would involve the speech recognition feature picking up commands through the microphone such as “copy”, “delete”, “shutdown”, etc. and acting on them. These commands would be coming from an audio file that is being played through the speakers. Of course this would be heard and the actions taken would be visible to the user if they were in front of the PC during the attempted exploitation. It is not possible through the use of voice commands to get the system to perform privileged functions such as creating a user without being prompted by UAC for Administrator credentials. The UAC prompt cannot be manipulated by voice commands by default. There are also additional barriers that would make an attack difficult including speaker and microphone placement, microphone feedback, and the clarity of the dictation.

Source: Issue regarding Windows Vista Speech Recognition

The main reason this is possible in Windows Vista and not in any of the previous operating systems is that Vista comes with speech recognition built in, which makes for easier operation and extends support to people who have dexterity difficulties or impairments.