It's light years away from actual usefulness. I've said it before: current systems lack contextual speech recognition.
For instance, if it's assumed that I speak English and I say:
- The three kids were going to school
It's obvious that "three" should never be interpreted as "tree" here, yet even Siri has trouble with near-homophones like these unless they're pronounced in the most flawless English on Earth.
Languages like Japanese have even more homophones, which makes context matter even more.
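
To make the "three" versus "tree" point concrete, here is a rough sketch of what using context could look like: when a word is acoustically ambiguous, pick the candidate that fits best with its neighbours. Everything in it is an assumption made up for illustration (the score table, the `CONFUSABLE` set, the function names); it is not how Siri or any real recognizer works, and real systems use far larger language models.

```python
# Toy context scores: how plausible a word is right after the previous word.
# These numbers are invented for illustration, not measured from any corpus.
BIGRAM_SCORE = {
    ("the", "tree"): 0.6,
    ("the", "three"): 0.4,
    ("three", "kids"): 0.9,
    ("tree", "kids"): 0.01,
}
DEFAULT_SCORE = 0.1  # fallback for word pairs the toy table does not list

# Words the recognizer cannot tell apart from sound alone (hypothetical set).
CONFUSABLE = {"tree": ["tree", "three"], "three": ["tree", "three"]}


def pair_score(prev, word):
    """Look up how well `word` follows `prev` in the toy table."""
    return BIGRAM_SCORE.get((prev, word), DEFAULT_SCORE)


def pick_by_context(words):
    """Replace each confusable word with the candidate that best fits its neighbours."""
    result = list(words)
    for i, word in enumerate(result):
        candidates = CONFUSABLE.get(word)
        if not candidates:
            continue

        def fit(candidate):
            # Multiply the score with the previous word and with the next word.
            score = 1.0
            if i > 0:
                score *= pair_score(result[i - 1], candidate)
            if i + 1 < len(result):
                score *= pair_score(candidate, result[i + 1])
            return score

        result[i] = max(candidates, key=fit)
    return result


heard = "the tree kids were going to school".split()
print(" ".join(pick_by_context(heard)))
```

With these made-up scores, "tree" on its own after "the" is left alone, but "tree" followed by "kids" gets corrected to "three", because the following word makes "tree" implausible.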
In my opinion, voice recognition will never get there until electronics are able to observe and comprehend our surroundings. For instance, the Xbox should know that when I look at the TV and shout "Xbox", I mean to activate Kinect's voice command feature.