'Computer, on screen!': A look at Google's voice recognition engine

Google's voice recognition technology entered the mobile sector with voice-powered search applications for iPhone, Android and BlackBerry. Naturally, Google's own mobile operating system, Android, has begun to reap special benefits from the powerful technology through some new voice-enabled features.

Yesterday, an unforced update to Android's native Google Maps application endowed the software with speech recognition capabilities. Addresses, business names, and attractions can all be searched by spoken word. The app is now one of several that tap into Google's speech recognition engine. Another is Voice Text for Android, a voice-to-text app that recently turned up in the Android Market and lets users dictate text messages.

While Android is inching toward full voice command, it has certainly not reached the point where natural speech is easily transcribed. The engine requires a slower, more deliberate vocal cadence, which is easy to get accustomed to, but accuracy suffers even when the speaker's diction is conscientious. Rare and odd words are easily detected, but the most common ones are consistently misrecognized, especially if they contain fricative phonemes.

Of the approximately 42 phonemes in American English, fricatives constitute a small but prevalent handful. These include /f/ as in "fish," /v/ as in "very," the voiceless /th/ as in "thing," and the voiced /th/ as in "this." In our tests of the software with three different microphones, these sounds almost always registered incorrectly.

In Maps, a user's potential input is much more constrained than it would be in text message dictation. For example, in Betanews dictation tests yesterday, the tricky statement "think this through," which we chose for its potentially common appearance, its three interdental fricatives, and its easily misspelled homophone (through/threw), stumped the text transcriber a grand total of twenty times, returning results such as "Sync vs. Subaru" and "St. Francis School." Unfortunately, even a seemingly simple sentence becomes frustrating to dictate when words like "then" and "them" can almost never be understood.

Fortunately, in Maps, such terms are less likely to turn up.
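For readers who want to put a number on misses like the "think this through" results, one common yardstick is word error rate: the word-level edit distance between what was said and what was transcribed, divided by the number of words spoken. The sketch below is only an illustration and is not how Betanews scored its tests. The function name `word_error_rate` and the third sample transcription ("think this threw") are hypothetical additions, included to show what a near miss would look like.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # a spoken word was dropped
                curr[j - 1] + 1,     # an extra word was inserted
                prev[j - 1] + cost,  # match, or one word swapped for another
            ))
        prev = curr
    return prev[-1] / len(ref)


spoken = "think this through"
# The first two transcriptions come from the tests described above;
# the third is a hypothetical near miss on the homophone.
for heard in ("Sync vs. Subaru", "St. Francis School", "think this threw"):
    print(f"{heard!r}: {word_error_rate(spoken, heard):.2f}")
```

By this measure, both reported transcriptions score 1.00, meaning every spoken word was replaced, while the hypothetical homophone slip would score 0.33.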

Another problem in Maps comes up when looking for streets and towns with American Indian or adopted foreign-language names, which are common throughout the United States. The town of Hauppauge, New York, which many techies know for the computer company of the same name, is practically impossible to find through voice search. The engine turned up such things as "Hot Dog New York" and "Paul Blog, Long Island" in our attempts. Attempting such names as "Quaqanantuck Lane" and "Napeague Meadow" sometimes resulted in hilarious misinterpretations.

And this problem is not strictly an American one, either. Google said yesterday that the software recognizes North American, British, and Australian English pronunciation, but when nearby locations can have names drawn from any number of indigenous languages (Whip-ma-Whop-ma-Gate, anyone?), the engine runs into the same difficulties again and again.

© 1998-2023 BetaNews, Inc. All Rights Reserved. Privacy Policy - Cookie Policy.