IBM Donates Speech Technology to Open Source Community
At SpeechTEK 2004, IBM announced a decision that it anticipates will elevate speech technology in the marketplace: Big Blue has contributed software to the Apache Software Foundation. By releasing its source code, IBM hopes to bring a quick end to battles over competing, proprietary standards and solidify the industry behind VoiceXML.
Specifically, IBM is contributing Reusable Dialog Components (RDCs) to the Apache Software Foundation. According to IBM, RDCs are JavaServer Pages (JSP) tags that enable the development of voice applications and multimodal user interfaces and automatically generate W3C VoiceXML 2.0 at runtime, providing developers with a standard way of authoring VoiceXML applications. VoiceXML is a platform-independent speech standard backed by over 20 leading speech product vendors.
RDCs perform common functions in speech-enabled infrastructure applications. On a functional level, RDCs are similar to building blocks that can be pieced together like Legos to build larger and more complex applications, regardless of the vendor. On their own, RDCs handle basic inputs such as dates, times, currency amounts and locations; when aggregated, they can be combined into higher-level dialog functions.
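To make the building-block idea concrete, here is a minimal sketch in Python. It is not IBM's RDC code or API (actual RDCs are JSP tags); the class `FieldComponent`, the function `build_dialog` and the prompts are hypothetical names invented for illustration. It assumes only that each component emits one VoiceXML 2.0 field using one of the specification's built-in grammar types, and that components are aggregated into a single form.

```python
import xml.etree.ElementTree as ET


class FieldComponent:
    """Hypothetical stand-in for one reusable dialog component.

    Each instance renders a single VoiceXML <field> that relies on a
    VoiceXML 2.0 built-in grammar type such as "date" or "currency".
    """

    def __init__(self, name, builtin_type, prompt):
        self.name = name
        self.builtin_type = builtin_type
        self.prompt = prompt

    def render(self, parent):
        # Attach <field name=... type=...><prompt>...</prompt></field> to parent.
        field = ET.SubElement(parent, "field", name=self.name, type=self.builtin_type)
        ET.SubElement(field, "prompt").text = self.prompt
        return field


def build_dialog(form_id, components):
    """Aggregate simple components into one VoiceXML 2.0 form (illustrative only)."""
    vxml = ET.Element("vxml", version="2.0", xmlns="http://www.w3.org/2001/vxml")
    form = ET.SubElement(vxml, "form", id=form_id)
    for component in components:
        component.render(form)
    return ET.tostring(vxml, encoding="unicode")


# Two basic components pieced together into a larger "payment" dialog.
payment_dialog = build_dialog("payment", [
    FieldComponent("due_date", "date", "When is the payment due?"),
    FieldComponent("amount", "currency", "How much would you like to pay?"),
])
print(payment_dialog)
```

The point of the sketch is the composition: the date and currency pieces know nothing about each other, yet the aggregate is a single standards-based VoiceXML document that a compliant voice browser could interpret.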
IBM is determined to use RDCs to drive proprietary, vendor-specific code out of the speech ecosystem. If RDCs catch on, companies will be able to construct applications by picking "best of the breed" components from a variety of vendors.
"We want to grow the market through open standards by allowing everybody to play in the same sandbox," Brian Garr, Program Director for Conversational Access in IBM's Pervasive Computing division, told BetaNews. "We want to see all of the boats rise together."
In addition to contributing architecture, infrastructure and some tags through its release of the RDCs, IBM has proposed a project at the Eclipse Foundation to donate markup editors for W3C speech standards. IBM is offering VoiceXML-compliant markup editors to give developers a standard, easier way of writing standards-based desktop and Web applications.
"Since its initial $40 million contribution to launch Eclipse in November of 2001, IBM has continued to contribute to making Eclipse an open platform for application development and integration," said Mike Milinkovich, Executive Director of the Eclipse Foundation.
"With this project proposal, IBM is taking another step toward propelling innovation and giving Java developers the tools to work speech technology into their applications," said Gary Cohen, General Manager, IBM Pervasive Computing.
Microsoft's Speech Server 2004 is based upon Speech Application Language Tags (SALT), a standard that competes with VoiceXML. Microsoft has integrated its SALT-based software development kits with its popular Visual Studio .NET, but despite that added benefit, SALT has seen only limited adoption. Microsoft's approach is also tied to the Windows platform, whereas VoiceXML is platform-independent.
In another signal that VoiceXML's momentum may be driving past SALT, an IBM spokesperson told BetaNews that Microsoft is currently engaged in talks to make its technology fit into the VoiceXML specification.
Commenting on the differences between the competing standards, Senior Jupiter Analyst Joe Wilcox told BetaNews, "IBM's strategy highlights the importance of speech as an emerging platform for businesses--and from a company that I would consider an old hat in voice recognition and response, so to speak. Underlying IBM's open source effort is a tug-of-war over emerging standards and also development platforms. IBM's technology backs VoiceXML 2.0 as a standard, J2EE for development and, presumably, various operating platforms.
"By contrast, Microsoft's speech technologies favor SALT, Visual Studio .NET development tools and Windows Server 2003. Each camp provides tools that would let businesses add speech technologies to their applications, but Microsoft's approach is for the Windows world, while IBM's approach could provide speech technologies that scale better across heterogeneous platforms," said Wilcox.