Review: IBM WebSphere Voice Toolkit
IBM WebSphere Voice Toolkit is a complete, integrated VoiceXML application development platform. Based on IBM's next-generation open source Eclipse IDE (Integrated Development Environment) platform (http://www.eclipse.org), the Voice Toolkit includes traditional IDE features such as project and view management and source code management (SCM) integration with tools such as CVS. On top of these it adds an integrated set of VoiceXML tools: application generation wizards, a VoiceXML editor, grammar development and testing tools, debugging, support for developing both static VoiceXML content and J2EE (Java 2 Enterprise Edition) based dynamic applications, and a broad set of reusable dialog components. The Voice Toolkit supports VoiceXML 1.0 application development using multiple grammar formats. In the rest of the article, we review the key features of the Toolkit and what it brings to the table for VoiceXML application development.
Installing the WebSphere Voice Toolkit requires three components: the Voice Toolkit itself, the WebSphere Voice Server SDK, and the IBM Reusable Dialog Components. A fourth component, the Voice Application Debugger (reviewed later in the article), is currently in beta and is optional, but it adds an important step-by-step debugging facility. The Voice Server SDK includes desktop versions of the IBM TTS (Text-to-Speech) and IBM ViaVoice ASR (Automatic Speech Recognition) engines. All of these components are available for download for Windows 2000 based development environments from the IBM Voice Systems homepage (see the Resources section).
First Looks: The VoiceXML Editor
Perhaps the most basic, common and useful feature found in VoiceXML IDEs is a VoiceXML editor. The Voice Toolkit's VoiceXML editor is based on a generic XML editor but adds features that are useful to a VoiceXML application developer, such as content assist, bookmarks and tasks. Particularly interesting is the content assist feature, which provides a list of the possible VoiceXML tags and attributes through either a context-sensitive drop-down menu or a hotkey (Ctrl-Space bar). Content assist is driven by the DTD (document type definition) of the VoiceXML specification (as shown in the figure below). It is also customizable through macros, which can be created for tags, attributes and attribute values.
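To make the idea of DTD-driven content assist concrete, here is a minimal Python sketch of how an editor might derive tag and attribute suggestions from a DTD. It is not the Voice Toolkit's implementation, which we have not seen. The DTD fragment is a deliberately simplified stand-in rather than the real VoiceXML 1.0 DTD, and the names SAMPLE_DTD, build_index, suggest_tags and suggest_attributes are hypothetical. The sketch also assumes each ATTLIST declaration defines exactly one attribute.

```python
import re

# Simplified, illustrative DTD fragment. This is an assumption made for the
# example and NOT the real VoiceXML 1.0 DTD.
SAMPLE_DTD = """
<!ELEMENT form (field | block)*>
<!ATTLIST form id ID #IMPLIED>
<!ATTLIST form scope CDATA #IMPLIED>
<!ELEMENT field (prompt | grammar)*>
<!ATTLIST field name CDATA #REQUIRED>
<!ATTLIST field type CDATA #IMPLIED>
<!ELEMENT prompt (#PCDATA)>
<!ELEMENT grammar (#PCDATA)>
<!ELEMENT block (#PCDATA)>
"""

# Capture the element name of each ELEMENT declaration.
ELEMENT_RE = re.compile(r"<!ELEMENT\s+([\w.-]+)")
# Capture (element name, attribute name); assumes one attribute per ATTLIST.
ATTLIST_RE = re.compile(r"<!ATTLIST\s+([\w.-]+)\s+([\w.-]+)")


def build_index(dtd_text):
    """Map each declared element name to the list of its attribute names."""
    index = {name: [] for name in ELEMENT_RE.findall(dtd_text)}
    for element, attribute in ATTLIST_RE.findall(dtd_text):
        index.setdefault(element, []).append(attribute)
    return index


def suggest_tags(index, prefix=""):
    """Return the declared tag names that start with the typed prefix."""
    return sorted(name for name in index if name.startswith(prefix))


def suggest_attributes(index, element, prefix=""):
    """Return the attributes of an element that start with the typed prefix."""
    return sorted(a for a in index.get(element, []) if a.startswith(prefix))


if __name__ == "__main__":
    index = build_index(SAMPLE_DTD)
    print(suggest_tags(index, "f"))
    print(suggest_attributes(index, "field"))
```

A real editor would also need the content model (which child elements are legal at the cursor position) to make the suggestions context-sensitive; the sketch only indexes names.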
Apart from development tools for the VoiceXML world, IBM's forte in speech systems is the capability to execute and host voice applications (that is, to function as a VoiceXML gateway) with products such as IBM WebSphere Voice Server, and to integrate with IVR (Interactive Voice Response) platforms such as DirectTalk.
VoiceXML currently has no standard way of representing pronunciations. However, Pronunciation Builder (screenshots 1, 2), a component of the Voice Toolkit, allows the developer to compose IPA (International Phonetic Alphabet) based pronunciations of unknown words (such as uncommon names, or words typically said in a different fashion). For instance, you could change the default pronunciation of J2EE to "J 2 double e" (represented in the IPA as "ʤeɪ tu ˈdʌ.bəl i") instead of the standard "j 2 e e" (represented in the IPA as "ʤeɪ tu i i"). The tool automatically adds a reference to the composed pronunciation to the VoiceXML document using IBM's VoiceXML extension tag "<ibmlexicon>", as shown in the following code snippet. The IBM ViaVoice Text-to-Speech Engine then uses these composed pronunciations to produce correctly pronounced synthesized speech.