Review: BeVocal Cafe (Part II)
A fingerprint, hardware token, digital certificate, or any other such mechanism requires changes to the way users communicate with a web application. In a speech application, authentication fits naturally into the user interface (which is speech) and doesn't require the user to adapt to an external identification mechanism. In fact, in a number of scenarios, interactive speech applications could be the de facto choice because of the additional security that speech verification provides.
Typically, in a VoiceXML-based application, grammars are constructed from rules that the application creates, and these rules are built from grammar constructs and phrases. Some applications, such as a dynamically maintained address book, require the application to recognize entries (or at least part of an entry). One of the features recently added to the Cafe is "voice enrollment," an extension to VoiceXML created for this purpose. Enrollment is a basic two-step process. First, the user records prompts in his/her own voice for a grammar. Each prompt is assigned a value/expression, which is returned when that prompt is recognized as part of the grammar. The listing below shows a simple enrollment process.
Once enrolled, the prompts can be recognized as a grammar, as specified by the enroll tag. It is important to understand that the voice enrollment facility works only for individually assigned callers, as identified by the speakerID attribute. In most scenarios a VoiceXML application caters to multiple telephony users, so the dynamic server-side application must assign a unique speakerID to each speaker who uses voice enrollment.
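As a rough illustration of the server-side step, the sketch below derives a stable, unique speakerID from a caller's phone number. The article does not say how Cafe expects these IDs to be formed, so the function name, the "spk-" prefix, the namespace argument, and the choice of phone number as the identifying key are all assumptions made for this example.

```python
import hashlib


def speaker_id_for(caller_number: str, app_namespace: str = "addressbook") -> str:
    """Derive a stable speaker ID from a caller's phone number.

    Hypothetical helper: the ID format and the use of the phone number
    as the identifying key are assumptions, not BeVocal requirements.
    """
    # Normalize the number so "+1 (408) 555-0100" and "14085550100" match.
    digits = "".join(ch for ch in caller_number if ch.isdigit())
    if not digits:
        raise ValueError("caller number contains no digits")

    # Hash the namespaced number so the raw phone number is not exposed
    # in generated VoiceXML, and the same caller always gets the same ID.
    digest = hashlib.sha256(f"{app_namespace}:{digits}".encode("utf-8")).hexdigest()
    return f"spk-{digest[:16]}"
```

A server-side page generating the VoiceXML would call such a helper once per incoming call and write the result into the speakerID attribute, so that each caller's enrolled prompts stay separate from everyone else's.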
Voice enrollment is currently not part of the VoiceXML specification; it is a BeVocal Cafe-specific extension.
If you have ever stayed in a hotel and asked the operator to set up a "wake-up call," the call was most likely set up automatically. The hotel system typically has some sort of automated interface that takes the time of the call and the room number as input, and the end result is a call to your room at the preset time. "Voice Alerts," or the "outbound VoiceXML calling interface," follow the same paradigm. In the VoiceXML-based solution, a "Voice Alert" application initiates a call to a particular phone number and, instead of playing a static recorded message, connects the user to a dynamic VoiceXML application. Consider a stock trading application in which the user needs to be notified if any stock in his/her portfolio goes up or down by a certain percentage. An outbound-calling VoiceXML application could notify the user of the event and also facilitate any related transaction (such as selling or buying 100 shares).
BeVocal Cafe provides the outbound VoiceXML interface and allows developers to build event-driven interactive applications. To create a Voice Alert and connect it to a particular phone number and a VoiceXML application, Cafe provides a simple HTTP-based interface that initiates the outbound process. The interface takes three main parameters as input: dest (the destination phone number), vxml (the URL of the VoiceXML application), and key (an authentication token that validates the identity of the application invoking the interface). To use the Cafe outbound service, you need to send a request to email@example.com to get a valid key for your scenario.
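To make the shape of such a request concrete, here is a minimal sketch that assembles the three parameters into a query string. The parameter names dest, vxml, and key come from the description above; the endpoint URL is a placeholder, and passing the parameters in a GET-style query string is an assumption, since the article gives neither the real address nor the HTTP method.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the real Cafe outbound URL is not given in the article.
OUTBOUND_ENDPOINT = "https://cafe.example.com/outbound"


def build_outbound_url(dest: str, vxml: str, key: str,
                       endpoint: str = OUTBOUND_ENDPOINT) -> str:
    """Build the request URL for a Voice Alert (hypothetical helper)."""
    # urlencode percent-escapes each value, so the VoiceXML URL can be
    # carried safely inside the query string.
    query = urlencode({"dest": dest, "vxml": vxml, "key": key})
    return f"{endpoint}?{query}"


if __name__ == "__main__":
    print(build_outbound_url(
        dest="14085550100",
        vxml="http://example.com/alert.vxml",
        key="YOUR-KEY",
    ))
```

In the stock-alert scenario, a back-end process watching the portfolio would build this URL when a price threshold is crossed and then issue the HTTP request, at which point the platform places the call and runs the VoiceXML application at the given address.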
In a nutshell, Cafe provides a feature-rich environment for speech application developers: it supports multiple grammar formats, includes a set of SpeechObjects, and offers functionality such as voice enrollment, speech verification, and outbound calling. From an execution perspective, Cafe provides debugging and simulation tools such as the Vocal Debugger, Vocal Player, and Vocal Scripter. As part of an overall VoiceXML development strategy, however, BeVocal Cafe lacks tools for actually constructing the VoiceXML application. My picks for future enhancements to Cafe would include a standalone grammar debugging tool and a development-focused environment (desktop or remote) that would jump-start VoiceXML application development by providing code generation wizards.
About Hitesh Seth
Hitesh Seth is Chief Technology Evangelist for Silverline Technologies, a global eBusiness and mobile solutions consulting and integration services firm. He is a columnist on VoiceXML technology in XML Journal and regularly writes for other technology publications, including Java Developer's Journal and Web Services Journal, on topics such as J2EE, Microsoft .NET, XML, wireless computing, speech applications, and Web services and integration. Hitesh received his bachelor's degree from the Indian Institute of Technology Kanpur (IITK), India. Feel free to email any comments or suggestions about the articles featured in this column to firstname.lastname@example.org.