January 18, 2021
Review: BeVocal Cafe (Part II)

  • By Hitesh Seth


SpeechObjects are a set of open, reusable components that encapsulate frequently used functionality in a speech application. Components aren't new to application development: depending on your development environment, you would use EJB, COM, or Microsoft .NET components as part of your application. The objective of SpeechObjects is to bring the same kind of reuse, along with an object-oriented methodology, to speech developers. SpeechObjects, which were defined by Nuance Communications, can be used within a VoiceXML application through the <object> and <param> tags.

BeVocal Cafe supports the Nuance SpeechObjects methodology and lets developers reuse a number of SpeechObjects developed by Nuance. In addition, Cafe includes a small set of SpeechObjects that are specific to the Cafe environment and can be used by VoiceXML developers. The table below lists the SpeechObjects that Cafe supports:

SpeechObject                      Recognizes
nuance.so.SOAlphaDigitString      An alphanumeric string
nuance.so.SOBrowsableList         Select an item by reading a sequence of items to the caller
nuance.so.SOConfirm               Confirmation dialog
nuance.so.SOCreditCardInfo        Credit-card related information
nuance.so.SODate                  Date
nuance.so.SONATelephoneNumber     A 10-digit telephone number
nuance.so.SOQuantity              Quantity (e.g. twenty two)
nuance.so.SOSectionedDigitString  A sectioned/delimited string
nuance.so.SOSimpleDigitString     A fixed-length digit string
nuance.so.SOSocialSecurityNumber  SSN
nuance.so.SOTime                  A time expression
nuance.so.SOUSCurrency            Amount in dollars/cents
nuance.so.SOUSZipCode             U.S. 5/9 digit postal code
nuance.so.SOYesNO                 yes/no response
bevocal.cafe.SOAirline            Airline name
bevocal.cafe.SOPickStock          Equity name
bevocal.cafe.SOCityState          City and state
bevocal.cafe.SOStreet             A street in a particular city/state
bevocal.cafe.SOStreetNumber       A street number in a particular street/city/state

To illustrate the value of SpeechObjects, let's look at an example. The VoiceXML code snippet below shows a simple stock trading application prototype that recognizes an equity name or index and returns the name of the equity. The benefit of SpeechObjects is clear from how little code is needed to achieve this. Coded in plain VoiceXML, the same functionality would require the developer to create a fairly complex grammar covering all the equities traded on the stock exchange.
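A minimal sketch of what such a form might look like is given here, using the <object> tag with the bevocal.cafe.SOPickStock SpeechObject from the table above. The "speechobject://" classid scheme, the VoiceXML version, the form and variable names, and the assumption that the result can be read back as a plain string are all illustrative guesses rather than confirmed Cafe behavior; any configuration would be passed through <param> tags, whose names are not shown because they depend on the SpeechObject.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="pick_stock">
    <!-- ASSUMPTION: the classid URI scheme below is illustrative;
         check the Cafe documentation for the exact form. -->
    <object name="stock" classid="speechobject://bevocal.cafe.SOPickStock"/>

    <!-- Runs once the object has filled the "stock" variable. -->
    <block>
      <!-- ASSUMPTION: the returned value is treated as a plain string. -->
      <prompt>You selected <value expr="stock"/>.</prompt>
    </block>
  </form>
</vxml>
```

Note that the form contains no grammar at all: the list of recognizable equities lives inside the SpeechObject, which is the point of the comparison with plain VoiceXML.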

Speaker Verification

Whether you develop a client-server, web, wireless, or speech application, security is always a concern. A key aspect of application security is authentication: the mechanism that allows an application to recognize a valid user. In traditional web applications, authentication is typically handled through a combination of user ID and password. Some more secure web applications also allow the user to present a digital certificate as a token for authentication. In speech applications, authentication is typically managed through a combination of a PIN (personal identification number), full names (as cryptic user IDs can be hard to recognize), account numbers, and/or telephone numbers. For instance, a typical authentication dialog for a speech application would be something like:

"Please say or enter your account number" followed by
"Please say or enter your PIN."
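In plain VoiceXML, a dialog of this kind might be sketched as follows. The sketch assumes VoiceXML 2.0 built-in "digits" fields (which accept both spoken and keypad digits), a four-digit PIN, and a hypothetical server URL; none of these details come from Cafe itself.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="login">
    <!-- Built-in digits grammar: caller may speak or key in the number. -->
    <field name="account" type="digits">
      <prompt>Please say or enter your account number.</prompt>
    </field>

    <!-- ASSUMPTION: PINs are exactly four digits long. -->
    <field name="pin" type="digits?length=4">
      <prompt>Please say or enter your PIN.</prompt>
    </field>

    <!-- Once both fields are filled, send them to the server for checking. -->
    <filled mode="all" namelist="account pin">
      <!-- ASSUMPTION: hypothetical URL; the real check happens server-side. -->
      <submit next="http://example.com/authenticate" method="post"
              namelist="account pin"/>
    </filled>
  </form>
</vxml>
```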

This would allow the application to authenticate the user. Speech-based applications also allow a different form of authentication: the user's speech itself. Much as a fingerprint serves as a token of identity for a person, a user's natural speech can be used to build a Voice Print that identifies the user. Currently, VoiceXML doesn't include built-in support for Voice Print technologies; however, several vendors, such as Nuance and SpeechWorks, have built speaker verification products into their core recognition technologies. Cafe provides Voice Print-based speaker verification to VoiceXML developers through two tags: <register> and <verify>. As the names suggest, the <register> tag registers a user's Voice Print with an application, whereas the <verify> tag verifies a caller against that stored Voice Print. Both tags share a common identifier, the "key expression," which is used to store and retrieve the Voice Print.

The listing below shows how the <register> tag can be used.

Now that the Voice Print has been registered, the <verify> tag can be used to authenticate a user.

This article was originally published on November 12, 2002
