Designing an Interactive Voice Response System Using VoiceXML and CCXML
Voice is arguably the most natural way for humans to interact. When it comes to communicating with computers, voice interaction offers the consumer flexibility and ease of use: the caller does not need to type any information and can be on the move when requesting a service. A voice interactive system also adds the personal touch that is so often missing from transactions conducted over the Internet, which enhances the user's comfort level.
Assuming that an underlying system interprets the interactions, an interactive voice response (IVR) system requires three basic technologies:
- Voice recognition: Recognition of a phrase against a finite set of possible matches
- Voice transcription: Speech to text
- Speech synthesis: Text to speech (TTS)
Aim of the Article
In this article, you will learn how a voice user interface was implemented in a real-life travel portal. You will start with the basic design of the IVR, identify the challenges of the system and the shortcomings of the preliminary design, and update the design accordingly. You will conclude with the final architecture of the IVR.
Description of the Implementation
The implementation under discussion is a voice portal for travel planning. In this voice portal, the caller is first authenticated by an automatic voice recognition system. After authentication, the caller is authorized, based on role, to use the travel-related business services that the portal offers. The caller should be able to search for flights and plan a trip, guided throughout by the voice interactive system. Subject to the caller's consent and preferences, the caller can listen to advertisements while the request is being processed.
Designing the Voice User Interface (VUI)
The voice user interface is developed using VoiceXML 2.1 and CCXML 1.0 and is hosted on Voxeo Prophecy server 8.0. The Nuance Grammar Specification Language (GSL) is used to specify the grammars that the dialogues recognize.
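To make the division of labour between the two languages concrete, here is a minimal CCXML sketch of how a call could be answered and handed to a VoiceXML dialogue. It is an illustration under stated assumptions, not the portal's actual call-control document: the state names, the callId variable, and the choice of userDetails.vxml as the entry point are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<ccxml version="1.0" xmlns="http://www.w3.org/2002/09/ccxml">
  <!-- Hypothetical state variable and call handle -->
  <var name="state0" expr="'init'"/>
  <var name="callId"/>

  <eventprocessor statevariable="state0">
    <!-- Incoming call: remember the connection and answer it -->
    <transition state="init" event="connection.alerting">
      <assign name="callId" expr="event$.connectionid"/>
      <accept connectionid="callId"/>
    </transition>

    <!-- Call answered: start the VoiceXML dialogue on that connection -->
    <transition state="init" event="connection.connected">
      <assign name="state0" expr="'inDialog'"/>
      <!-- Assumption: userDetails.vxml is the first dialogue to run -->
      <dialogstart src="'userDetails.vxml'" connectionid="callId"/>
    </transition>

    <!-- Dialogue finished: hang up -->
    <transition state="inDialog" event="dialog.exit">
      <disconnect connectionid="callId"/>
    </transition>

    <!-- Caller or platform dropped the call: end the session -->
    <transition event="connection.disconnected">
      <exit/>
    </transition>
  </eventprocessor>
</ccxml>
```

In this sketch, CCXML only manages the telephone connection and reacts to asynchronous events, while everything the caller hears or says lives in the VoiceXML document it starts.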
VoiceXML is a W3C standard XML format for specifying interactive voice dialogues, and it is the mechanism used to build the VUI here. VoiceXML handles synchronous communication: it interacts with one user at a time over an audio channel and handles the events thrown during that interaction. VoiceXML supports both application-directed and mixed-initiative interactions with the user.
In this application, the basic VUI comprises several VoiceXML documents, as shown in Figure 1:
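The difference between the two interaction styles is easier to see in markup. The following sketch, which is not taken from the portal, shows a mixed-initiative form: a form-level grammar lets the caller give both cities in one utterance, and the form falls back to application-directed, field-by-field prompting if that fails. The form id, the destCity field, and the grammar/trip.grammar#TRIP grammar are hypothetical, and the sketch assumes that grammar returns slots named srcCity and destCity.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="tripSearch">
    <!-- Form-level grammar (hypothetical): one utterance may fill both fields -->
    <grammar src="grammar/trip.grammar#TRIP" type="text/gsl" mode="voice"/>

    <!-- Mixed initiative: an open question the caller can answer freely -->
    <initial name="start">
      <prompt>Where would you like to fly from and to?</prompt>
      <nomatch count="2">
        <prompt>Let's take it one step at a time.</prompt>
        <!-- Marking the initial item as filled hands control to the fields -->
        <assign name="start" expr="true"/>
        <reprompt/>
      </nomatch>
    </initial>

    <!-- Application directed: each field asks for exactly one value -->
    <field name="srcCity">
      <grammar src="grammar/city.grammar#CITY" type="text/gsl" mode="voice"/>
      <prompt>Which city are you leaving from?</prompt>
    </field>
    <field name="destCity">
      <grammar src="grammar/city.grammar#CITY" type="text/gsl" mode="voice"/>
      <prompt>Which city are you flying to?</prompt>
    </field>

    <!-- Runs once both fields have a value, however they were filled -->
    <filled mode="all" namelist="srcCity destCity">
      <prompt>Searching for flights.</prompt>
    </filled>
  </form>
</vxml>
```

A caller who says both cities at once skips the individual field prompts; a caller who names only one is asked just for the missing one.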
Figure 1: The Main VoiceXML Flow
<!-- Greets the caller, then moves on to collect the departure city -->
<form id="Welcome">
  <property name="bargein" value="false"/>
  <block>
    <prompt>
      Welcome to Travel Search Application
      <break time="1.5s"/>
    </prompt>
    <!-- <goto next="#getWebservice"/> -->
    <goto next="#getUserCities"/>
  </block>
</form>

<!-- Collects the departure city and its code from the CITY grammar -->
<form id="getUserCities">
  <property name="bargein" value="false"/>
  <field name="srcCity" expr="undefined" cond="true" modal="true">
    <clear namelist="srcCity"/>
    <grammar src="grammar/city.grammar#CITY" type="text/gsl" mode="voice"/>
    <prompt bargein="false">
      Which city please?
      <break time="1.5s"/>
    </prompt>
    <filled namelist="srcCity" mode="all">
      <assign name="departurecity" expr="srcCity$.interpretation.city"/>
      <assign name="departureCityCode" expr="srcCity$.interpretation.code"/>
      <prompt>
        Your city is <value expr="departurecity"/>
        <break time="1.5s"/>
      </prompt>
      <goto next="#getUserListing"/>
    </filled>
  </field>
</form>

<!-- Collects the airline listing the caller wants -->
<form id="getUserListing">
  <property name="bargein" value="false"/>
  <field name="listing1" expr="undefined" cond="true" modal="true">
    <clear namelist="listing1"/>
    <grammar src="grammar/Airlines.grammar#AIRLINES" type="text/gsl" mode="voice"/>
    <prompt bargein="false">
      What Listing please?
      <break time="1.5s"/>
    </prompt>
    <filled namelist="listing1" mode="all">
      <assign name="listing" expr="listing1$.interpretation.choice"/>
      <prompt>
        You have selected <value expr="listing"/>
        <break time="1.5s"/>
      </prompt>
      <goto next="#connectUser"/>
    </filled>
  </field>
</form>

<!-- Offers a lowest-fare search before connecting the caller -->
<form id="connectUser">
  <property name="bargein" value="false"/>
  <field name="answer">
    <prompt bargein="false">
      Would you like me to help you find the lowest fare
      before I connect you to <value expr="listing"/>
      <break time="1.5s"/>
    </prompt>
    <grammar src="grammar/yesno1.grammar#YES_NO"/>
    <filled>
      <if cond="answer=='yes'">
        <goto next="#getWillDep"/>
      <elseif cond="answer=='no'"/>
        <goto next="#getAdvertise"/>
      <else/>
      </if>
    </filled>
  </field>
</form>
Listing 1: Snippet from userDetails.vxml
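The listing reads srcCity$.interpretation.city and srcCity$.interpretation.code, so the city grammar has to return both values for each recognized phrase. The portal's grammars are written in GSL and are not shown here; as a rough sketch of the same idea, the W3C SRGS XML grammar below (a substitute format, not the article's GSL) returns an object with those two properties. The city names and codes are illustrative only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" mode="voice"
         root="CITY" tag-format="semantics/1.0">
  <!-- Each alternative sets the two properties that the dialogue reads -->
  <rule id="CITY" scope="public">
    <one-of>
      <item>
        Boston
        <tag>out.city = "Boston"; out.code = "BOS";</tag>
      </item>
      <item>
        Chicago
        <tag>out.city = "Chicago"; out.code = "ORD";</tag>
      </item>
    </one-of>
  </rule>
</grammar>
```

If such a file were used in place of the GSL grammar, the field would reference it with a grammar element whose type is application/srgs+xml; the rest of the form would not need to change, because the shape of the interpretation is the same.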