Wrox Press Book – Early Adopter VoiceXML Part 2
Chapter 7, VoiceXML with XSLT (HTML and WML)
This is the second half of Chapter 7, from WROX Press’ upcoming book, Early Adopter VoiceXML. We’ll continue where we left off last time, working through the myrubberbands example.
Finally, to provide the user with detailed order information, the full product name and description must be available. This will also allow the user to ask for a product list, and eventually we can perhaps extend the interface to enable products to be ordered by voice. Note that the <product_list> is not associated with any particular <customer_record>.
address_type="Ship To Address" customer_id="1"/>
<order_date sayas="May 18, 2001 at 16 17 hours">
<product id="1" quantity="3"/>
<product id="3" quantity="1"/>
Also note that, ideally, the <product_list> wouldn’t actually be in the same document as the customer data. However, we’ll keep everything in one file here to avoid the issue of linking between documents. It might amuse us to picture the harried developers reaching the same conclusion to save time and give themselves some chance of meeting their beloved boss’s deadline. Later on, they will no doubt want to refine the process and generate smaller XML documents that can be processed more quickly.
<product id="1" name="MIXED1000" price="1.99">
  Mixed Bag of 1000 Rubber Bands
</product>
<product id="2" name="MIXED5000" price="4.09">
  Mixed Bag of 5000 Rubber Bands
</product>
<product id="3" name="RED1000" price="2.19">
  Bag of 1000 Red Rubber Bands
</product>
<product id="4" name="RED10000" price="17.49">
  Bag of 10000 Red Rubber Bands
</product>
<product id="5" name="BLUE1000" price="0.99">
  Bag of 1000 Blue Rubber Bands
</product>
<product id="6" name="BLUE10000" price="8.99">
  Bag of 10000 Blue Rubber Bands
</product>
This example is formatted to fit the space above, and for readability, adds quite a bit of whitespace between <product></product> tags that would probably not occur in a real document.
We have now examined the existing database, outlined a suitable voice interface for it, and defined our source markup language and the method for generating it. We are ready to create a stylesheet that converts that markup to a VoiceXML form implementing the design we decided on in the previous section, Designing A Voice Interface. This stylesheet, myrubberbands2vxml.xsl, is quite lengthy, and can be found in its entirety in the code download. Here, I shall pick out just the important points in the code for discussion, including the dynamic generation of grammars, some VoiceXML features worthy of particular attention, and the fundamental XSL concepts used.
Note that the stylesheet is designed to produce a single VoiceXML document containing just one user’s data. So, its top-level template only matches documents where the export_type attribute of the top-level element is set to single. The indent attribute on the <xsl:output> tag will produce a well-formatted result document that will be easier for a human brain to examine.
<?xml version="1.0"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
  <xsl:output method="xml" encoding="ISO-8859-1" indent="yes"/>
  <xsl:template
      match="myrubberbands[@export_type='single']/customer_record">
    <vxml version="1.0">
      <meta name="author" content="Underpaid Myrubberbands Engineer"/>
      <meta name="copyright" content="Copyright (C) 2001 Myrubberbands.com"/>
The next block illustrates one way XSL can generate elements with an attribute having dynamic content: the <xsl:element> construct.
<xsl:element name="meta">
  <xsl:attribute name="name">description</xsl:attribute>
  <xsl:attribute name="content">Voice Interface for # <xsl:value-of select="customer/@id"/></xsl:attribute>
</xsl:element>
We will need to set up some variables for use in the VoiceXML document. First off, we grab the user’s Automatic Number Identification (ANI) and Dialed Number Identification Service (DNIS) for later use. These correspond to the phone number that originated the call (analogous, but not identical to, the consumer caller ID service) and the number that the user dialed. The implementation of these is system dependent, and the data may not be available for all calls in any case. They are included here mainly for illustration. In a real application, the ANI can be used for auto-identification of the user.
The form_pointer variable will be used for navigation later.
<var name="customer_ani" expr="session.telephone.ani"/>
<var name="customer_dnis" expr="session.telephone.dnis"/>
<var name="session_error_count" expr="0"/>
<var name="form_pointer" expr="'mainMenu'"/>
<var name="user_command" expr="''"/>
Next come the form-level help dialogs, which here mimic the kind of response one might get from a real-life call center, contrary to the advice of Chapter 6:
What seems to be the trouble?
Come on – isn’t this easy enough to understand?
Hey <xsl:value-of select="customer/firstname"/>,
are you stupid or something?
The design specifies that the main menu command is always available. We can implement that with a global VoiceXML <link> element:
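As a minimal sketch (not the listing from the code download), a document-level <link> in VoiceXML 1.0 might look like the following. The target #mainMenu and the one-phrase inline JSGF grammar are assumptions based on the form names used elsewhere in this chapter; the real stylesheet may well route the command through the navigator form instead.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <!-- Document-level link: its grammar is active in every form of this document. -->
  <!-- The next target is an assumption; it matches the form_pointer value 'mainMenu'. -->
  <link next="#mainMenu">
    <grammar type="application/x-jsgf">main menu</grammar>
  </link>

  <!-- Placeholder target form so that the sketch stands on its own. -->
  <form id="mainMenu">
    <block>This is the main menu.</block>
  </form>
</vxml>
```

Because the link is a child of <vxml> rather than of any one form, the user can say "main menu" at any prompt in the document.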
Again, we use the information from the XML file to customize the prompts for the user. This form welcomeMessage corresponds to the “welcome message” box in the interface design diagram earlier.
<prompt bargein="false" timeout="0.1s">
Hello, <xsl:value-of select="customer/firstname"/>.
Welcome to the my rubber bands
dot com voice order status system.
Next, we come to the mainMenu form, the primary form of the voice application. There is a form level <nomatch> element here to transfer control flow to the errorHandler form when an utterance doesn’t match the grammar. This form is used for most no-match events throughout the application, to keep track of the total number of such errors that have occurred this session. The <noinput> handler here ensures the main menu is repeated when the user doesn’t respond to the prompt. In a future version of the product, the designers may implement some kind of timeout to restrict the number of loops, and disconnect the user if there is no response for a long time, but this issue need not concern us now.
<assign name="form_pointer" expr="'mainMenu'"/>
<prompt bargein="true" timeout="3s">
This is the main menu.
You can say product list to hear a list of products.
You can say order status to check your order status.
You can say frequently asked questions to get more
You can always say main menu to return to this
menu, or help for additional help.
product list |
more information |
frequently asked questions |
In order to keep track of the currently active form, the global variable form_pointer is set. This could be used to implement specific navigation logic, for example, by changing the behavior of a command slightly depending on where the command originated. Once the prompt has been played, and the user has responded with an utterance matching the inline grammar, the <filled> handler for this <field> is entered. This copies the value returned by the grammar to the global variable user_command, which is used by the navigator form to direct control flow to the required form. We could also use <subdialog> and pass a parameter, but this is simpler, and sufficient for this application.
<assign name="user_command" expr="userSaid"/>
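Putting the pieces described above together, the main menu and the navigator might be sketched roughly as follows. This is an illustration under stated assumptions, not the book's listing: the field name userSaid, the variable names, and the errorHandler form come from the text, while the form ids navigator, productList, and orderStatus and the shortened grammar are hypothetical.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <var name="form_pointer" expr="'mainMenu'"/>
  <var name="user_command" expr="''"/>

  <form id="mainMenu">
    <!-- Form-level handlers: errors are counted in errorHandler (defined -->
    <!-- elsewhere in the full document); silence simply repeats the menu. -->
    <nomatch><goto next="#errorHandler"/></nomatch>
    <noinput><reprompt/></noinput>

    <field name="userSaid">
      <prompt bargein="true" timeout="3s">This is the main menu.</prompt>
      <grammar type="application/x-jsgf">
        product list | order status | frequently asked questions
      </grammar>
      <filled>
        <!-- Copy the recognized phrase to the global, then hand off. -->
        <assign name="user_command" expr="userSaid"/>
        <goto next="#navigator"/>
      </filled>
    </field>
  </form>

  <!-- Hypothetical navigator: branches on the global user_command. -->
  <!-- productList and orderStatus are assumed to exist in the full document. -->
  <form id="navigator">
    <block>
      <if cond="user_command == 'product list'">
        <goto next="#productList"/>
      <elseif cond="user_command == 'order status'"/>
        <goto next="#orderStatus"/>
      <else/>
        <goto next="#mainMenu"/>
      </if>
    </block>
  </form>
</vxml>
```

Keeping the branching in one form means each menu only has to set user_command, which is the simplicity the text cites as the reason for preferring this over <subdialog>.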
Now that our main menu form is finished, we can start to implement the dialogs that provide the application’s basic functionality. First up is the orderStatus form, which makes the most extensive use of dynamic content generation:
<assign name="user_command" expr="'main menu'"/>
We need a test here to check that the customer does indeed have outstanding orders, and play a message to that effect:
<prompt bargein="true" timeout="1s">
customer_record/order_history/order) != 0">
This is a list of all your orders.
customer_record/order_history/order) = 0">
You have not placed any orders within the last thirty days.
Note that the lines containing test= in the above code are broken only so that they fit on this page; in the actual stylesheet each would be a single line.
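The same either/or test can also be written with <xsl:choose>, which avoids repeating the path in two separate tests. The following is a sketch of that alternative, not the approach used in myrubberbands2vxml.xsl; it assumes, as the path fragments above suggest, that <order_history> is a child of <customer_record>, and it uses a simplified template match.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Simplified match for illustration; the real template is more specific. -->
  <xsl:template match="customer_record">
    <prompt bargein="true" timeout="1s">
      <xsl:choose>
        <!-- A non-empty node-set converts to true, so count() is not needed. -->
        <xsl:when test="order_history/order">
          This is a list of all your orders.
        </xsl:when>
        <xsl:otherwise>
          You have not placed any orders within the last thirty days.
        </xsl:otherwise>
      </xsl:choose>
    </prompt>
  </xsl:template>
</xsl:stylesheet>
```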
The next block creates a prompt for each order that was in the source XML document by using the XSLT element <xsl:for-each> to select XML elements that match the XPath in its attribute. The sayas attribute from the <order_date> element in our XML file is used here to provide an audio cue to identify the order to the user.
you placed an order. Say order number
to hear more about it.
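A loop of the kind described above might be sketched as follows. The element paths are assumptions: the sketch supposes that each <order> sits under <order_history> and has an <order_date> child carrying the sayas attribute, and it uses position() to number the orders, which may differ from how the original stylesheet numbers them. The leading word "On" is likewise a guess at the prompt wording.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Simplified match for illustration; the real template is more specific. -->
  <xsl:template match="customer_record">
    <prompt bargein="true" timeout="1s">
      <!-- One spoken sentence per order in the source document. -->
      <xsl:for-each select="order_history/order">
        On <xsl:value-of select="order_date/@sayas"/>
        you placed an order. Say order number
        <xsl:value-of select="position()"/>
        to hear more about it.
      </xsl:for-each>
    </prompt>
  </xsl:template>
</xsl:stylesheet>
```

Inside <xsl:for-each>, position() gives the index of the current order within the selected node-set, starting at 1, so the first order is announced as "order number 1".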
product list |
more information |
frequently asked questions |
order status |
The grammar also includes an option dynamically generated by the XSLT code. The user can say, “order number one” to access data on the first order in their list. This JSGF could be improved to accept shorter instructions, such as “order one”, or even “one”, but be aware that using “order number” followed by the number will help the ASR system correctly identify the user utterance, and will probably improve application performance in this situation.
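To illustrate the trade-off, here is a sketch of a dynamically generated grammar that also accepts the shorter forms. It is not the book's listing: the paths and the use of position() are assumptions, and whether a recognizer treats the digit 1 as the spoken word "one" depends on the platform. JSGF square brackets mark optional words, so [order [number]] 1 matches "order number one", "order one", and "one".

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Simplified match for illustration; the real template is more specific. -->
  <xsl:template match="customer_record">
    <grammar type="application/x-jsgf">
      product list |
      more information |
      frequently asked questions |
      order status
      <!-- One extra alternative per order; the leading | joins it to the list. -->
      <xsl:for-each select="order_history/order">
        | [order [number]] <xsl:value-of select="position()"/>
      </xsl:for-each>
    </grammar>
  </xsl:template>
</xsl:stylesheet>
```

Dropping the brackets restores the stricter "order number" form, which, as noted above, gives the ASR system more to work with.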
Join us next week for the rest of Chapter 7 from Early Adopter VoiceXML.
This book excerpt comes to us from WROX Press–technical books that you can count on!