Overview
I’ve been evaluating a number of VoiceXML tools over the past few months and will be sharing the results of my research here on VoiceXML Planet. An important part of developing VoiceXML applications is deciding which environment and toolset to use. In this article, we’re going to take a look at Nuance V-Builder, version 1.2. Right now, I’m grouping VoiceXML development tools into three categories:
Complete VoiceXML development environment
This is an application that has almost all of the tools necessary to build, test, and deploy a VoiceXML application. This includes tools to build static VoiceXML files, dynamic VoiceXML scripts (JSP, ASP, etc.), and grammars. A complete development environment must also let you test the application locally, without having to deploy it and dial into a VoiceXML gateway. To accomplish this, the tool must have a TTS and ASR engine as well as a VoiceXML interpreter. Testing dynamic scripts does require a Web server, such as IIS or Apache, both of which can run on a development machine.
Basic VoiceXML development environment
This category includes tools that have VoiceXML and grammar editing/validation functionality as well as some testing capabilities; TTS but not ASR capabilities, for example. These tools make it easier to catch common errors before copying the code to a test server, but they do not allow one to perform all tests within the application.
VoiceXML editor
These applications provide VoiceXML and grammar editing/validation capabilities, but do not allow developers to test the application within the editor.
V-Builder falls into the first category. It is truly a complete VoiceXML development environment.
Installation
V-Builder can be downloaded for free from Nuance’s developer site at http://extranet.nuance.com. You will need to register as a developer before you can download the tool. For me, the installation process was painless. I installed it on a fresh Windows 2000 box without having any other Nuance tools installed. The application is written in Java, but comes with a copy of the Java Runtime Environment, so the installation should be easy enough for anyone who has experience installing Windows software (by hitting the Next > button).
The hardware requirements are fairly high, which is understandable considering that the tool basically provides you with a desktop VoiceXML gateway complete with TTS, ASR, and a VoiceXML interpreter. Because of this, you will not be able to use it effectively with anything less than 256MB of RAM. I ran my tests on a 1GHz PIII processor with 256MB of RAM. If you will also be running an application and/or Web server for dynamic scripting, you will probably want 512MB to prevent the machine from swapping memory to disk, which renders V-Builder unusable for testing purposes. V-Builder will run well with as little as 128MB of RAM if it’s only used to develop, but not test, applications.
Once V-Builder is installed, there are a number of additional packages that need to be installed, including one or more language packs. Fortunately, Nuance provides built-in support for over 30 languages and dialects, so chances are, your target language will be supported in V-Builder. You will also want to install Nuance SpeechObjects and the sample grammars and VoiceXML applications that will help get you started.
One of the highlights of V-Builder is the ability to automatically download and install packages from within the tool with the AutoUpdate Wizard (screen shot). The wizard connects to an update server, queries for a list of packages based upon the packages you’ve already installed, and provides a list of available selections. Once you’ve made your selections, the packages are downloaded and installed for you. This really makes the installation process a breeze.
Interface Layout
The interface is very intuitive and simple to learn. Icons representing each of the VoiceXML elements are displayed in a menu on the lower left side of the screen. Building a simple VoiceXML document is a matter of dragging an element from the menu into the main composition window. One of the neat things about this feature is that it won’t allow you to drop elements where they don’t belong. This provides simple enforcement of the VoiceXML standard and saves the hours that would otherwise be spent troubleshooting bugs caused by improperly nested elements.
A set of menus on the right-hand side of the screen changes based upon the element that is currently selected in the main composition window. It provides fields to fill in values for the element’s attributes. Where the attribute values have been pre-defined in the specification, a dropdown list or set of radio buttons appears. This eases the learning curve for authors who are just getting started with VoiceXML documents and applications.
The Source tab in the main composition window provides source editing capability for developers who want direct control of the code. Thankfully, you can switch back and forth between Design and Source modes without losing your changes. If you were to add a form in the source, for example, and then switch back to Design mode, the new form would be represented as a new box in the dialog.
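To make the nesting enforcement described above concrete, here is a minimal Python sketch of the kind of parent/child check such an editor might perform. It is not V-Builder’s actual implementation: the function name `find_nesting_errors` is hypothetical, and the `ALLOWED_CHILDREN` table is a small, deliberately incomplete subset chosen for illustration, not the full VoiceXML content model.

```python
import xml.etree.ElementTree as ET

# Illustrative, incomplete subset of parent -> allowed child elements.
# The real VoiceXML content model is considerably larger.
ALLOWED_CHILDREN = {
    "vxml": {"form", "menu", "meta", "var", "link"},
    "form": {"field", "block", "var", "grammar", "filled"},
    "menu": {"prompt", "choice", "noinput", "nomatch"},
    "field": {"prompt", "grammar", "filled", "noinput", "nomatch"},
    "block": {"prompt", "goto", "submit"},
    "filled": {"prompt", "goto", "submit"},
}


def find_nesting_errors(source):
    """Return a message for every element nested under a disallowed parent.

    Assumes element names carry no XML namespace prefix.
    """
    errors = []

    def walk(parent):
        allowed = ALLOWED_CHILDREN.get(parent.tag)
        for child in parent:
            # Parents missing from the table are skipped rather than guessed at.
            if allowed is not None and child.tag not in allowed:
                errors.append(
                    f"<{child.tag}> is not allowed inside <{parent.tag}>"
                )
            walk(child)

    walk(ET.fromstring(source))
    return errors


SAMPLE = """<vxml version="1.0">
  <form id="greeting">
    <field name="answer">
      <prompt>Say yes or no.</prompt>
      <choice next="#done">yes</choice>
    </field>
  </form>
</vxml>"""

for message in find_nesting_errors(SAMPLE):
    print(message)
```

A drag-and-drop editor can run the same lookup before accepting a drop, which is why the mistake never reaches the source in the first place.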
The top left side of the application contains a number of folders that contain the dialogs (or VoiceXML content), grammars (in GSL format) and prompts (recorded prompts). You will also see SpeechObjects menus for prompts and grammars if you installed the SpeechObjects packages. These pre-made components provide a great deal of functionality that would otherwise take you tens of hours to develop on your own. From the Project menu, you can add new or existing prompts, dialogs and grammars to your project. Once new components have been added to a project, they can be selected and dropped into a dialog in the compose window.
Dynamic Scripting
V-Builder provides a menu of draggable elements on the bottom left pane of the application for embedding scripting tags for ASP, PHP, and JSP. This enables developers to stay in the tool when it’s time to make the app more dynamic or access database content. You won’t find language-specific debugging features as you might in a tool such as Microsoft Visual Studio, but it is suitable for experienced developers who don’t really need those features anyway. You can intermingle dynamic tags with grammars and VoiceXML elements. Still, some developers who have a good handle on VoiceXML may prefer to drop back to their preferred editor when developing back-end scripts that query database content through a Web or application server.
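As a rough illustration of what a dynamic script does, the sketch below builds a VoiceXML menu from a list of choices that could have come from a database query. V-Builder’s scripting tags target ASP, PHP, and JSP; Python is used here only to keep the example short, and the function name `build_menu`, the prompt text, and the target file names are all hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr


def build_menu(prompt_text, choices):
    """Return a VoiceXML menu document as a string.

    choices is a list of (spoken_label, target_url) pairs, for example
    rows returned by a database query.
    """
    lines = [
        '<vxml version="1.0">',
        "  <menu>",
        # escape() protects characters such as & and < in dynamic text.
        f"    <prompt>{escape(prompt_text)}</prompt>",
    ]
    for label, target in choices:
        # quoteattr() escapes the value and adds the surrounding quotes.
        lines.append(
            f"    <choice next={quoteattr(target)}>{escape(label)}</choice>"
        )
    lines += ["  </menu>", "</vxml>"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_menu(
        "Say sales or support.",
        [("sales", "sales.vxml"), ("support", "support.vxml")],
    ))
```

Escaping the dynamic values matters more than it might seem: a single unescaped ampersand in database content is enough to make the generated document invalid XML, and the interpreter will then refuse to load it.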
SpeechObjects
Speaking of database content, V-Builder includes a number of draggable SpeechObject components that make it easy to query a database, scrape content from a Web page, or submit form content to back-end Web scripts. Of course, this can also be done with the embedded scripting tags, but it’s a nice addition that rounds out a terrific tool. SpeechObject components are accessible from a draggable menu of items in the bottom left-hand portion of the interface. Components are accessed from within the VoiceXML source via the <object> element. If you decide to use the SpeechObjects components, you should make sure that your production platform fully supports the Nuance platform, including SpeechObjects.
Recording prompts
Another valuable feature is the ability to create and record prompts and drop them into applications. This is done by loading and editing an existing .wav file, or by creating a new prompt in your project. When creating a new prompt, you are provided with a text field where you can type the prompt script for whoever will be recording the prompt. You can then have your voice talent record and review the prompts by selecting each prompt and pressing the record button. Once you’re satisfied with the prompts, you can select and drag them into a dialog from the prompts menu.
Testing Applications
V-Builder isn’t simply a VoiceXML editor. With the exception of Vocalizer, Nuance’s Text-To-Speech engine, V-Builder includes the software you need to actively test the application on your workstation as though you were dialing in through a gateway. This is possibly the feature that sets V-Builder apart from its competition. When you hit the green play button, V-Builder starts up the Nuance server software and, using the sound card, lets you interact with the application using the same software that would be run in production. Because the software used to test VoiceXML applications is essentially the same software you’d have on a Nuance VoiceXML gateway, an application that works in V-Builder will probably work the same way in production. This is also a great time saver where you might otherwise have spent several hours tuning your prompt delays and synthesized output over the telephone. One of the things I discovered early on with VoiceXML is that your application won’t flow the way you think it will, so being able to run it through the Nuance software saves time. Remember, V-Builder comes with everything you need to test a VoiceXML application. In fact, I often use V-Builder when conducting VoiceXML training seminars so that I don’t have to go through the hassle of lugging around a separate VoiceXML gateway or dialing up a voice gateway every time I want to test a piece of VoiceXML code.
Probably my favorite V-Builder feature is the ability to test grammar files. After selecting the Test grammars button at the top of the interface, you can test any of the grammars in your project by selecting it from a dropdown menu, pressing the Recognize button, and speaking into the computer’s microphone. A detailed report of the match results against the grammar is displayed after you stop speaking. There’s also a button to replay the utterance that was just matched so that you can compare the results word for word.
Since grammars are the most difficult part of developing a production-quality VoiceXML application, this tool provides a great resource for tuning and refining your grammars before you deploy them to a VoiceXML gateway for testing.
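The word-for-word comparison mentioned above can also be scripted when you keep notes on what you said versus what the recognizer reported. The following Python sketch uses the standard `difflib` module; it is independent of V-Builder, and the function name `compare_words` and the sample phrases are hypothetical.

```python
from difflib import SequenceMatcher


def compare_words(expected, recognized):
    """List the word-level differences between two phrases.

    Each entry is (operation, expected_words, recognized_words), where the
    operation is "replace", "delete", or "insert".
    """
    exp = expected.lower().split()
    rec = recognized.lower().split()
    matcher = SequenceMatcher(a=exp, b=rec, autojunk=False)
    return [
        (tag, exp[i1:i2], rec[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    for diff in compare_words("check my account balance",
                              "check my count balance"):
        print(diff)
```

Collecting these differences over a batch of test utterances gives a quick picture of which words a grammar tends to confuse.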
Deploying Applications
Once you’ve developed and tested the application in V-Builder, you can publish the application to your production Web server or gateway. This makes production rollouts easier for Web developers who may not have a good handle on packaging, distributing, and installing software. It uses the WebDAV protocol, which is turned off by default on most Web servers, so you will have to turn it on if you want push-button publishing capabilities. For me, FTP is just as easy.
Weaknesses
Even though V-Builder is a strong tool, it still has its weaknesses. This is forgivable given that it is as new as VoiceXML. A feature that still needs some work is the graphical grammar builder. Like the Design and Source views for VoiceXML documents, V-Builder includes a diagram view for creating and editing GSL grammars. While I think a visual representation of a grammar can be helpful, V-Builder has a tendency to erase the contents of the grammar when switching from the diagram to the source view. Also, the ASR engine can sometimes eat up so much memory that it causes V-Builder to crash. This can probably be solved by adding memory, however. Every now and then, I get an erroneous non-critical Java error that causes the interface to do strange things, requiring me to restart the application. I’m sure these issues will be worked out in future releases of the product. Also, keep in mind that these errors are irregular. They happen to me maybe once or twice over the course of a day during which my machine is usually using all available memory to run several large applications in addition to V-Builder.
The biggest potential weakness of V-Builder is that it’s tied to the Nuance platform. If you will be deploying VoiceXML applications to a gateway that doesn’t run the Nuance ASR, TTS, and Voyager or the Voice Web Server, then your application may not function. If you are using any of the top three voice ASPs (Voxeo, BeVocal, and TellMe), then your application will likely run fine. Additionally, many of the VoiceXML gateway vendors do provide support for the Nuance platform. If you will be deploying to a SpeechWorks or IBM Voice Server platform, applications developed in V-Builder will probably not work. If you aren’t sure whether your vendor or ASP supports V-Builder, you might call them up and ask. They’ll probably know the answer.
Conclusion
Overall, I give V-Builder a 4 out of 5 rating. I didn’t give it a 5 because of the few minor bugs that still exist. Also, the tool does not yet support the VoiceXML 2.0 specification and grammar format. From what I hear, these issues will be resolved in the next release, and the product will likely receive a 5 from this reviewer in the future. From examining and using V-Builder, I really appreciate the time and effort that has been spent to make it easy for developers to begin developing VoiceXML applications with minimal effort. It’s obvious that the folks at Nuance know their stuff and have gone the extra mile to provide a complete set of tools that will help VoiceXML developers realize a significant productivity gain over a simple text editor. Keep up the good work, guys.
About Jonathan Eisenzopf
Jonathan is a member of the Ferrum Group, LLC, a firm based in Reston, Virginia, that specializes in Voice Web consulting and training. He has also written articles for other online and print publications including WebReference.com and WDVL.com. Feel free to send an email to eisen@ferrumgroup.com regarding questions or comments about the VoiceXML Developer series, or for more information about training and consulting services.