VoiceXML 2.0 Grammars, Part I

  • By Jonathan Eisenzopf

This technical series provides programmers with a complete introduction to the VoiceXML 2.0 grammar format. In Part I, we discuss the XML and ABNF formats, as well as the structure and elements included in a VXML 2.0 document.


Grammars define the words and sentences (or touch-tone DTMF input) that can be recognized by a VoiceXML application. One big drawback of VoiceXML 1.0 was that it lacked a standard speech recognition grammar format. To some degree, this reduced the benefits of the specification because it left VoiceXML browser developers to define their own grammar language and format. For example, application grammars written for Nuance Voice Web Server would have to be rewritten to work on IBM Voice Server. This problem was rectified by the Speech Recognition Grammar Specification (SRGS), introduced by the W3C Voice Browser Working Group in conjunction with the VoiceXML 2.0 specification.


The VoiceXML 2.0 grammar specification provides two text formats for writing speech recognition grammars: XML and ABNF. XML is a Web standard for representing structured data. Many programming and editing tools incorporate XML editing and processing capabilities, and these tools can be used to write VoiceXML 2.0 grammars. ABNF stands for Augmented Backus-Naur Form, a notation used to specify languages, protocols, and text formats. For example, HTTP, the communications protocol used on the World Wide Web (and for VoiceXML applications), is specified in an augmented BNF notation.

The ABNF grammar format uses special characters to define grammar expressions in a text string, while XML grammars are composed of text strings enclosed in XML elements. Whether to use the ABNF or XML format is up to you. However, VoiceXML 2.0 only requires implementers to support the XML format, so you may want to write grammars in XML if portability is important to you.
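To make the contrast concrete, here is a minimal sketch that holds the same tiny yes/no grammar in both formats and reads the XML form with Python's standard library. The grammar itself, the rule name "answer", and the helper list_alternatives are illustrative assumptions, not taken from a real application; the point is only that the XML form can be processed by ordinary XML tools.

```python
import xml.etree.ElementTree as ET

# Namespace used by XML-form SRGS grammars.
SRGS_NS = "http://www.w3.org/2001/06/grammar"

# ABNF form: special characters ($, |, ;) carry the structure.
# Shown for comparison only; this sketch does not parse it.
ABNF_GRAMMAR = """#ABNF 1.0 UTF-8;
language en-US;
mode voice;
root $answer;
public $answer = yes | no;
"""

# XML form: the same grammar, with the structure carried by elements.
XML_GRAMMAR = """<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" mode="voice" root="answer">
  <rule id="answer" scope="public">
    <one-of>
      <item>yes</item>
      <item>no</item>
    </one-of>
  </rule>
</grammar>
"""


def list_alternatives(xml_text, rule_id):
    """Return the <item> texts directly under <one-of> in the named rule.

    Hypothetical helper: it only handles the flat rule shape used above.
    """
    root = ET.fromstring(xml_text)
    ns = {"g": SRGS_NS}
    for rule in root.findall("g:rule", ns):
        if rule.get("id") == rule_id:
            items = rule.findall("g:one-of/g:item", ns)
            return [(item.text or "").strip() for item in items]
    return []


if __name__ == "__main__":
    print(list_alternatives(XML_GRAMMAR, "answer"))
```

The ABNF version takes five lines where the XML version takes ten, which gives a feel for the difference in verbosity even on a trivial grammar.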

If you're already experienced with the GSL or JSGF grammar formats, then you'll likely prefer the ABNF format because of its similarity. If you decide to use the XML format, you will quickly discover that it is extremely verbose compared to ABNF, making it more difficult to read. On the other hand, using the DTD or XML Schema for the XML grammar format in conjunction with an XML editor makes the task less tedious and reduces syntax errors. The authors of the VoiceXML 2.0 grammar format have also included an XSL style sheet for converting XML grammars to ABNF format, which may aid linguists who prefer to proof grammars in a less verbose text format.
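As a rough illustration of how XML tooling catches syntax errors early, the sketch below checks whether a grammar string is well-formed XML using Python's standard library. This is a weaker check than validating against the grammar DTD or XML Schema, which requires a validating parser or editor; the function name check_well_formed and both sample grammars are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses as XML, else the parser's error message.

    Well-formedness only: this does not validate against the SRGS DTD/Schema.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


GOOD_GRAMMAR = (
    '<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0" root="r">'
    '<rule id="r"><item>yes</item></rule>'
    "</grammar>"
)

# Same grammar with the closing </item> tag left out.
BAD_GRAMMAR = (
    '<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0" root="r">'
    '<rule id="r"><item>yes</rule>'
    "</grammar>"
)

if __name__ == "__main__":
    print(check_well_formed(GOOD_GRAMMAR))
    print(check_well_formed(BAD_GRAMMAR))
```

A mistake such as the missing closing tag is reported with a line and column position before the grammar ever reaches a speech recognizer.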

Examples will be listed in both ABNF and XML format.


This article was originally published on October 7, 2002
