SALT (Part II): Elements of SALT


In the second part of our “SALT” series we will endeavor to understand the syntax of the key SALT elements and their usage.


The <prompt> element is the main element for speech output. It allows
SALT-based applications to play prompts using TTS (text-to-speech) synthesis
or pre-recorded audio. The SALT 1.0 Specification requires SALT Browsers
to support W3C SSML (Speech Synthesis Markup Language); however, other TTS
markup languages may be supported as well.

    <prompt id="" bargein="" prefetch="">
        <!-- content of the prompt -->
    </prompt>

Prompts are queued and played back by the SALT application using the prompt_id.Queue()
and prompt_id.Start() methods.
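As a sketch, assuming a page that defines a prompt with id "Welcome" (as in the first example below) and a page-level script block, queuing and playback might look like this:

```xml
<!-- Hypothetical page fragment: queue a prompt, then start playback -->
<prompt id="Welcome">
    Welcome to your SALT application.
</prompt>
<script type="text/javascript">
    // Queue the prompt, then start playback of the prompt queue.
    Welcome.Queue();
    Welcome.Start();
</script>
```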


  1. Simple TTS prompt:
    <prompt id="Welcome">
        Welcome to your SALT application. What would you like to do today?
    </prompt>
  2. Pre-recorded audio:
    <prompt id="RecordedPrompt">
        <content href="welcome.wav"/>
    </prompt>
  3. Prompt using external SSML (Speech Synthesis Markup Language) content:
    <prompt id="ExternalContent">
        <content href="Welcome.ssml" type="application/ssml+xml"/>
    </prompt>
  4. Dynamic prompt:
    <prompt id="DynamicPrompt">
        Did you say <value targetelement="txtOption" targetattribute="value"/>?
    </prompt>


The <grammar> element is used within <listen> elements (described
later) to specify input grammars. A speech grammar specifies the set of
utterances that a user may speak to perform an action. SALT Browsers are
required to support the XML form of the W3C SRGS (Speech Recognition Grammar
Specification); other grammar formats may be supported as well.

    <grammar name="" src="" type="" xml:lang="">
        <!-- content of the grammar (when an inline grammar is used) -->
    </grammar>



  1. Using an inline grammar:
    <grammar>
        <!-- inline grammar in SRGS format -->
    </grammar>
  2. Using an external grammar:
    <grammar src="Employees.grxml" type="application/srgs+xml"/>
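For illustration, an inline grammar in the W3C SRGS XML format nests the SRGS grammar document inside the SALT <grammar> element; the rule name and phrases below are hypothetical:

```xml
<!-- SALT <grammar> element containing an inline W3C SRGS grammar -->
<grammar type="application/srgs+xml">
    <grammar xmlns="http://www.w3.org/2001/06/grammar"
             version="1.0" xml:lang="en-US" root="options">
        <rule id="options">
            <one-of>
                <item>check my email</item>
                <item>get the weather</item>
                <item>transfer to an operator</item>
            </one-of>
        </rule>
    </grammar>
</grammar>
```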


The <record> element is used within <listen> elements (described
later) for speech recording. One <listen> element can contain only one <record>
element.

    <record type="" beep=""/>

Examples of the <record> element are illustrated with the examples of the <listen> element.


The <bind> element is used within <listen> elements (described later) for
processing results from speech recording or speech recognition.

    <bind targetelement="" targetattribute="" value="" test=""/>

Examples of the <bind> element are illustrated with the examples of the <listen> element.


The <listen> element is used for speech recognition and/or audio
recording. When it is used for speech recognition, the element contains
<grammar> elements that identify the alternative inputs the platform
must recognize. Similarly, when the element is used for speech recording,
it contains a <record> element to manage the recording process. Once
the results have been obtained, a <bind> element can be used to process them.
    <listen id="" initialtimeout="" babbletimeout="" maxtimeout=""
        endsilence="" reject="" xml:lang="" mode="">
        <!-- content of listen (record/grammar & bind) -->
    </listen>


When <listen> is used for speech recognition, the Start(), Stop(), and
Activate() methods are used to start recognition, stop recognition, and activate
grammars, respectively. When it is used for speech recording, Start() and Stop()
are used to start and stop the recording process.
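As a sketch (using the listenEmployeeName id from the examples that follow; the wrapper functions are assumptions), recognition could be driven from page script:

```xml
<script type="text/javascript">
    function startListening() {
        // Begin recognition; the active grammars determine what can be said.
        listenEmployeeName.Start();
    }
    function stopListening() {
        // Stop recognition early, e.g. from a button's onclick handler.
        listenEmployeeName.Stop();
    }
</script>
```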


  1. Using the <listen> element for speech recognition. The grammar is
    specified by the external URL MyGrammar.grxml, and once recognition completes,
    the "//employee_name" result is bound to the input element "txtName".
    <listen id="listenEmployeeName">
        <grammar src="MyGrammar.grxml"/>
        <bind targetelement="txtName" value="//employee_name"/>
    </listen>
  2. Using the <listen> element for speech recognition. The grammar is
    specified by the external URL MyGrammar.grxml, and once recognition completes,
    the function processEmployeeName() is called, which assigns the values to the
    proper fields.
    <listen id="listenEmployeeName" onreco="processEmployeeName">
        <grammar src="MyGrammar.grxml"/>
    </listen>
    <script>
        function processEmployeeName() {
            // assign the recognized values to the proper fields
        }
    </script>
  3. Using <listen> for voice recording:
    <listen id="recordMessage" onreco="processMessage">
        <record beep="true"/>
    </listen>
    <script>
        function processMessage() {
            // process the recorded message
        }
    </script>


The <dtmf> element is used in telephony applications for recognizing DTMF
(Dual Tone Multi-Frequency) inputs (inputs from a touch-tone keypad). The usage
is very similar to that of the <listen> element (when it is used for speech
recognition).

    <dtmf id="" initialtimeout="" interdigittimeout=""
        endsilence="" preflush="">
        <!-- content of dtmf (grammar & bind) -->
    </dtmf>

The Start() and Stop() methods are used on <dtmf> objects to start and stop
the DTMF collection process. Once DTMF recognition is complete, the onreco
event is fired.
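As a sketch (using the dtmfPIN id from the examples below; the trigger is an assumption), DTMF collection could be started from script:

```xml
<script type="text/javascript">
    // Begin collecting keypad digits; the onreco event fires
    // once DTMF recognition completes.
    dtmfPIN.Start();
</script>
```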


  1. <dtmf id="dtmfPIN">
        <grammar src="PIN_Grammar.grxml"/>
        <bind value="//pin" targetelement="txtPIN"/>
    </dtmf>
  2. <dtmf id="dtmfPIN" onreco="processPIN">
        <grammar src="PIN_Grammar.grxml"/>
    </dtmf>
    <script>
        function processPIN() {
            // process the collected PIN
        }
    </script>


The <smex> element, which stands for Simple Message Extension, communicates
with platform-specific functionality such as logging and telephony call control.
It is also a means by which SALT platform implementations can introduce new
(platform-specific) functionality that is not included in the current
SALT Specification. The <smex> element may have <bind> and <param> elements
as its children.

    <smex id="" sent="" timer="">
        <!-- smex content -->
    </smex>

The onreceive event is fired once the SALT Browser receives a platform message.
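As an illustrative sketch (the id, message content, handler, and XPath below are all hypothetical, since <smex> payloads are platform-specific), a page might send a message to the platform and handle the reply:

```xml
<smex id="logMessage" sent="Application started" onreceive="handleReceive">
    <bind targetelement="txtStatus" value="//status"/>
</smex>
<script type="text/javascript">
    function handleReceive() {
        // Inspect the platform message delivered to the <smex> element.
    }
</script>
```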

To be Continued

We will continue our exploration of SALT in the next article by previewing the
Microsoft .NET Speech SDK, a toolkit focused on building dynamic
speech applications using SALT.


About Hitesh Seth

A freelance author and known speaker, Hitesh is a columnist on VoiceXML
technology in XML Journal and regularly writes for other technology publications,
including Java Developer's Journal, Web Services Journal, and The Computer
Bulletin, on emerging technology topics such as J2EE, Microsoft .NET, XML,
wireless computing, speech applications, Web services, and enterprise/B2B
integration. He is the conference chair for VoiceXML Planet Conference & Expo.
Hitesh received his bachelor's degree from the Indian Institute of Technology
Kanpur (IITK), India. Feel free to email any comments or suggestions
about the articles featured in this column at hks@hiteshseth.com.
