
VoiceXML Developer Series: A Tour Through VoiceXML, Part VII

  • By Jonathan Eisenzopf

In the last two editions of the VoiceXML Developer, we learned how to create VoiceXML grammars in both GSL and JSGF formats. In this edition, we're going to learn how to record and play back speech and how to transfer callers to another phone number.

Record speech with the <record> element

The <record> element records spoken input and assigns the contents to a VoiceXML variable defined by the name attribute.

<record name="caller_name" beep="true" maxtime="10s"
    finalsilence="2000ms" type="audio/wav" dtmfterm="true" />

The beep attribute determines whether an audible tone is played before the gateway begins recording. Most people are used to hearing a tone on answering machines and voice mail systems as a signal to begin speaking. By default, this is set to false.

The maxtime attribute specifies the maximum length of the recording as a time value (10s, or ten seconds, in the example above). The system automatically stops recording when this limit is reached if the user is still speaking and hasn't otherwise indicated that the recording is complete.

The finalsilence attribute sets how much silence, in milliseconds, signals the system to stop recording input. If you set this value too low, the system might stop recording when the speaker pauses between sentences or takes a breath, so be careful, cowboy.

The type attribute contains the MIME type of the audio format in which the recording will be saved. The supported formats differ depending on the VoiceXML gateway platform you're using; however, the audio/wav format should be standard on most, if not all, VoiceXML platforms.

When the dtmfterm attribute is set to true, the system will stop recording input when it hears a DTMF tone. This can be any button on a standard telephone keypad. It can be used instead of or in addition to the finalsilence attribute, which stops recording input when it hears a pause.
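As a rough illustration of how the two ways of ending a recording can work together, here is a minimal sketch. The form id, the variable name voicemail, the prompt wording, and the timing values are all invented for illustration, and the set of supported audio types still depends on your gateway.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="leave_message">
    <!-- "voicemail" is a hypothetical variable name, not taken from the article -->
    <!-- Recording stops on whichever comes first: 30 seconds elapsed,
         3 seconds of silence, or any key press -->
    <record name="voicemail" beep="true" maxtime="30s"
        finalsilence="3000ms" type="audio/wav" dtmfterm="true">
      <prompt>
        After the tone, record your message.
        Press any key when you are finished.
      </prompt>
    </record>
  </form>
</vxml>
```

A longer finalsilence is more forgiving of pauses, and telling callers they can press a key gives them a way to finish without waiting out the silence.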

Because the <record> element is essentially a form field that holds recorded audio rather than text, it can contain prompts and event handlers. The example below collects two recordings: first the customer's name, then their message. These recordings are then sent to a back-end Perl script for processing.

Click here to see example 1

On lines 8-11 in the example above, we record the customer's name. If we don't get any input, a noinput event is thrown and the <noinput> element on line 10 handles it by reprompting the user. Once we have the customer's name, we record their emergency on lines 12-15. On line 18, the <submit> element sends both recordings to the script, and the call ends.
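The listing for example 1 isn't reproduced on this page, so what follows is only a sketch of what such a form might look like, not the original code, and its line numbers won't match the ones cited above. The variable name emergency, the prompt wording, the timing values, and the script URL are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="report">
    <!-- First recording: the customer's name -->
    <record name="caller_name" beep="true" maxtime="10s"
        finalsilence="2000ms" type="audio/wav" dtmfterm="true">
      <prompt>Please say your name after the tone.</prompt>
      <noinput>
        I didn't hear anything.
        <reprompt/>
      </noinput>
    </record>

    <!-- Second recording: "emergency" is a hypothetical variable name -->
    <record name="emergency" beep="true" maxtime="30s"
        finalsilence="3000ms" type="audio/wav" dtmfterm="true">
      <prompt>Please describe your emergency after the tone.</prompt>
      <noinput>
        I didn't hear anything.
        <reprompt/>
      </noinput>
    </record>

    <!-- Post both recordings to a back-end script; the URL is a placeholder -->
    <block>
      <submit next="http://www.example.com/cgi-bin/record.pl"
          method="post" enctype="multipart/form-data"
          namelist="caller_name emergency"/>
    </block>
  </form>
</vxml>
```

Recorded audio is binary data, which is why the sketch posts it as multipart/form-data rather than using the default URL encoding. The script would then be expected to return a VoiceXML document that wraps up the call.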


This article was originally published on October 7, 2002
