
Introduction to SALT (Part 3): Applying SALT

  • By Hitesh Seth

The speech recognition component (commonly referred to as Automatic Speech Recognition, or ASR) recognizes spoken user utterances and matches them against a list of possible interpretations defined by a specified grammar. The speech synthesis component (commonly referred to as Text-to-Speech, or TTS) dynamically converts text messages into voice output.
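In SALT terms, TTS maps to the <prompt> element and ASR to the <listen> element, which references a grammar. The following is a minimal, hypothetical page fragment sketched from the SALT 1.0 specification; the grammar file name and field id are illustrative assumptions:

```html
<!-- Hypothetical SALT-enabled page fragment (names are illustrative) -->
<html xmlns:salt="http://www.saltforum.org/2002/SALT">
<body>
  <!-- TTS: speak a question to the user -->
  <salt:prompt id="askCity">Which city are you flying to?</salt:prompt>

  <!-- ASR: listen for an utterance constrained by a grammar -->
  <salt:listen id="recoCity">
    <salt:grammar src="cities.grxml"/>
    <!-- copy the recognized value into an ordinary form field -->
    <salt:bind targetelement="city" value="//city"/>
  </salt:listen>

  <input type="text" id="city"/>
</body>
</html>
```

Because SALT elements sit alongside ordinary HTML, the same field can be filled by keyboard or by voice, which is the essence of the multimodal interaction discussed below.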

The telephony integration component connects the speech platform with the world of telephones: the Public Switched Telephone Network (PSTN). This is typically achieved using telephony cards from vendors such as Intel/Dialogic, connected to your telephony provider (i.e., your phone company) via analog or digital telephone lines.

When multimodality is used, the regular web application delivery framework (based on TCP/IP, HTTP, HTML, JavaScript, etc.) delivers the web application, while the speech/telephony platform handles the speech/voice aspect of the interaction, depending on the nature of the connection and the location of the speech recognition/synthesis components. Of course, both of these interactions can happen together seamlessly, as part of the same user session, depending on the user's choice.

.NET Speech SDK

You might be wondering where the .NET Speech SDK fits in. The current preview, available from Microsoft's site, really has two components: (a) an add-in for Microsoft Internet Explorer that recognizes SALT tags and allows the user to interact with the application using the desktop's microphone and speakers/headphones, and (b) a set of ASP.NET-based Speech controls that allow developers using Microsoft Visual Studio .NET to create multimodal/telephony applications and/or add speech interactivity to existing web applications built with the Microsoft .NET Framework and ASP.NET.
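To give a flavor of the server-side approach, here is a rough sketch of what an ASP.NET Speech control might look like in a page. This is an assumption based on the SDK beta's QA-style controls; the tag prefix, property names, and grammar file are illustrative, not taken from the SDK documentation:

```html
<%@ Page Language="C#" %>
<!-- Hypothetical Speech control markup; names are illustrative assumptions -->
<html>
<body>
  <form runat="server">
    <speech:QA id="AskCity" runat="server">
      <Prompt InlinePrompt="Which city are you flying to?" />
      <Reco>
        <Grammars>
          <speech:Grammar Src="cities.grxml" runat="server" />
        </Grammars>
      </Reco>
    </speech:QA>
  </form>
</body>
</html>
```

The key point is that the control renders SALT markup (such as the prompt and listen elements shown earlier) to the browser at runtime, so the developer works at the level of server controls rather than hand-authoring SALT tags.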

I would like to point out that it is quite possible for a SALT-based application to be delivered using a non-ASP.NET web application framework (e.g., Perl or JavaServer Pages). What the .NET Speech SDK really provides is ease of development in adding speech to your existing web applications or creating new ones.

To be Continued

We will continue our exploration of SALT in the next article, where we will actually start developing a SALT-based multimodal and telephony application using the Microsoft .NET Speech SDK, an extension to Microsoft Visual Studio .NET focused on building dynamic speech applications based on the SALT specification. You might want to prepare by ordering the .NET Speech SDK Beta from the Microsoft site (a link is provided below).


About Hitesh Seth

A freelance author and well-known speaker, Hitesh is a columnist on VoiceXML technology in XML Journal and regularly writes for other technology publications on emerging technology topics such as J2EE, Microsoft .NET, XML, Wireless Computing, Speech Applications, Web Services & Enterprise/B2B Integration. He is the conference chair for VoiceXML Planet Conference & Expo. Hitesh received his bachelor's degree from the Indian Institute of Technology Kanpur (IITK), India. Feel free to email any comments or suggestions about the articles featured in this column at hks@hiteshseth.com.


This article was originally published on November 7, 2002
