Implementing CTI Using the Microsoft Speech Server on CRM/Contact Center Environment

Overview

Computer Telephony Integration (CTI) technology is widely used in contact center applications. CTI essentially combines voice and data across different vendors’ phone switches, which lets a contact center deliver more effective and efficient customer service. Figure 1 shows a typical architecture of traditional CTI in a contact center environment.

Figure 1: A typical traditional architecture of CTI integrating with IVR in a Call Center

MS Speech Server 2004 launched in March of this year and has started shipping to customers. Figure 2 shows how MS Speech Server and a speech application work. The big difference between an MS speech application and a traditional IVR is that the MS Speech Server-based application runs on the .NET Web framework with a SALT (Speech Application Language Tags) architecture.

Figure 2: How MS Speech Server and Speech Application work

Speech Engine Services (SES)

Speech Engine Services (SES) provides engine services on the server side, including a speech recognition engine, a prompt engine, and a TTS engine, to telephony callers through Telephony Application Services (TAS). SES and TAS can reside on either the same server or separate servers. The speech recognition engine handles the caller’s speech input; the prompt engine joins prerecorded prompts from a prompt database and plays them back to the caller; the TTS (Text-to-Speech) engine synthesizes audio output from a text string using the ScanSoft engine. Simply put, SES plays the role of a voice or speech browser that interacts with the Web server, much as IE does in a regular Web application.
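
The division of labor between the prompt engine and the TTS engine can be pictured with a small sketch. It is written in Python purely for illustration and does not use any Speech Server API: the `PROMPT_DB` dictionary, the file names, and the `plan_playback` function are all hypothetical, and the rule that unrecorded text falls through to TTS is an assumption of this sketch rather than something stated above.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a prompt database: recorded phrase -> audio file.
PROMPT_DB: Dict[str, str] = {
    "your balance is": "balance_is.wav",
    "dollars": "dollars.wav",
}


def plan_playback(segments: List[str],
                  prompt_db: Dict[str, str]) -> List[Tuple[str, str]]:
    """Build a playback plan for a prompt that is split into segments.

    Each entry is ("recorded", file_name) when a prerecorded prompt exists
    for the segment, or ("tts", text) when the text has to be synthesized.
    """
    plan: List[Tuple[str, str]] = []
    for segment in segments:
        text = segment.strip()
        recording = prompt_db.get(text.lower())
        if recording is not None:
            plan.append(("recorded", recording))
        else:
            # Assumption: anything without a recording is handed to TTS.
            plan.append(("tts", text))
    return plan


if __name__ == "__main__":
    for kind, value in plan_playback(["Your balance is", "42", "dollars"], PROMPT_DB):
        print(kind, value)
```

In this toy plan the fixed phrases come from recordings and only the variable part (the amount) is synthesized, which is the usual reason for combining the two engines.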

Telephony Application Services (TAS)

Telephony Application Services (TAS) serves as a connection proxy between PBXs and SES by managing a set of SALT interpreters, controllers, and telephony interface managers. TAS comprises both telephony hardware and a software interface. So far, the telephony hardware that works with TAS is the Intel Dialogic D41JCT, DM/V480, and DM/V960, which have 4, 48, and 96 voice ports, respectively. TAS works with third-party Telephony Interface Manager (TIM) software; right now, Intel NetMerge CallManager and InterVoice TIM are available.

Web Server

The Web server hosts MS IIS and the ASP.NET Speech Controls. Your speech application, developed with the MS Speech SDK, is also deployed and run on the Web server. While the speech application is running, the Web server dynamically interacts with TAS/SES and serves Web pages. In a real-world application, it can also connect to a database, a CRM system, contact center and CTI servers, third-party servers, and so forth.

MS Speech Server does not support CTI directly so far; you have to use third-party CTI products to implement CTI features and integrate them with MS Speech Server. The following sections describe the integration between MS Speech Server and Intel CPS CTI, Genesys CTI, and Cisco ICM CTI.

Intel CPS CTI Integration with MS Speech

Figure 3 shows a typical architecture of data and voice flow between MS Speech Server and Intel CPS (Call Processing Software, formerly Intel CT Connect).

Figure 3: An architecture of Intel CPS integrating with MS Speech Server

  1. A caller calls an MS Speech Server-based speech application. The inbound call is answered by MS Speech Server (TAS/SES).
  2. The speech application, developed with the MS Speech SDK, retrieves call information and CTI messages, such as the port number of the telephony card, ANI, DNIS, and so forth, from the AnswerCall control of the MS Speech SDK.
  3. The speech application looks up the Intel CPS CIM (Call Information Manager) based on the extension and DNIS. The DNIS can be associated with different service levels or any other logic you want, and the subsequent prompts are played to the caller.
  4. The caller interacts with the speech application by either speech or DTMF through dynamic SALT pages on the Web server. Ultimately, the caller may need to talk to a live service agent, which means the application transfers the current call to an appropriate agent extension.
  5. The speech application retrieves from the CPS CIM the destination queue to which the current call will be transferred or routed, according to the rules set by your business logic. For example, calls can be routed to different phone queues by service level.
  6. The speech application routes the current call to a specific phone queue on the PBX; the call then goes to an agent’s extension when an agent is available.
  7. When the call arrives, the desktop application identifies it and pops up the caller’s data automatically by looking up the CPS CIM database and the enterprise database using ADO.NET. The agent can then wrap up the data and interact with CRM or contact center applications on the desktop.
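
The lookup logic in steps 3, 5, and 7 can be sketched as follows. This is a minimal illustration in Python, not Intel CPS or Speech SDK code: the dictionaries stand in for CIM and enterprise database tables, and every name, phone number, and queue number here is a hypothetical placeholder. Keying the screen pop on ANI is also an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CallInfo:
    """Call details the speech application reads when it answers (step 2)."""
    port: int   # telephony card port that answered the call
    ani: str    # caller's number
    dnis: str   # number the caller dialed


# Hypothetical stand-ins for CIM tables.
SERVICE_LEVEL_BY_DNIS: Dict[str, str] = {
    "8005550100": "gold",
    "8005550199": "standard",
}
QUEUE_BY_SERVICE_LEVEL: Dict[str, str] = {"gold": "7001", "standard": "7002"}
DEFAULT_QUEUE = "7000"

# Hypothetical stand-in for the enterprise customer database.
CUSTOMERS_BY_ANI: Dict[str, Dict[str, str]] = {
    "4165550123": {"name": "A. Caller", "account": "10042"},
}


def choose_queue(call: CallInfo) -> str:
    """Steps 3 and 5: map DNIS to a service level, then to a phone queue."""
    level = SERVICE_LEVEL_BY_DNIS.get(call.dnis)
    return QUEUE_BY_SERVICE_LEVEL.get(level, DEFAULT_QUEUE)


def screen_pop(call: CallInfo) -> Optional[Dict[str, str]]:
    """Step 7: data the agent desktop shows, or None for an unknown caller."""
    return CUSTOMERS_BY_ANI.get(call.ani)


if __name__ == "__main__":
    call = CallInfo(port=3, ani="4165550123", dnis="8005550100")
    print("route to queue", choose_queue(call))
    print("screen pop:", screen_pop(call))
```

Keeping a default queue means a call with an unrecognized DNIS is still delivered to an agent group instead of being dropped.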

Genesys CTI Integration with MS Speech

Figure 4 shows a typical workflow of data and voice between MS Speech Server and Genesys CTI.

Figure 4: An architecture of Genesys CTI integrating with MS Speech Server

  1. A caller calls an MS Speech Server-based speech application. The inbound call is answered by MS Speech Server (TAS/SES).
  2. The speech application, developed with the MS Speech SDK, retrieves call information and CTI messages, such as the port number of the telephony card, ANI, DNIS, and so forth, from the AnswerCall control of the MS Speech SDK.
  3. The caller interacts with the speech application by either speech or DTMF through dynamic SALT pages on the Web server. Ultimately, the caller may need to talk to a live service agent, which means the application transfers the current call to an appropriate agent extension.
  4. The speech application running on the Web server connects to Genesys CTI through the Genesys T-Server and sends a routing request, keyed by ANI/DNIS and so on, along with the associated caller data to the Genesys Interaction Router.
  5. Genesys CTI looks up the enterprise database to identify the caller, then retrieves the related data and the appropriate phone queue to which the current call will be routed, according to the rules set by your business logic. For example, calls can be routed to different phone queues by caller ID.
  6. Genesys CTI notifies the speech application that the data has been retrieved.
  7. The speech application routes the current call to a specific phone queue on the PBX; the call then goes to an agent’s extension when an agent is available.
  8. When the call arrives, the desktop application simultaneously identifies it and pops up the caller’s data. If necessary, the agent can retrieve the caller’s full information by looking up the enterprise database. The agent can also interact with CRM or contact center applications on the desktop.
  9. If you need to, you can implement conference, outbound, callback, and so forth on the desktop by using Genesys CTI features.
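
The request and response exchanged in steps 4 through 6 can be modeled roughly as below. This Python sketch deliberately avoids the Genesys API: `RouteRequest`, `RouteResponse`, `handle_route_request`, and the sample tables are hypothetical names invented for illustration, and the in-process function merely stands in for the router and the enterprise database lookup.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RouteRequest:
    """What the speech application sends (step 4)."""
    ani: str
    dnis: str
    attached_data: Dict[str, str] = field(default_factory=dict)


@dataclass
class RouteResponse:
    """What comes back to the speech application (step 6)."""
    queue: str
    caller_data: Dict[str, str]


# Hypothetical enterprise database keyed by caller ID, plus routing rules.
ENTERPRISE_DB: Dict[str, Dict[str, str]] = {
    "4165550123": {"name": "A. Caller", "tier": "premium"},
}
QUEUE_BY_TIER: Dict[str, str] = {"premium": "8001"}
DEFAULT_QUEUE = "8000"


def handle_route_request(request: RouteRequest) -> RouteResponse:
    """Step 5: identify the caller, pick a queue, and return merged data."""
    record = ENTERPRISE_DB.get(request.ani, {})
    queue = QUEUE_BY_TIER.get(record.get("tier", ""), DEFAULT_QUEUE)
    # Data collected in the speech dialog travels with the call so the
    # agent desktop can show it in the screen pop (step 8).
    caller_data = {**request.attached_data, **record}
    return RouteResponse(queue=queue, caller_data=caller_data)


if __name__ == "__main__":
    req = RouteRequest(ani="4165550123", dnis="8005550100",
                       attached_data={"intent": "billing"})
    print(handle_route_request(req))
```

The point of the attached data is that whatever the caller already told the speech application does not have to be asked again by the agent.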

Cisco ICM CTI Integration with MS Speech

Figure 5 shows a typical workflow of data and voice between MS Speech Server and Cisco ICM CTI.

Figure 5: An architecture of Cisco ICM integrating with MS Speech Server

  1. A caller calls an MS Speech Server-based speech application. The inbound call is answered by MS Speech Server (TAS/SES).
  2. The speech application, developed with the MS Speech SDK, retrieves the call information and CTI messages, such as the port number of the telephony card, CLID, DN, CED, and so forth, from the AnswerCall control of the MS Speech SDK.
  3. The caller interacts with the speech application by either speech or DTMF through dynamic SALT pages on the Web server. Ultimately, the caller may need to talk to a live service agent, which means the application transfers the current call to an appropriate agent extension.
  4. The speech application running on the Web server sends a call routing request to the Cisco PG (Peripheral Gateway). The PG forwards this request to the Call Router of Cisco ICM over the Service Control Interface (SCI) protocol. The PG also interacts with the CTI Server; the ICM CTI Server can either reside on the PG or run on a separate box.
  5. Cisco ICM runs an appropriate routing script and looks up the enterprise database to identify the caller and retrieve the related data; then, it creates a destination for the call and/or selects the appropriate ACD queue to which the current call will be routed, according to the rules set by your business logic. For example, calls can be routed to different phone queues by CLID.
  6. Cisco ICM notifies the speech application, via the Cisco PG, that the call destination has been created.
  7. The speech application routes the current call to a specific phone queue on the PBX/ACD; the call then goes to an agent’s extension when an agent is available.
  8. When the call arrives, the Cisco CTI OS desktop application or a customized desktop application simultaneously identifies it and pops up the caller’s data. If necessary, the agent can retrieve the caller’s full information by looking up the enterprise database. The agent can also interact with CRM or contact center applications on the desktop.
  9. If you need to, you can implement or use conference, outbound, Web chat, callback and delayed callback, softphone, and so forth on the desktop by using Cisco ICM CTI OS features.
  10. Cisco ICM is powerful in its call routing and CTI functionality. You can integrate Speech Server with ICM in different architectures, depending on your business logic. For example, you can place Speech Server in front of or behind the PBX/ACD. You also can choose whether or not Speech Server interacts with the Cisco PG and ICM. Under different architectures, you will have different workflows.
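
The idea of a routing script in step 5 can be illustrated as an ordered list of rules evaluated from top to bottom. The Python sketch below is not Cisco ICM script syntax and uses no Cisco API: `IcmRouteRequest`, `ROUTING_SCRIPT`, the queue labels, and the sample CLID are hypothetical, and treating CED as a single menu digit is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple


@dataclass(frozen=True)
class IcmRouteRequest:
    """Fields the speech application passes along with a route request."""
    clid: str  # calling line ID
    dn: str    # dialed number
    ced: str   # caller-entered digits (assumed here to be one menu choice)


# Hypothetical list of callers that get priority treatment.
VIP_CLIDS: Set[str] = {"4165550123"}

Rule = Tuple[Callable[[IcmRouteRequest], bool], str]

# Rules are checked in order; the first predicate that matches wins.
ROUTING_SCRIPT: List[Rule] = [
    (lambda r: r.clid in VIP_CLIDS, "ACD_QUEUE_VIP"),
    (lambda r: r.ced == "1", "ACD_QUEUE_BILLING"),
    (lambda r: r.ced == "2", "ACD_QUEUE_SUPPORT"),
]
DEFAULT_DESTINATION = "ACD_QUEUE_GENERAL"


def run_routing_script(request: IcmRouteRequest) -> str:
    """Return the destination label chosen for this call."""
    for matches, destination in ROUTING_SCRIPT:
        if matches(request):
            return destination
    return DEFAULT_DESTINATION


if __name__ == "__main__":
    print(run_routing_script(IcmRouteRequest("4165550123", "8005550100", "2")))
    print(run_routing_script(IcmRouteRequest("9055550111", "8005550100", "1")))
```

Because the rules are ordered, a VIP caller who pressed 2 still lands in the VIP queue; reordering the list is how the business logic would change that priority.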

Implement CTI Features Using Vendors’ Programming APIs

In the preceding sections, I described how to integrate MS Speech Server with Intel, Cisco, and Genesys CTI at the application or database level. You also can use a CTI vendor’s programming API to develop and customize complicated CTI features and integrate them with MS Speech Server on the MS .NET Framework. Because the APIs from Intel, Cisco, and Genesys are all provided as unmanaged components based on C/C++/Java, you have to do some additional work when you want to use them in a .NET development environment. I will describe how to implement CTI using an unmanaged programming API in the next article.

Conclusion

MS Speech Server can integrate seamlessly with different vendors’ CTI products at different levels. Also, the MS Speech Application SDK is a powerful tool that helps customers implement speech IVR and integrate CTI quickly.

About the Author

Xiaole Song is a professional in designing, integrating, and consulting on CTI, contact center, IVR, IP telephony, CRM, and speech applications. He has performed various roles for Intel, Dialogic, Minacs, and others. Feel free to email any comments about this article or inquiries about consulting services to xiaole_song@yahoo.com.