Speech Authentication Strategies, Risk Mitigation, and Business Metrics
The telephone was invented more than 150 years ago, and continues to be a very important means for us to communicate with each other. The Web, by comparison, is very recent, but has rapidly become a competing communications channel. The convergence of telecommunications and the Web is now bringing the benefits of Web technology to the telephone, enabling Web developers to create applications that can be accessed via any telephone, and allowing people to interact with these applications via speech and telephone keypads; that is, via voice browsers. See Appendix 1 for a brief discussion of voice browsers, including a toll-free telephone number where you can try one out for yourself.
Economic benefits are driving the current high level of interest in the use of voice authentication to secure these converged applications. This technology is saving companies a good deal of money when, for example, call center agents don't have to handle PIN reset requests or ask customers questions to confirm their identities. Password resets are the second most common reason workers call help desks, accounting for about one in four help desk requests, according to the Gartner Group, an IT research company. At an average cost of US$22 per call, according to Gartner, that adds up fast, especially for large-scale enterprises and midsize organizations.
Now, applications can use the user's voice itself for authentication. The problem is choosing the level of security that best fits your needs. This article will examine many of the performance and cost vs. level-of-security issues that you should consider before making this choice. And, as with any kind of application, this choice should be made after executive-level decision makers are polled on the worth to the organization of the information to be exposed by the application.
First, I'll look at how voice authentication compares to traditional methods as well as to other biometric methods of authentication. Then, with many of the shortcomings of voice authentication in full view, I'll outline some of the rather compelling arguments behind its widespread adoption.
I'll conclude with a brief discussion of three offshoots of basic speech authentication:
- How Nuance's Voice Platform, a complete VoiceXML enterprise speech solution, provides real-time and other reports for the determination of system performance, ROI, and so forth.
- How the RSA division of EMC, a pioneering and still solid purveyor of security technology, is using voice authentication, in combination with other security measures, for the mitigation of risk.
- How some investigators are combining speech authentication, business process management (BPM) and Service Oriented Architecture (SOA).
When passwords are guessed or stolen, you usually aren't aware of the loss at the time. On the other hand, with smart cards and the like, the likelihood of a scam is reduced. To break into an account, someone would have to steal your smart card, but then you'd be more likely to know of the loss in time to do something about it.
Smart cards (plastic cards that carry password and identity data for digital authentication) and biometrics (identification of someone's voice, fingerprint, iris, or facial features) have both taken off. Even so, most people still rely on passwords to gain access to bank accounts and computer databases.
Biometrics have an advantage over passwords and tokens in that they can't be forgotten, although they too can be lost. (People can lose fingers in an accident, or temporarily lose their voices due to illness.) But biometrics can't be changed. By contrast, if someone loses a key or an access code, it's easy to change the lock or combination and regain security.
So, if someone steals your biometric—perhaps by surreptitiously recording your voice or copying the database with your electronic iris scan—you could be stuck. Your iris is your iris, period. The problem is, while a biometric might be a unique identifier, it is not a secret. You leave a fingerprint on everything you touch, and someone can easily photograph your face.
Among these options, each with its own obvious shortcomings, there are some very compelling arguments that favor voice authentication for certain applications.
In large-scale, mainstream consumer applications such as banking, brokerage, or telecommunication services, having an authentication method that can be used by all customers from their home, office or car is a critical requirement. One of the main problems with finger, iris, or facial biometrics is the inconvenience and significant investment required for scanners and other hardware devices.
Because voice authentication is completed over a telephone with no additional hardware or software required for the end user, it is the only biometric that can be implemented today for an entire customer base. Voice authentication can, of course, also be established by a user and his or her speech-enabled PC or handheld device.
Voice authentication is based on an analysis of the vibrations created in the human vocal tract. The shape of a person's vocal tract determines the timbre and resonance of the voice, and everyone's vocal tract is fairly distinct in shape and size. Thus, just as a B flat sounds different on a French horn than on a piano, different vocal tracts produce recognizably different sounds.
Critics of voice authentication point out that identical twins may pass for each other, but in most cases, impostors fail. The most variable factor in many voice authentication systems is the quality of the microphone and phone line.
According to some university-based studies, a voice authentication solution can be set up to offer security with a false accept rate (in other words, the rate at which impostors are able to break into a system) of less than 0.1%, even when the impostor has the correct password information. In a conventional password system, given that an impostor has the correct password, the impostor has a 100% chance of breaking into the system. Voice authentication, therefore, offers around 1000 times more security than a conventional password system ... in this particular comparison.
Although now a few years out of date, a fairly rigorous scientific evaluation of speaker verification technologies, conducted on behalf of the Australian Government, is available at Reference 2.
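The "around 1000 times" figure is simply the ratio of the two break-in probabilities quoted above. The short Python sketch below checks that arithmetic. The variable and function names are illustrative only, and 0.1% is the upper bound cited from the studies, not a value measured here.

```python
# Chance that an impostor who already knows the password gets in.
password_only_far = 1.0   # conventional password system: 100%
voice_auth_far = 0.001    # voice authentication: 0.1% upper bound cited above


def security_ratio(baseline_far: float, improved_far: float) -> float:
    """Return how many times less likely a break-in is under the improved system."""
    return baseline_far / improved_far


ratio = security_ratio(password_only_far, voice_auth_far)
print(f"A break-in is about {ratio:.0f} times less likely")
```

Because 0.1% is an upper bound, the ratio is a lower bound for this one scenario only; it says nothing about attacks that don't start from a known password.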
During a brief enrollment process, the typical voice-authentication package might require that callers speak their ID and password. Speaker verification systems capture and analyze the speech to create a voiceprint that is stored in the system database. Voiceprints are not audio samples, but a matrix of numbers that measures behavioral characteristics of the way the person speaks, as well as physical characteristics of the person's vocal tract. During verification, the caller speaks a password, which is then compared and scored against the voiceprint database.
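To make the enroll-then-verify flow concrete, here is a minimal Python sketch. It is not how any commercial product works: the feature vectors, the averaging step, the cosine-similarity score, the 0.95 threshold, and the VoiceprintStore class are all assumptions chosen for illustration. A real system would derive its numbers from audio and use a far richer model.

```python
import math
from typing import Dict, List

Voiceprint = List[float]  # hypothetical: a flat list of numeric speech features


def average_features(samples: List[Voiceprint]) -> Voiceprint:
    """Enrollment step: average several feature vectors into one voiceprint."""
    count = len(samples)
    return [sum(column) / count for column in zip(*samples)]


def cosine_similarity(a: Voiceprint, b: Voiceprint) -> float:
    """Score two feature vectors; 1.0 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class VoiceprintStore:
    """Toy stand-in for the voiceprint database described above."""

    def __init__(self, threshold: float = 0.95) -> None:
        self.threshold = threshold  # assumed acceptance threshold
        self._prints: Dict[str, Voiceprint] = {}

    def enroll(self, user_id: str, samples: List[Voiceprint]) -> None:
        # Only the numeric voiceprint is stored, never the audio itself.
        self._prints[user_id] = average_features(samples)

    def verify(self, user_id: str, features: Voiceprint) -> bool:
        stored = self._prints.get(user_id)
        if stored is None:
            return False  # unknown caller: reject
        return cosine_similarity(stored, features) >= self.threshold


store = VoiceprintStore()
store.enroll("alice", [[0.9, 0.1, 0.4], [1.0, 0.2, 0.5]])
accepted = store.verify("alice", [0.95, 0.15, 0.45])  # close to enrolled voice
rejected = store.verify("alice", [0.1, 0.9, 0.2])     # very different voice
```

Raising the threshold lowers the false accept rate at the cost of rejecting more genuine callers, which is the security versus convenience trade-off this article keeps returning to.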
Nuance speaker verification products, regarded by many industry analysts as the best, perform well even with transient voice changes (caused by colds, different emotional states, or background noise) and deliver consistent performance over wireline, wireless, or VoIP channels. And, these products also offer safeguards against recordings played over the telephone, or impressionists mimicking a user's voice.
Nuance's SpeechSecure can be used to deliver personalized service by verifying a caller based on an initial greeting. For example, if a caller says, "Hello, it's me," the system might identify the caller, greet her by name, and then deliver customized or preferred services without further prompting.
SpeechSecure can also use secret pass phrases to deliver security. Pass phrases are user-chosen phrases, in any language, that the caller must repeat to gain access to information. In this way, SpeechSecure doesn't simply calculate voice similarity; it also ensures that the caller knows what to say.
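The idea of combining "what you say" with "how you sound" can be sketched in a few lines of Python. This is not the SpeechSecure API. The function names, the 0.95 threshold, and the inputs are hypothetical: spoken_phrase is assumed to be the text returned by a speech recognizer, and voice_score is assumed to be a similarity score from a separate speaker verification step.

```python
import hmac


def normalize(phrase: str) -> str:
    """Lower-case the phrase and collapse whitespace before comparing."""
    return " ".join(phrase.casefold().split())


def verify_caller(spoken_phrase: str,
                  enrolled_phrase: str,
                  voice_score: float,
                  threshold: float = 0.95) -> bool:
    """Accept only if the caller knows the phrase AND the voice matches."""
    knows_phrase = hmac.compare_digest(
        normalize(spoken_phrase).encode("utf-8"),
        normalize(enrolled_phrase).encode("utf-8"),
    )
    sounds_right = voice_score >= threshold
    return knows_phrase and sounds_right


# Right phrase, matching voice: accepted.
ok = verify_caller("Hello,  it's me", "hello, it's me", voice_score=0.97)
# Right phrase, but the voice score is too low: rejected.
wrong_voice = verify_caller("hello, it's me", "hello, it's me", voice_score=0.60)
# Matching voice, but the wrong phrase: rejected.
wrong_phrase = verify_caller("open sesame", "hello, it's me", voice_score=0.99)
```

Requiring both checks means that a recording of the right voice saying the wrong words fails, and so does a stranger who has merely overheard the pass phrase.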