Question: I'm responsible for researching the conversion of a touchtone-based IVR system to speech recognition. The application involves a fairly complex menu structure and currently requires our customer callers to enter dates, inventory and invoice numbers, locations, and more. We have been told that speech recognition really works now and would greatly increase caller satisfaction. We'd like to believe this, but we're a bit nervous about the downsides of such a change, given the costs as well as the potential caller backlash. Any suggestions or caveats in this evaluation process would be appreciated.
- From William S. via email
Dr. C: William, you have chosen the right time to consider this change. Solid advances in speech recognition technology in the last two years make speech-rec-enabled IVR very attractive now. For one, the core recognition engines are now phonetic grammar-based, which gives you an effectively unlimited vocabulary in your application. With phonetic grammars, you no longer have to go through the laborious and expensive process of training the IVR system with thousands of live caller samples just to generate recognition vocabularies. Now you simply create a text segment that lists all the possible words that can be recognized at each menu point in the application. The recognition engines build grammars from those text words and compare what is spoken against them. Very cool.
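To make the "text list per menu point" idea concrete, here is a minimal Python sketch. It is an illustration only, not how a recognition engine works internally: a real engine compares audio against phonetic models, while this toy compares already-transcribed text. The menu names, phrases, and the `match_utterance` helper are all hypothetical.

```python
# Hypothetical vocabularies: each menu point lists the phrases it will accept.
MENU_GRAMMARS = {
    "main_menu": ["check invoice", "check inventory", "store locations"],
    "confirm": ["yes", "no"],
}


def match_utterance(menu_point, recognized_text):
    """Return the grammar phrase that matches the text, or None if out of grammar."""
    # Normalize case and whitespace so "  Check   Invoice " matches "check invoice".
    normalized = " ".join(recognized_text.lower().split())
    for phrase in MENU_GRAMMARS.get(menu_point, []):
        if normalized == phrase:
            return phrase
    return None  # Nothing in this menu point's list matched.
```

The point of the sketch is that adding a new recognizable phrase means editing a text list rather than collecting new caller recordings.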
The other significant technological improvement is that standardized application tools now make the development process much faster. Specifically, VoiceXML has become a very popular platform-independent language. Developing VoiceXML-based applications is faster and you get the added benefit that your application is portable to any standards-based IVR server, speech recognition engine, or text-to-speech engine.
So if now is a good time to convert your IVR to use speech recognition, how should you choose your vendors? First, experience matters. The more successful deployments a vendor has under its belt, the better the chance that you will get an accurate estimate of time and costs. So, you need to get permission to talk directly with several customers who have gone through that process. I would ask for a few references whose projects were completed in the last 30 days, and a few whose projects were completed more than 6 months ago, to see how well the vendor has been supporting, updating, and tuning the system. Ask these references:
- How long did the various phases of the project take?
- Were the initial cost and time estimates met, and if not, why?
- Did the project include an analysis of the best way to implement a voice-controlled user interface (VUI) before any actual coding was started?
These questions can give you better insight into how well the vendor understands and is able to execute the process. However, you must also know what it takes to test your "finished" SR-based solution before it goes live. Without this, you run a real risk of launching a new service that falls flat on its face, with worse performance and call-handling statistics than your old touchtone-based system. Such failures are a shame given the incredible potential of SR.
In the coming months this column will explore what it takes to effectively test and verify SR-based applications, broken down into the following sections:
- Real-world performance issues. Here we will discuss the demands of diverse caller demographics (accents), caller devices (cell phones, desk phones, etc.), noise environments, multi-line call loads, barge-in versus non-barge-in, and IVR platform resources.
- SR performance metrics. These are the metrics, such as recognition accuracy rate, that should be directly or indirectly derived from an effective SR-based system performance test.
- Developing effective tests for SR-based applications. This section will discuss the various ways to architect a test that analyzes the performance of the system before it goes live, including things you can do to get the most from your precious testing dollar.
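As a preview of the kind of metric mentioned above, here is a minimal Python sketch of one common way to compute a recognition accuracy rate from test results. It assumes each test call is logged as a pair of what the caller was scripted to say and what the system reported, with `None` standing for a rejection. The function name and data layout are hypothetical, and a real test plan may define accuracy differently (for example, by counting rejections separately).

```python
def recognition_accuracy(results):
    """Return the fraction of test utterances recognized correctly.

    results: list of (expected, recognized) pairs; recognized is None
    when the system rejected the utterance.
    """
    if not results:
        raise ValueError("no test results to score")
    # An utterance counts as correct only when the output equals the script.
    correct = sum(1 for expected, recognized in results if recognized == expected)
    return correct / len(results)


# Usage: four scripted test calls, two recognized correctly.
sample = [("yes", "yes"), ("no", "no"), ("yes", "no"), ("no", None)]
print(recognition_accuracy(sample))  # 2 of 4 correct
```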
Chris Bajorek is co-founder of CT Labs, an independent full-service converged communications and IP Telephony product testing and certification lab. He can be reached at cbajorek@ct-labs.com.