Making Better Voice Portal Apps, Part 2

Develop a usable and accurate carrier-grade VoiceXML application.

By Chris Bajorek


01/07/2002, 10:04 AM ET

Last month we asked the question, "How easy is it to create good VoiceXML applications?" I enlisted the help of Todd Elvins, Ph.D., co-founder of Indicast (www.indicast.com), a company that provides private-label voice portal services. His company writes lots of VoiceXML programs and is uniquely qualified to dispense this kind of help. Todd has graciously passed on his "top ten" list of VoiceXML app-dev design hints. Let's continue with that list.

Tune your grammars. Applications written to the VoiceXML standard can simply list the words and phrases (the "grammar") that should be recognized as voice commands at any given recognition event. While this achieves a base level of ASR performance, you need to tune those grammars to get the highest accuracy. This involves (1) including multisyllabic commands and phrases and their synonyms, (2) specifying alternate pronunciations of commands, (3) choosing voice commands that sound as dissimilar as possible, and are thus easier for the recognizer to distinguish, (4) adding command utterances that include extraneous words like "um," "please," "eh," etc., and (5) assigning a probability to each word or phrase.
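As a rough illustration of points (1), (4), and (5), the Python sketch below expands a small command set into every filler-word variant and gives each utterance a probability. The commands, filler words, and weights are invented for the example, and the output is a plain list of rows rather than any particular vendor's grammar format.

```python
from itertools import product

# Hypothetical command set: each command maps synonym phrases to relative weights.
COMMANDS = {
    "weather": {"weather forecast": 3.0, "weather report": 1.0},
    "stocks": {"stock quotes": 2.0, "market update": 2.0},
}
# Extraneous words a caller might add; "" means the filler is absent.
FILLERS_BEFORE = ["", "um"]
FILLERS_AFTER = ["", "please"]


def expand_grammar(commands, before, after):
    """Return (utterance, command, probability) rows for every filler variant."""
    total = sum(w for phrases in commands.values() for w in phrases.values())
    variants = len(before) * len(after)
    rows = []
    for command, phrases in commands.items():
        for phrase, weight in phrases.items():
            for pre, post in product(before, after):
                utterance = " ".join(part for part in (pre, phrase, post) if part)
                # Split the phrase's share of the total evenly across its variants.
                rows.append((utterance, command, weight / total / variants))
    return rows


rows = expand_grammar(COMMANDS, FILLERS_BEFORE, FILLERS_AFTER)
for utterance, command, probability in rows:
    print(f"{probability:.4f}  {command:8s}  {utterance}")
```

Splitting each phrase's weight evenly across its filler variants is a simplification; in practice the weights would more likely come from how often callers actually say each variant.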

Tune the ASR engine using live utterances. This step takes effort, but can improve recognition accuracy. Start by collecting thousands of live caller command utterances from several hundred individual speakers. Transcribe them so the actual words spoken are known for each command sample. Now you have samples gathered from a variety of callers, under a variety of call and noise environments.

Play these samples through the recognizer, using the transcriptions to identify the command actually spoken. Use the tools your ASR vendor provides to adjust the ASR parameters. The goal is to minimize both false negatives (valid commands the recognizer rejects) and false positives (utterances it accepts as the wrong command). Repeat this process until accuracy stops improving. This step may also reveal problems with the words and phrases in the grammar. Todd suggests collecting and transcribing a fresh set of command utterances whenever your application changes. You may need help from the ASR vendor the first time you do this.
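The scoring loop at the heart of this process can be sketched in a few lines of Python. The sketch assumes the recognizer returns a hypothesis and a confidence score for each recorded sample, and that the only parameter being tuned is a single confidence threshold; real engines expose vendor-specific parameters, and the sample data here is made up.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    transcript: str    # what the caller actually said, from human transcription
    hypothesis: str    # what the recognizer returned for the recording
    confidence: float  # recognizer confidence for that hypothesis


def score(samples, threshold):
    """Count (false_accepts, false_rejects) at one confidence threshold."""
    false_accepts = false_rejects = 0
    for s in samples:
        accepted = s.confidence >= threshold
        correct = s.hypothesis == s.transcript
        if accepted and not correct:
            false_accepts += 1   # false positive: a wrong result is acted on
        elif not accepted and correct:
            false_rejects += 1   # false negative: a right result is discarded
    return false_accepts, false_rejects


def best_threshold(samples, candidates):
    """Pick the candidate with the fewest total errors (lowest value on ties)."""
    return min(candidates, key=lambda t: (sum(score(samples, t)), t))


SAMPLES = [
    Sample("weather", "weather", 0.92),
    Sample("stocks", "stocks", 0.55),
    Sample("weather", "sports", 0.60),
    Sample("sports", "sports", 0.81),
    Sample("stocks", "sports", 0.35),
]

for t in (0.3, 0.5, 0.7, 0.9):
    print(t, score(SAMPLES, t))
print("best:", best_threshold(SAMPLES, (0.3, 0.5, 0.7, 0.9)))
```

Weighting false accepts and false rejects equally is a design choice made for brevity. An application where acting on the wrong command is costly would penalize false accepts more heavily.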

Build a VoiceXML generator. You may want to write your first application in static VoiceXML for debugging and usability testing. A next step might be making database queries from within static VoiceXML code, yielding a more dynamic application that branches based on database content. However, a program that actually generates VoiceXML code just-in-time may be the smartest course. A VoiceXML generator can be modeled after an HTML-generating middleware package, with some of the same modules "tweaked" for this new purpose. Todd suggests the Apache web server with the appropriate plug-in modules as a good starting point.
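To make the idea concrete, here is a minimal Python sketch of a generator that builds a VoiceXML 2.0 menu from data at request time. The `rows` list stands in for a database query result, and the prompt text and target URLs are hypothetical; a production generator would sit behind the web server and add error handling, grammars, and reprompts.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # VoiceXML 2.0 namespace


def build_menu(prompt_text, choices):
    """Return a VoiceXML document with one <menu> built from (phrase, url) pairs."""
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    menu = ET.SubElement(vxml, "menu", {"id": "main"})
    prompt = ET.SubElement(menu, "prompt")
    prompt.text = prompt_text
    for phrase, target in choices:
        # Each <choice> pairs a spoken phrase with the document to fetch next.
        choice = ET.SubElement(menu, "choice", {"next": target})
        choice.text = phrase
    return ET.tostring(vxml, encoding="unicode")


# Stand-in for rows returned by a database query (hypothetical data).
rows = [
    ("weather forecast", "http://example.com/vxml/weather"),
    ("stock quotes", "http://example.com/vxml/stocks"),
]

document = build_menu("Say weather forecast or stock quotes.", rows)
print(document)
```

Building the document through an XML library rather than string concatenation means special characters in database content are escaped automatically.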

Buy a VoiceXML interpreter and platform. There is no need to write your own VoiceXML interpreter today; multiple vendors sell relatively mature products in this area. When selecting a vendor, carefully evaluate the interpreter and platform's scalability and compliance with the VoiceXML 2.0 specification. Depending on your app, you may want to install and maintain your VoiceXML platform at a carrier-grade co-location facility. Be sure to include failover procedures for your VoiceXML generator, database, and VoiceXML platform. And if you want to run your VoiceXML application on a third-party host, a number of companies are in that business (see www.voicexml.org).

Verify alternate pronunciations from your grammar compiler. The ASR grammar compiler is a powerful tool and should add alternate pronunciations to the words in the grammar being compiled. Check its output to confirm that it has done so. These pronunciations are sometimes kept in a dynamic grammar database.

Develop a rigorous test suite. You may want to consider getting a test platform such as the Empirix Hammer (www.empirix.com), a scriptable call generator that can inject large numbers of calls into the voice platform. The Hammer can be programmed with an exhaustive functional test that exercises virtually any path through a voice application. Run this test suite whenever application code changes.
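One way to derive an exhaustive functional test is to model the application as a table of dialog states and enumerate every route through it. The Python sketch below does that for a made-up three-level dialog. It does not use the Hammer's own scripting interface; it only produces the list of utterance sequences that a call generator would then be scripted to play, and it assumes the dialog has no loops.

```python
# Hypothetical dialog map: state -> {caller utterance: next state}.
# A state with no transitions ends the call.
DIALOG = {
    "main": {"weather": "weather_city", "stocks": "stocks_symbol"},
    "weather_city": {"san diego": "end", "boston": "end"},
    "stocks_symbol": {"acme": "end"},
    "end": {},
}


def enumerate_paths(dialog, state="main", spoken=()):
    """Yield each utterance sequence that leads from `state` to a terminal state."""
    transitions = dialog[state]
    if not transitions:
        yield list(spoken)
        return
    for utterance, next_state in transitions.items():
        yield from enumerate_paths(dialog, next_state, spoken + (utterance,))


paths = list(enumerate_paths(DIALOG))
for number, path in enumerate(paths, start=1):
    print(f"test call {number}: " + " -> ".join(path))
```

Real applications loop back on reprompts and "main menu" commands, so a practical version would cap the path depth or track visited states.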

Developing a usable and accurate carrier-grade VoiceXML application takes considerable time and effort. After following Todd's steps, your application should at least be ready for "friendly" users. Use their feedback to tune it further.

Chris Bajorek is co-founder of CT Labs, an independent full-service converged communications and IP Telephony product testing and certification lab. Chris can be reached at cbajorek@ct-labs.com.


ICMI - Making Better Voice Portal Apps, Part 2