Avoiding the Speech Rec. Wreck
Speech recognition nightmares abound. It's not uncommon for implementations to break budgets, due to the high cost of professional services and licensed software applications associated with them. In fact, according to Art Schoeller, senior analyst at Yankee Group, speech applications cost about three to four times more to develop than touch tone versions. However, that didn't stop Hewlett-Packard, which, thanks to speech recognition, is cruising comfortably with improved customer satisfaction and fewer misrouted calls.
For HP, managing about one million pre- and postsales support calls every month for more than 2,000 consumer and commercial products proved logistically and financially daunting. The elaborate Nortel Networks Periphonics IVR system required a maze of navigational options. "Our menus became so complicated, because we have so many products," says Solveig Smith, internal consultant for consumer support delivery. Customers complained that if they selected an incorrect option during a call, they weren't sure how to back out of it. That uncertainty encouraged them to zero out of the system to reach a tech support agent, who would then reroute them to the most appropriate support queue. Spurred by its complex call flow, HP's main technical support number found its voice in April 2002, when the company implemented speech recognition functionality, incorporating Nuance's speech engine with Nortel's Periphonics platform.
HP, as a result, has decreased the time customers spend in its IVR system by an average of 33 percent, or 20 seconds to 40 seconds per call. Smith notes that HP can also handle potential misroutes more cost-effectively. Within three months of deploying speech recognition, customer satisfaction for time of access increased by 4 percent, customer satisfaction for ease of access increased by 3 percent, and the average estimated misroute run-rate percentage for select IVR applications dropped two percentage points to 4 percent. Smith adds that the speech system keeps misrouted customers from sitting on hold only to find that they have to be transferred.
HP is not the only organization to reap the benefits of calling on speech. Companies like Amtrak, Charles Schwab, and United Airlines are attracted to speech recognition's ability to offload more calls and in turn decrease costs while enhancing the level of service delivered. Most companies should expect at least a 10 percent to 25 percent increase in automation when moving from touch tone to speech, according to Azita Martin, vice president of marketing at TuVox. That does not mean that touch tone or dual tone multifrequency (DTMF) applications have lost their relevance within the call center; they allow customers to respond to prompts via their phone keypads, which can be more accurate than spoken input. These solutions work best for resolving basic inquiries, such as checking account balances. Like speech, however, DTMF has its weaknesses. What follows are the pros and cons of both.
Touch Tone Traction
Customers are accustomed to the standard press-one-for-service, press-two-for-sales DTMF prompts. That familiarity level bodes well for American Savings Bank (ASB), Hawaii's third-largest financial institution and a subsidiary of Hawaiian Electric Industries, which turned to Intervoice for DTMF in 2002 and speech in 2004. "We've had touch tone for more than 20 years," says Renee Lum, assistant vice president and manager of the customer service center at ASB. "We have about 300,000 to 350,000 calls per month in the [touch tone] application. It's comfortable for our customers, because we had it way before online banking and way before speech."
Part of DTMF's strength is its ease of use in number-heavy tasks, such as entering account information. For example, the benefits unit of Ceridian, a provider of HR management outsourcing solutions, currently uses only touch tone IVR applications. Ceridian grew via acquisition and has several disparate IVR platforms as a result. It is in the process of standardizing on version 2.2 of Interactive Intelligence's Customer Interaction Center (CIC) platform. The company is interested in deploying speech recognition, but Chris Foley, voice network analyst at Ceridian, says that touch tone's numeric nature is essential to its business. "Almost everything is numeric on entry, so the ability to speak things like letters and names isn't quite as pressing."
Numeric entry also makes touch tone easier to manage. "Setting up the touch tone menu is a lot easier than defining the grammars and dialogues for a speech application," Schoeller says. Additionally, DTMF is suitable for a clear-cut routing menu with two or three choices, according to Sheila McGee-Smith, president and principal analyst at McGee-Smith Analytics.
That, however, can also be a disadvantage to deploying DTMF solutions. The fewer the menu options, the better touch tone works, but that limits what customers can do without live CSR help. Organizations that try to extend the number of menu prompts beyond a handful will create such a long and confusing call flow that customers may press the number on their keypad that contact center managers dread: zero. Daniel Hong, senior voice business analyst at Datamonitor, notes that DTMF is only as accurate as the user. "A lot of times people screw up keying in the numbers. That causes frustration and [they] just press zero."
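To make that trade-off concrete, here is a minimal Python sketch of a touch tone menu. The menu tree, queue names, and the route_call function are all hypothetical, not drawn from any vendor's platform, and the sketch simplifies by sending a mis-keyed digit straight to an agent where a real system would likely reprompt. It illustrates how every added menu level is one more keypress where a caller can go wrong or give up and press zero.

```python
# Hypothetical touch tone (DTMF) menu for illustration only.
# Nested dicts are submenus; strings are the queues a call ends up in.
MENU = {
    "1": "service",
    "2": "sales",
    "3": {
        "1": "printer_support",
        "2": "laptop_support",
    },
}

AGENT_QUEUE = "live_agent"


def route_call(keys: str, menu: dict = MENU) -> str:
    """Walk the menu one keypress at a time and return the destination queue."""
    node = menu
    for key in keys:
        if key == "0":
            return AGENT_QUEUE      # caller zeroes out to a live agent
        if key not in node:
            return AGENT_QUEUE      # mis-keyed digit (simplified: no reprompt)
        node = node[key]
        if isinstance(node, str):
            return node             # reached a queue
    return AGENT_QUEUE              # input ended before a queue was reached
```

For example, route_call("32") reaches the hypothetical laptop_support queue in two keypresses, while route_call("30") ends with a live agent.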
Passing the Baton
Where touch tone applications fall short, speech recognition solutions pick up. Central to the success of speech recognition, however, is pinpointing which applications would benefit most from speech. Tasks like obtaining stock quotes or making address changes are difficult with touch tone applications, but easier with speech recognition. "When customers are in situations where they have to give something other than numeric information, they have to speak to a human. That is a drawback," says Jim Mitchell, manager of voice communications services at Ceridian, noting that it adds to call length and expense. "If your touch tone IVR applications are working fine, and you're getting a good proportion of customers willing to stay in self-service, there's no reason necessarily to change it," McGee-Smith says. "The thing to do with speech recognition is think about the applications that you thought about using touch tone for, figured out it would be too complex, and walked away from." If the IVR menu would be long or confusing, speech is probably the way to go.
With enhanced vocabulary recognition and improved natural language capabilities, speech recognition systems open up what customers can do, as well as what answers the systems can give. Directed dialogue steers the customer by asking questions such as "What is your order number?" The better the prompt is written, the better the customer response. Natural language dialogue, however, allows for unstructured responses, such as "I want to book a one-way flight from New York City on November 24 to Atlanta."
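The difference between the two dialogue styles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it starts from text that a speech engine has already transcribed, and it uses simple regular expressions where production systems rely on grammars and statistical language models. The function names and slot patterns are hypothetical.

```python
import re


def directed_order_number(reply: str):
    """Directed dialogue: one question ("What is your order number?"),
    one narrowly constrained answer. Returns None to signal a reprompt."""
    digits = reply.replace(" ", "")
    return digits if re.fullmatch(r"\d+", digits) else None


# Hypothetical slot patterns for a flight-booking request.
SLOT_PATTERNS = {
    "trip": r"\b(one-way|round-trip)\b",
    "origin": r"\bfrom ([A-Z][A-Za-z ]+?) (?:on|to)\b",
    "date": r"\bon ([A-Z][a-z]+ \d{1,2})\b",
    "destination": r"\bto ([A-Z][A-Za-z ]+?)[.!?]?$",
}


def natural_language_slots(utterance: str) -> dict:
    """Natural language dialogue: pull several slots out of one free-form sentence."""
    slots = {}
    for name, pattern in SLOT_PATTERNS.items():
        match = re.search(pattern, utterance)
        if match:
            slots[name] = match.group(1)
    return slots
```

The directed function accepts only a string of digits and otherwise asks again, which is why prompt wording matters so much. The natural language function has to find the trip type, origin, date, and destination in whatever order the caller happens to say them, which is where most of the added design and tuning effort goes.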
Although mostly attractive for their automation capabilities, speech recognition systems can also boost the customer experience. "Things like personalization and multichannel integration will be very important," says George Platt, senior vice president and general manager of Enterprise Business Unit at Intervoice. For example, ASB's Lum says the financial institution can brand the speech application much better than it would be able to brand the touch tone application. "We are using terms like aloha and mahalo," she says. "We are also able to put music tones into the application."
Speak (Not So) Easy
Speech recognition's disadvantages include issues with background noise and systems' abilities to recognize accents and colloquialisms, although applications are getting better. Speech recognition also requires much more work on the voice user interface (VUI) than would a touch tone system's menu, says Ken Waln, CTO for Edify. "People aren't as used to them and you need to handle cases where the system can't quite recognize what someone says, or [someone] says something that the system wasn't designed to recognize. The VUI design becomes a really critical point."
According to Marie Jackson, vice president of marketing at Edify, several elements make up a reliable speech solution, including a strong VUI, the workflow and business logic, and back end integration. "You [also] need the opportunity to do testing and fine tuning at the back end," she says. "It's probably one of the most important things."
Speech requires more effort, but the potential returns are huge, according to Steve Rutledge, vice president of product marketing at Genesys Telecommunications Laboratories. "The cost and effort will come down over time. Tools will improve, there will be more standard modules that can be reused, and therefore that equation of value over cost will improve," he says.
To date, the cost of speech recognition applications has slowed their penetration into contact centers. The hefty outlay of implementing these applications, stemming largely from software licenses and professional services support, makes them unaffordable for many.
Hosting, however, may help lower costs and mitigate risk, whether for DTMF or speech recognition. "Speech has been working successfully," Schoeller says. "But there's always the lingering doubt. Outsourcing the speech user interface expertise to a hosting vendor is attractive for a number of companies." He cites hosted players including BeVocal, Convergys, MCI, and Tellme Networks.
Speech's Changing Landscape
Along with pricing woes, lengthy deployment times and concerns over how to maintain and update speech systems have also put some companies on the fence. But as open standards and packaged applications evolve, the number of deployments may rise; Gartner reports that shipments in the speech-recognition telephony software market grew 17.4 percent in 2004. Aspect Software, Avaya, Cisco, Edify, Genesys, IBM, Interactive Intelligence, Intervoice, Microsoft, Nuance, Syntellect, TuVox, and VoiceGenie are just a handful of the vendors vying for their share of the voice self-service market.
Open standards such as SALT and VoiceXML are releasing organizations from the traps of technology silos and allowing them to customize. "It provides lower costs and investment protection," Hong says. Taking the custom-build route, though, is best suited for less complex applications, according to Enabling IVR Self-Service with Speech Recognition, coauthored by Jon Anton, Ph.D., director of research at BenchmarkPortal, and Paul Kowal, founder and president of Kowal Associates. Packaged applications are limited in their customization capabilities, but are equipped with prebuilt modules, which also makes speech more affordable.
"The speech industry is really changing," says TuVox's Martin. "In the future you're going to see more robust speech applications that come with a set of tools that allow companies to update [the system] more frequently using their own IT resources." Industry pundits expect deployments to grow as vendors work to get end-user companies and their customers to bite. To further increase adoption, Mark Kowal, manager of product marketing at Interactive Intelligence, suggests that educating end-user companies is a vital element of the equation. "They don't realize how to evaluate their business and find the real keys to make speech work for them," he says. "If we can help educate the customer on how they need to look at their business and where to get the value," the speech recognition market's voice will get stronger.
Contact Associate Editor Coreen Bailor at cbailor@destinationCRM.com