The Voices of Reason
Imagine being able to Google your customers' phone requests for information or the recorded files of their complaint calls, or being able to decipher when a customer interaction in one of your stores went awry. If you could query voice records the same way you do textual ones, you'd open up boundless areas of opportunity. Web surfers can already search audio files and audio/video feeds, but now enterprises can use this technology to help employees search voicemails or recorded calls for key words and phrases, and, in the end, to decode important customer concerns.
It would seem that the sky is the limit for this technology, which is why our colleagues at Speech Technology magazine recently conducted a roundtable discussion with some of the foremost thought leaders in the audio-search industry:
- Judith Markowitz, president of J. Markowitz Consultants and technology editor for Speech Technology magazine;
- Anna Convery, senior vice president of marketing and product management for Nexidia;
- Yochai Konig, cofounder and chief technology officer for Utopy;
- Larry Mark, chief technology officer at SER;
- Joe Watson, director of advanced technology for Witness Systems; and
- Daniel Ziv, vice president of customer interaction analytics and business interaction intelligence at Verint.
Their insights and collective experience reveal what is -- and isn't yet -- possible in the field of speech analytics and CRM. The roundtable here is adapted from the April 2007 issue of Speech Technology magazine.
Speech Technology magazine: Can you describe what the audio search and mining market looked like three to five years ago?
Convery: Speech analytics was a really wonderful idea--something that wasn't being applied in the commercial world. Most of us brought products to market several years ago. In the commercial marketplace, we really only have been bringing products to market in the past two to two-and-a-half years.
Markowitz: There were some companies that did have technology--commercial or semi-commercial--a number of years ago, such as AT&T, IBM, Lernout & Hauspie, and BBN. Those technologies were developed in the 1990s but weren't as commercially oriented as the technologies today. Essentially, the market is maybe three years old in terms of the commercial offerings that companies have that are really product-oriented rather than R&D and very early adopters of bleeding edge. This is no longer bleeding edge.
Ziv: We deployed our first commercial product in early 2003, but the market was very small and immature. The technologies that you are speaking of are really fundamental engines, but there were not speech analytics applications deployed in a commercial environment before three or four years ago.
ST: What happened in the last three to five years to further the market's growth and acceptance?
Konig: What has happened is the expanding of the scope of the business value that speech analytics can bring. The initial business value was more on the agent-quality-monitoring type of application. In the last few years, we have seen an expansion to other business benefits--more specifically, to the business intelligence side. Basically, companies [are now] able to get insight about different business processes that are not necessarily about the agent, but about the customer: what the customers' issues are, why they're calling, how the company can make them more happy, how the company can sell to them more effectively, and so forth. The expansion of the business value is fueling the growth of this market.
Convery: What you see is a maturity of how people look at speech analytics. Instead of it being a very efficient tool to go and look at what your agents are doing, now what we see is a very significant business case for development looking at strategic initiatives for organizations, getting that intelligence to the organizations in an efficient manner, and then having them act upon it. If you were to look at some of the business cases that were developed two or three years ago and compare them to today, you would see quite a dramatic difference, both in terms of sophistication and the return on investment.
Ziv: It's really about having more access to information, and the business intelligence market is part of that, but it mostly focuses on structured data versus unstructured data. Unstructured data is a very new market. The amount of unstructured data is much larger than the amount of structured data and the potential value there is much greater because it has much deeper insights. It is actual customers talking and telling you what they want rather than you just knowing how old they are and where they live. The value is there, and the potential is taking some of the tools, processes, and consulting around what has been done in the business intelligence community and applying it to this unstructured data and linking it to the CRM world and the business intelligence world. The area of greatest growth is in people realizing that this information exists, that it is available, that you can mine it, and that you can extract information from it. In 2010, this market is probably going to be much larger, but it is going to be combined with a lot of other things that are part of CRM initiatives, business intelligence programs, and enterprisewide information systems. These are multibillion-dollar markets that are ready, but just aren't using this information today.
Mark: In parallel to these applications moving up the food chain [going from just agent monitoring or quality assurance to more of a business intelligence application providing greater value], the technologies--both the underlying computers and the speed at which the engines process--have all improved. At the same time that we are providing more value, we are providing it, at least from a capital investment perspective, at a much lower price point because there is a much lower capital investment needed to get value out of systems than there was even three years ago.
Markowitz: Many companies have combined this with things that are already in the environment of the customers. They can see it as an extension of what they are already doing and it is easier to understand.
Watson: It gives us another portal into customer insight, into what the customer is really talking about. It is not the only portal, but it gives an additional feed of information so that businesses can now understand what a customer is saying and how he is truly reacting.
ST: What counterforces have been--and still are--stalling the growth of the audio search and mining industry?
Ziv: Like any new technology, there are a lot of misconceptions about what it is. Some people see this as part of or similar to speech IVR and compare the advantages and disadvantages of that.
It is a very different application, requires very different technology, and requires different types of services around it. While it does draw from the same core basic technology as speech recognition, it is very different in terms of its application in the market. That has caused some confusion and potential delays in deploying this because the company has already deployed speech recognition IVR and is drawing upon its decision from that.
The advantage for speech analytics is that speech recognition doesn't need to be as accurate as you would need for a speech IVR system because you have the advantage of statistical information. You have a lot of calls and a lot of words to look for, so even if you missed one word here or there--which you always do in speech recognition--it doesn't have the same effect as a customer talking to an IVR and getting upset because it didn't understand his last word.
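Ziv's statistical argument can be illustrated with a little arithmetic. The sketch below is not drawn from any vendor's product; it assumes, purely for illustration, that each spoken occurrence of a keyword is recognized independently with the same probability (the hypothetical `per_word_recall`), and asks how likely it is that at least one of several mentions is caught.

```python
def detection_probability(per_word_recall: float, mentions: int) -> float:
    """Chance that at least one of `mentions` utterances of a keyword is recognized.

    Assumes (hypothetically) that each utterance is recognized independently
    with probability `per_word_recall`.
    """
    if not 0.0 <= per_word_recall <= 1.0:
        raise ValueError("per_word_recall must be between 0 and 1")
    if mentions < 0:
        raise ValueError("mentions must be non-negative")
    # The keyword is missed entirely only if every single utterance is missed.
    return 1.0 - (1.0 - per_word_recall) ** mentions


if __name__ == "__main__":
    # Illustrative numbers only: a recognizer that catches 70% of individual words.
    for mentions in (1, 3, 10):
        p = detection_probability(0.7, mentions)
        print(f"{mentions} mention(s): {p:.3f}")
```

Under these assumed numbers, a single missed word matters much less once a topic comes up several times in a call or across many calls, which is the contrast with an IVR that must understand each utterance on the spot. Real recognition errors are not independent, so the figures are only indicative.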
Convery: [Our customers] expect speech analytics technology to generate very accurate results. They expect to be able to depend on it, especially when they look at the mission-critical applications where they are trying to find something that needs to be acted upon very quickly, like a breach in a compliance statement or customer identification information.
Ziv: I agree, but that is part of what is hindering growth. None of the applications that are out there today, and I have seen all of them, are 100 percent accurate. You may identify the word correctly, but if you don't understand the context of that word then you're missing the point. The perception is that we'll wait until the speech recognition technology is 100 percent accurate, and it's what potentially makes marketing and education a challenge for all of us because I don't see it being 100 percent accurate in the next couple of years.
Convery: I agree with that, but people expect to have a higher rate of accuracy. People are becoming more educated on what it can do, and a key point is that people who try to say that this can be 100 percent accurate are clearly misleading the marketplace. A high degree of accuracy is important and we have to deliver on that.
Markowitz: Something that has actually been a focus of this industry in terms of improving over the years is the issue of scalability. It was a problem initially, as with any other new, emerging technology, but it is going away now because there is more and more integration. Cost was more of a barrier than it is today, especially given the value that is coming through at the enterprise level. Also, a couple of the companies in the industry--and Nexidia is one of them--are offering managed services, which allows the market to extend down a little. Another area is quality reporting. The reporting that exists now is worlds above what it was before. Its usability in accessing the information was part of that issue.
ST: Where do audio search and mining technologies need to be pushed or prodded to evolve?
Konig: To identify relevant issues from a business point of view, we have to get into the world beyond the phrase and get into the context of the exchange between the customer and the agent. You have to be able to understand a sequence of phrases over a period of time. To get all of the information that the company wants and to get the meaning of it is very challenging from a technology point of view. That is the next frontier as all the vendors are evolving from getting a word or phrase here to getting the context of a specific issue, the specific dimension, the caller's specific exchange, and the order that it is all happening. This is a challenge from both an understanding and completion point of view. The technology is evolving to meet more and more sophisticated business needs.
Ziv: There is a lot more on the technology front, so I wouldn't call this technology mature in terms of that. It is mature in terms of the ability to deploy and get value from it today. As for the question of whether we've finished, whether we've done everything we can do with this information--we haven't even started. I draw upon the business intelligence market, where you spent years just cleansing data and pulling it together to be able to report on it. It became a billion-dollar market. I think the amount of data--the richness of the information, and what we can do with it--shows that, as we are currently applying data mining to the results of the speech analytics, we are taking a new technology and applying and combining it. There is a tremendous amount of technology innovation that we will see. The success lies with the application, the services, how it is deployed, where it fits in, and addressing concerns of customer privacy and the security of these very sensitive calls. At the end of the day, technology alone doesn't solve any problems; it has to come hand-in-hand with the right application, the right marketing, and the right education. But I do strongly believe that this is going to be a huge market and will weave itself into a lot of other spaces that we don't see today.
Convery: We've made so many strides with the technology and how we apply best practices around it. We really are delivering a lot of value right now, but in voice there is so much richness.
It is our favorite medium with which to communicate with each other. There is so much there that can be researched and developed and certainly I know we continue to very, very heavily invest in research and development and continue to bring innovative analysis of the voice and various things. It is not just the word, but all the other qualities of the voice as well. It is going to be a very interesting marketplace in terms of how we are really starting to deliver on the value, not only with some significant customer announcements over the next 12 months, but also with some very interesting technology announcements. It is a very active, growing, interesting marketplace to be in. It has tremendous impact across the entire organization.
Konig: Going up the food chain to the CEO and the C-level, [and] giving an education and showing the real value to C-level people who start asking about the types of results that they are getting--[that] will be very positive to the whole industry...because the value is getting more impactful to the bottom line.
ST: Which new vertical markets that haven't already benefited from audio search and mining applications are poised to do so now?
Convery: I see a lot of education going on within the enterprise for other divisions to use speech analytics. For example, I can go ahead and do my research based on what a call driver is, what the churn is, and what is going on during a call, and then use that intelligence in other divisions, like marketing, sales, etc. When it comes to other verticals, we are seeing additional attention from more services-oriented companies that are keenly aware of the quality of the service and the value of the service of the customers they are serving and getting intelligence from that. From hospitality to retail, all of those organizations are picking up on and starting to use speech analytics.
Watson: Any specific vertical is hard to say, but in particular, we see a lot of uptake in broadband and telecom, where there is a lot of churn and trying to maintain customer satisfaction is important. These industries are starting to see a lot of value in how to understand and potentially predict where the churn is and to maintain customers and keep those customers happy.
Markowitz: We shouldn't limit the discussion to the contact center because I know that some vendors are working with podcasts, media research, and things of that nature that go beyond the call center. That is, of course, where the technology is, though.
Ziv: In addition to different verticals, different types of interactions can now be recorded. We have already had several projects where people wanted to record their interactions at branches.
It is true with some of our telcos and financial institutions that they also have branches or locations where people are talking and getting information and then calling and getting different information. If the customer calls and gets different information over the phone than he received at the branch, you don't have the complete picture. That is another direction of expansion--into the field of "Where else people are talking." It goes beyond what happens in the contact center to what is happening everywhere else in the world as well.
Convery: It is not actually about whether this technology will be applied; it is, and it is growing very fast. In those marketplaces where you have more audio and audio/video, people want to search and they see its challenges. We have a lot of intelligence, a lot of findings, a lot of people speaking, we've got noise, we've got dialects, and those areas are growing as well. It has certainly been a very high-growth area for us.
Markowitz: I can see multilingual search and the consolidation of information as an extension of what has already been said. There is no reason, especially in a global company or enterprise, that you need to do your consolidation only in one language. You can combine it with translation and then combine the concept levels to see what is happening that might be similar or different across the different installations that you have.
ST: What are your expectations of the market during the next five years?
Ziv: The question is when some of these applications are sold, will they be combined with quality-monitoring applications, as standalone or as services? Datamonitor estimated the market for 2005 at $50 million and estimates that it will grow to $218 million by 2010. The majority of that is still in North America, but there is expected growth in India, Asia, and the Pacific as well.
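For readers who want the growth rate implied by the Datamonitor figures Ziv cites, the short calculation below works it out. It assumes smooth annual compounding over the five years from 2005 to 2010, which is a simplification and not something the estimate itself states; the function name `cagr` is just a label chosen for this sketch.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted in the roundtable: $50 million (2005) and $218 million (2010).
implied_rate = cagr(50.0, 218.0, 5)

if __name__ == "__main__":
    print(f"Implied annual growth: {implied_rate:.1%}")
```

On those figures the market would need to grow by roughly a third each year to reach the 2010 estimate.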
Markowitz: Right now, the market is defining itself: When you look at these technologies, you really find that this is not something that you can say is standalone. It goes with other kinds of things, so it is part of various other markets.
ST: How pervasive do you expect this technology to be?
Ziv: Speech is, by definition, the most comfortable form of communication, especially about important things. If you look at the types of interactions that are self-served, it is usually the mundane, such as going to an ATM; but if you want to have a discussion with your banker about where you should invest, then speech is still the predominant form and probably will continue to be for many years.
The two areas where speech technology is being addressed are for machines to be able to understand when people talk to them and for what we can learn from the interactions that people are having. Both of these are the bases of the current knowledge and current transfer of communication. There are so many ways that you can apply this. Anywhere that people are talking to each other, there is knowledge that could be used for business intelligence, for companies to understand their customers, for personal usage. Just like you have an application that allows you to search your emails, you might be able to search your phone calls or other things. The reality is that technology is here to allow you to use this spoken information. The information that lies there is very rich, very powerful, and it really just amounts to the imagination and the right way to apply it. What makes it really hard to scope this market is that the potential is so large that it is hard to put a boundary around it.
Konig: If companies would use this technology to become adaptive, dynamic, and essentially to customize themselves to every customer based on their needs and the optimal way to serve them, it could only increase satisfaction, loyalty, and the value of the company. All of us as consumers will benefit in our day-to-day life. The business case that this technology can bring to enterprises is its ability to be more responsive, more adaptive to the customer, and better equipped to create positive cycles to support both consumers and enterprises.
Markowitz: There is a deployment of audio with video cameras on street corners of one town--a danger area where officials are listening and watching for threatening communications by terrorists and the like. That is a totally different world of use that would affect everyday life. But, when we think about this technology, there isn't really a boundary. Every time you think about it, you can look at it in a different way. People are using it in such creative ways that if we have this conversation next year, it will perhaps be completely different.