A New Language for Effective Searches
Customers of Charles Schwab & Co.'s online brokerage services can find published research on companies and mutual funds simply by typing their questions into a query box. Schwab adopted this method on the premise that it returns more specific answers than the query-by-keyword searches common on most popular Web portals.
The technology behind Schwab's Web site is natural language processing, which promises to make it easier for people to find relevant documents amid masses of information. Natural language processing is built around linguistic search engines that parse questions for important keywords and match them to digitally stored documents.
New tools in this market incorporate utilities such as taxonomies, which organize materials into hierarchical groups, and ontologies, which can capture subtler relationships between entities than taxonomies can, such as "part of" or "owned by." From there, natural language processing does "fuzzy" matching and "stemming" (examining roots of words) against the metadata and content of documents, so that the searcher's vocabulary need not be exact.
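To make "stemming" and "fuzzy" matching concrete, here is a minimal sketch in Python. It is not how any vendor named in this article implements search: the suffix list, the 0.8 similarity threshold and the function names (stem, fuzzy_match, score) are all assumptions chosen for illustration, and production systems use far more careful linguistic rules.

```python
import difflib
import re

# Crude, illustrative suffix list (an assumption, not a real stemming algorithm).
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Reduce a word to an approximate root by stripping one common suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def fuzzy_match(query_word, doc_word, threshold=0.8):
    """Treat two words as a match if their stems are equal or nearly equal."""
    a, b = stem(query_word), stem(doc_word)
    if a == b:
        return True
    # SequenceMatcher.ratio() returns a similarity between 0.0 and 1.0.
    return difflib.SequenceMatcher(None, a, b).ratio() >= threshold


def score(question, document):
    """Count how many words in the question match some word in the document."""
    q_words = re.findall(r"[a-z]+", question.lower())
    d_words = re.findall(r"[a-z]+", document.lower())
    return sum(any(fuzzy_match(q, d) for d in d_words) for q in q_words)


if __name__ == "__main__":
    question = "Which funds paid dividents?"  # note the misspelling
    document = "The fund pays a quarterly dividend"
    print(score(question, document))
```

In this sketch, "funds" matches "fund" through stemming, while the misspelled "dividents" still matches "dividend" through the similarity threshold, which is the sense in which the searcher's vocabulary need not be exact.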
This feature can be useful to a busy knowledge worker, according to Alexander Linden, research director of advanced technologies for Gartner Group Inc. in Frankfurt, Germany. "Rarely do I find myself in a situation where I know how to ask a very concrete question, because it's tough to come up with a word that properly addresses your information needs," he says. It can also overcome the problem of users receiving different answers to the same question because they phrase their queries differently when searching the same information sources.
In the past two years, roughly two dozen vendors of natural language processing solutions have brought products to market. Many are trying to attract corporate customers by inserting their technology into mainstream business applications. For example, nearly all portal tools include linguistic search engines for matching keywords and phrases to in-house and external information sources, but most of them can't deal with the kind of free-form questions that natural language tools support. Proponents claim that natural language search can strengthen portals, because portals have open application programming interfaces through which natural language query engines can be integrated. Similarly, vendors hope to integrate natural language tools with major applications such as enterprise resource planning, customer relationship management and home-grown e-business applications that require business intelligence.
However, this activity should not suggest that natural language processing can deliver productivity gains to all companies. In fact, Linden regards it as unproven technology, though it can complement other search paradigms such as keyword search engines. "Natural language query technologies offer an interesting proposition, but at best they solve only simple information needs," he says.
Vendors are approaching the corporate market from several angles. Some provide packaged solutions that customers can install on a Web server behind an enterprise firewall; others offer their products through application service providers. Coreintellect Inc. of Dallas bills its Core360 as a "natural language question-answering business intelligence portal" and sells it as a hosted subscription-based service.
Some natural language-based companies that launched their products as hosted services, including AskJeeves Inc. of Emeryville, Calif., and AnswerLogic Inc. of Washington, D.C., are beginning to sell packaged versions for use behind corporate firewalls and on Web sites, where they can help to address issues in sales and marketing. "The ASP model is only a proof of concept," says Dan Easterlin, director of product management for AskJeeves. "Companies see a need to integrate natural language tools into their Web sites. We analyze the questions asked by users relative to the company's ability to answer them through online content. This helps them identify content gaps and where their marketing messages aren't effective."
Discovering such gaps in content is an ongoing process, according to Debbie Naganuma, director of the Charles Schwab electronic brokerage in San Francisco, who chose natural language query tools from iPhrase Technologies Inc. of Cambridge, Mass. "We get weekly reports from iPhrase that tell us what type of queries are being asked and what we were unable to provide answers for," says Naganuma. "These reports also document where people are looking for information. They give us indicators of where we have content gaps that may have to be beefed up."
Morebusiness.com, owned by Khera Communications Inc. of Rockville, Md., uses AnswerLogic's natural language tools to help its users to find pertinent information. A business resource for entrepreneurs and startups, the site also relies on weekly statistical reports from AnswerLogic, according to Raj Khera, CEO of Khera. "If the reports show that we're getting requests for content we don't have, we'll go out and find it," he says.
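The weekly reports that Schwab and Morebusiness.com describe amount to counting the questions a site could not answer. The Python sketch below illustrates that idea only. The log format (a list of query and answered-flag pairs), the function name content_gap_report and the sample queries are hypothetical, and nothing here reflects the actual reports produced by iPhrase, AnswerLogic or AskJeeves.

```python
from collections import Counter


def content_gap_report(query_log, top_n=3):
    """Return the most frequent queries that went unanswered.

    query_log is a hypothetical list of (query_text, was_answered) pairs.
    """
    unanswered = Counter(
        query for query, was_answered in query_log if not was_answered
    )
    return unanswered.most_common(top_n)


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    log = [
        ("margin rates", False),
        ("fund fees", True),
        ("margin rates", False),
        ("ira rollover", False),
    ]
    for query, count in content_gap_report(log):
        print(f"{query}: {count} unanswered")
```

Queries that rise to the top of such a list point to topics where a site may need more content.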
Naganuma of Schwab agrees with Gartner's Linden that natural language tools are still in their infancy. But she expects the technology to become more sophisticated over the next few years.
Schwab spent almost nine months testing the iPhrase tool and integrating it into its Web site. "The biggest revelation for us was discovering how difficult it is to put a search engine on a Web site the size of ours," Naganuma recalls. "It takes time to make sure that everything is accurate for a good customer experience."
The most time-consuming step is developing a custom taxonomy and defining the business rules for searches. While all products come with general-purpose or industry-specific taxonomies, or both, vendors acknowledge that human review and editing of search results will improve the accuracy of answers.
More sophisticated natural language tools are still two to five years in the future, analysts estimate. And they will have to convince IS managers of their efficacy. "Information professionals have objections and concerns about using natural language search engines," says Linden. "They don't feel as though they have control over what the tools do."