The United States Postal Service (USPS) may not be daunted by rain, sleet or snow, but it has a tough time delivering its own information to its marketing staff. Until the early 1990s, this information was largely in the form of paper files stored at USPS headquarters in Washington, D.C. In 1994, the service created a database called MarketTracks, which contained abstracts of these files. Each abstract included the title, author, number of pages, a brief description and keywords; users reached the database through a client/server system. MarketTracks was moved to the Web in 1997, allowing remote users to read the abstracts. These were obvious improvements, yet users still had to find information through keyword searches, a "hit or miss" process, according to John Gregory, marketing specialist for USPS.
By its nature, keyword searching limits users to a single topic rather than a general concept. The USPS keywords had even less chance of success because they were often assigned by clerks who weren't subject matter experts. For example, the USPS's marketing staff might search for details on a particular retailer before a sales call. But a keyword search on that retailer's name would fail to return documents relating to the retailer's competitors--information that might be useful on a sales call.
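The retailer example can be made concrete with a small sketch. Everything in it is assumed for illustration: the document titles, keywords and category names are invented, and the two functions are hypothetical helpers, not part of MarketTracks. It simply contrasts an exact keyword match with a lookup that also returns the other documents filed under the same subject heading.

```python
# Hypothetical abstracts; titles, keywords and categories are invented.
DOCS = [
    {"title": "Acme Stores annual report",
     "keywords": {"acme stores"},
     "category": "Retail/Department stores"},
    {"title": "Bolt Mart expansion plans",
     "keywords": {"bolt mart"},
     "category": "Retail/Department stores"},
    {"title": "Steel tariffs update",
     "keywords": {"steel"},
     "category": "Manufacturing/Metals"},
]

def keyword_search(docs, term):
    """Return titles whose assigned keywords contain the exact term."""
    term = term.lower()
    return [d["title"] for d in docs if term in d["keywords"]]

def category_browse(docs, term):
    """Return every title filed under the same heading as the keyword hits."""
    term = term.lower()
    headings = {d["category"] for d in docs if term in d["keywords"]}
    return [d["title"] for d in docs if d["category"] in headings]

keyword_hits = keyword_search(DOCS, "Acme Stores")
category_hits = category_browse(DOCS, "Acme Stores")
```

Under these assumptions, the keyword search finds only the Acme document, while the category lookup also surfaces the competitor filed under the same heading.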
The USPS's marketing information comes from many different sources in many different electronic formats. It includes market research from analyst companies and industry news gleaned from wire services and Web sites. The source material is found in personal productivity applications and databases, HTML, XML, PDF and other formats. What the service needed was some way to structure its information stores so employees could go directly to the information they needed.
For help, Gregory selected Semio Tagger, content categorization software from Semio of San Mateo, Calif. Tagger scans text to create a taxonomy--a hierarchical system of classification that groups documents under appropriate subject headings. The software uses language detection, proximity analysis and stemming and normalization rules to extract important phrases on which it bases the taxonomy.
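To give a feel for what normalization, stemming and proximity analysis mean in practice, here is a minimal sketch. It is not Semio's algorithm, which the article does not describe; the stopword list, the crude suffix-stripping rule and the sample sentences are all assumptions made for illustration. The sketch lowercases text, reduces words to rough stems, and counts pairs of adjacent content words as candidate phrases.

```python
import re
from collections import Counter

# Assumed stopword list; a real system would use a much longer one.
STOPWORDS = {"the", "a", "an", "of", "for", "and", "to", "in", "is", "by", "on"}

def stem(word):
    """Very crude suffix stripping; a real system would use a proper stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def normalize(text):
    """Lowercase and stem each word; stopwords become None so that
    phrases are not formed across them."""
    words = re.findall(r"[a-z]+", text.lower())
    return [None if w in STOPWORDS else stem(w) for w in words]

def candidate_phrases(texts):
    """Count pairs of adjacent content words across all texts."""
    counts = Counter()
    for text in texts:
        tokens = normalize(text)
        for first, second in zip(tokens, tokens[1:]):
            if first and second:
                counts[(first, second)] += 1
    return counts

# Invented sample sentences.
samples = [
    "Direct mail campaigns for retailers",
    "A direct mailing campaign",
    "Retailers and direct mail",
]
phrases = candidate_phrases(samples)
```

With these samples, "direct mail" and "direct mailing" collapse into the same candidate phrase, which is the kind of grouping a taxonomy builder needs before it can assign subject headings.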
"Taxonomy products help organizations to better understand what they know," says Carl Frappaolo, executive vice president of Delphi Group in Boston.
The USPS uses this information to compete better with private companies for corporate delivery services. "In one place, we can find information about a particular customer we're trying to sell, the customer's industry and the demographics of the consumers the customer hopes to reach through direct mail," says Gregory.
If automatic categorization sounds too good to be true, maybe it is. Andrew Warzecha, vice president of e-business strategy with Meta Group in Stamford, Conn., says these tools can help to categorize documents, but the results still have to be reviewed manually. And the organization has to "teach" the software to understand terms specific to its business. "It's a fallacy that automated categorization tools will free you from all categorization work," he says. "KM has to be more relevant to users' specific needs than a taxonomy developed by a program aimed at many different businesses." However, Warzecha adds that taxonomies are an integral part of most KM systems, and without the help of automated categorization tools many organizations could not complete the job themselves.
A never-ending story
Gregory agrees. He says, "Taxonomy design is a continuous process--partly automated, partly manual. It starts before deployment and should be regularly evaluated." Gregory began his deployment by running Tagger on a group of sample documents. He then began to tweak the categories that the program generated. Once he had defined the categories preliminarily, he had to teach the system to distinguish between types of documents. For example, the system had to be able to tell "advertising mail," which is a product, from "advertising industry," a market. The program learns by example, so Gregory provides examples of documents along with the categories those documents belong to. "This is an iterative process. I expect it will continue to get better the more we work with it," he says.
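Learning by example can be sketched in a few lines. The following is an illustrative toy, not Tagger's training method, which the article does not detail: the class name, the scoring rule (the share of a category's training words that appear in the new text) and the sample documents are all assumptions. It shows how labeled examples could let a program separate "advertising mail," the product, from "advertising industry," the market.

```python
import re
from collections import Counter, defaultdict

def tokens(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

class ExampleClassifier:
    """Hypothetical classifier that scores categories by word overlap."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)

    def teach(self, text, category):
        """Record the words of one labeled example document."""
        self.word_counts[category].update(tokens(text))

    def classify(self, text):
        """Return the category whose examples best match the text."""
        if not self.word_counts:
            raise ValueError("no examples have been taught yet")
        words = tokens(text)

        def score(category):
            counts = self.word_counts[category]
            total = sum(counts.values())
            return sum(counts[w] for w in words) / total if total else 0.0

        return max(self.word_counts, key=score)

PRODUCT = "product: advertising mail"
MARKET = "market: advertising industry"

clf = ExampleClassifier()
# Invented training examples.
clf.teach("Advertising mail rates for bulk postage and flyers", PRODUCT)
clf.teach("Postage discounts on advertising mail flyers and catalogs", PRODUCT)
clf.teach("Advertising industry agencies report revenue growth", MARKET)
clf.teach("Agencies in the advertising industry cut spending as revenue falls", MARKET)

product_guess = clf.classify("New postage rates for catalogs")
market_guess = clf.classify("Agencies see revenue growth")
```

Each new labeled document sharpens the word counts, which mirrors the iterative improvement Gregory describes. A scheme this simple would still need the manual review Warzecha recommends.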
The Postal Service is now able to transition from product teams (such as package services, ad mail, first class and special services) to industry teams (services, retail and manufacturing) by reducing the time it takes for employees to get up to speed on new industries. "If our people want a specific document, they'll still try to find it using keyword searching," Gregory says. "But if they need to educate themselves on a topic, the taxonomy makes that possible."
The product matrix
In an effort to make our technology tools reports more useful, Knowledge Management now regularly includes in "Ways & Means" a product matrix similar to the one at left. Its vertical axis lists the categories of information found in corporate repositories, while the horizontal axis identifies the processes that will convert information into actionable knowledge. The intersection point provides an at-a-glance view of the role a product can play in a corporate KM strategy.
The concept was created by Tom Housel and Art Bell for their book Managing and Measuring Knowledge (McGraw-Hill, 2001) and appears in Knowledge Management with their permission.