Miner Inconveniences: How to Avoid Data Errors

Errors can creep into databases, bollixing the work of novice managers and experienced analysts alike. There are, however, ways to avoid errors, according to Sam Koslowsky, vice president of modeling solutions for Harte-Hanks. Koslowsky offered some helpful tips to marketers in a presentation he gave last week at DM Days, in New York City, kicking off with a pithy Freud quotation: "'It is the simple error that causes havoc, not the complex one we all fear.' He probably wasn't talking about a [CRM system], but the same thing applies."

Following are six tips to avoid mining disasters:

Set clear objectives.
Too often, novice marketers get excited about the response rate to a particular offer, only to find that those customers don't subsequently buy anything. Worse, they might not pay for the things they do buy. "You've got to know what you want to happen at the end of the day," Koslowsky said. He gave an example of a credit card company using a model that accurately predicted who would respond. The problem was, the company didn't consider the credit "rule": The more likely a potential credit customer is to respond, the less likely that customer is to repay.

Design the project effectively.
Allocate sufficient time and resources to aggregate the data into a form that's usable. There are three parts to a modeling project: prealgorithm formulation, data mining, and postalgorithm formation. "Marketers think data mining is the sexy part," Koslowsky said. "[But] two-thirds of the time should be spent on prealgorithm formulation." Some data mining projects can be data intensive. If a project involves processing large numbers of records, companies must make sure their computer systems can handle it. Also, marketers should do their best to ensure data mining results can be used. What's the point of going through the exercise if they can't use the information they discover?

Prepare the data.
Freeze files. If a company is trying to identify potential defectors, it needs to include only data from the past period it is trying to measure, so that predictors come from the right time. Koslowsky cited an example of a telecommunications company that built a model so it could intervene if a customer was likely to defect. The prediction rates were almost exactly on target. Unfortunately, the analysis included the period after many of those customers had already left, so of course it was accurate. The lesson, Koslowsky said: "If things look too good to be true, they are probably false." Before running a test, marketers also should make sure nothing significant has changed between model development and model deployment. Also, incorporate all available data. "While data mining can accurately occur without contact history, inclusion can provide significant lifts," Koslowsky said.

Explore the data.
"To be a good data miner you have to get your hands dirty." Koslowsky gave an example of a company that found that 39 percent of its customers owned nine cars. Obviously the results were erroneous, but the question was, why? It turned out the company put a "9" in any spot where data was missing.

Apply results accurately.

Use common sense.
When analyzing data, strange correlations may be uncovered, but that doesn't mean marketers should bank their future on them. Make sure to apply quality control techniques after completing the model. Koslowsky cited another example of a bank trying to solicit customers who may be interested in home equity loans. It developed a data mining model to locate the right segmentation, but forgot to include people who do not own homes. The lesson here? "It frequently helps to use common sense," Koslowsky said. "Simple errors can cause havoc in an otherwise good campaign."
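
For readers who want to see what such checks might look like in practice, here is a minimal Python sketch of three of the safeguards described in this article: spotting a placeholder value that dominates a field (the "9" for missing data), freezing predictor data at a cutoff date (the telecommunications example), and screening out customers an offer cannot apply to (the home equity example). It is an illustration only, not Koslowsky's method. The function names, field names, and sample records are all hypothetical.

```python
from collections import Counter
from datetime import date


def dominant_value_share(values):
    """Return (value, share) for the most common value in a field.

    An implausibly large share for an odd value (for example, nine cars)
    is a hint that the value may be a placeholder for missing data.
    """
    value, count = Counter(values).most_common(1)[0]
    return value, count / len(values)


def freeze_predictors(events, cutoff):
    """Keep only events dated strictly before the cutoff.

    Anything on or after the cutoff could reveal the outcome the model
    is supposed to predict, so it is left out of the predictors.
    """
    return [e for e in events if e["date"] < cutoff]


def eligible(customers, required_flag):
    """Keep only customers for whom the offer can apply at all."""
    return [c for c in customers if c.get(required_flag)]


if __name__ == "__main__":
    # Hypothetical sample data, invented for illustration.
    cars_owned = [9, 1, 2, 9, 0, 9, 1, 9, 2, 1]
    print(dominant_value_share(cars_owned))

    events = [
        {"customer": "A", "date": date(2000, 1, 15)},
        {"customer": "A", "date": date(2000, 3, 2)},
    ]
    print(freeze_predictors(events, cutoff=date(2000, 2, 1)))

    customers = [
        {"name": "A", "owns_home": True},
        {"name": "B", "owns_home": False},
        {"name": "C"},  # ownership unknown: treated as not eligible
    ]
    print(eligible(customers, "owns_home"))
```

None of these checks replaces judgment. They only flag the numbers a person should look at before a campaign goes out.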