The Next Generation of CDI
Customer data integration today
Maintaining high-quality customer information is an imperative for enterprises today. Organizations need constant access to the most current and complete view of customer information available. While enterprises have accumulated vast amounts of data about their customers, much of it is locked in silos distributed throughout the enterprise and can only be accessed from a single application for a single purpose. This motivates the need for customer data integration (CDI) to consolidate all the information about a customer into a single coherent view.
CDI hubs bring all customer data together into a single centralized database. The irony is that they overcome the problems related to customer information trapped in silos across the enterprise by introducing yet another silo. While they consolidate customer information, they don't deliver it to where it is needed: the existing applications distributed across the enterprise.
Before examining the problems with CDI hubs, however, we should consider why they've become so popular. CDI hubs provide compelling advantages over the methods they've replaced. Enterprise application integration (EAI) is process-centric. CDI hubs, however, are information-centric. If you're trying to get a complete view of your customer information, an information-centric approach makes a lot of sense. In contrast to enterprise information integration (EII), CDI hubs bring all the information about a customer together in a single place, providing instant access to the data without putting additional load on existing systems. Any new approach to CDI must preserve these benefits of hubs while addressing their shortcomings.
To do this, the next generation of CDI must take a federated approach that stops the proliferation of new data silos and allows data to be accessed throughout a diverse, distributed organization. Federated CDI is a technique for managing customer data across a set of distributed databases in a consistent way. Unlike the distributed query mechanisms of EII, which can put unpredictable loads on existing systems, federated CDI manages consistent copies of customer information where it already resides. This eliminates the need to introduce new silos, and the information is delivered in a form that existing applications can use.
Customer information is everywhere
Customer information is everywhere--or at least it should be. The value of up-to-date, consistent customer information spans the entire enterprise, supporting users in sales, marketing, and even the customers themselves. It also supports many different types of applications that operate on this customer information in a variety of ways. To facilitate this, CDI must deliver data wherever and whenever it is needed.
While CDI hubs provide a single view of the customer, they don't deliver it to the places where it is really needed. CDI hubs provide a centralized, sole source of truth for customer information. While this enables new applications to operate on that consolidated view, it doesn't help existing applications that are still tied to their existing data stores and schemas. Federated CDI, on the other hand, delivers a projection of this integrated customer data to the existing applications.
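To make the idea of a projection concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of any product: the integrated record, the application names ("billing", "support"), and every field name are hypothetical.

```python
# Hypothetical integrated customer record; all field names are illustrative.
integrated = {
    "customer_id": "C-1001",
    "first_name": "Ada",
    "last_name": "Lovelace",
    "email": "ada@example.com",
    "phone": "+1-555-0100",
}

# Each existing application keeps its own schema. A projection maps each
# local field name to a function of the integrated record.
PROJECTIONS = {
    "billing": {
        "acct_no": lambda c: c["customer_id"],
        "name": lambda c: f'{c["last_name"]}, {c["first_name"]}',
        "contact_email": lambda c: c["email"],
    },
    "support": {
        "id": lambda c: c["customer_id"],
        "full_name": lambda c: f'{c["first_name"]} {c["last_name"]}',
        "callback_number": lambda c: c["phone"],
    },
}


def project(record, app):
    """Return the integrated record reshaped into one application's schema."""
    return {field: fn(record) for field, fn in PROJECTIONS[app].items()}


billing_view = project(integrated, "billing")
support_view = project(integrated, "support")
```

Each application sees only the fields it already understands, under the names it already uses, so it does not have to be rewritten against a new master schema.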
In addition to supporting existing applications, federated CDI can support new types of applications. Federated CDI can allow users to operate on the data while disconnected from the home office, can support geographically distributed offices, and can meet the performance demands of Internet-based customer self-serve applications where the load is highly unpredictable.
Professional data management
While high-quality customer information is one of a company's most valuable assets, it isn't always managed accordingly. Database management systems have long provided a high quality of service for the information they store. These capabilities range from simply presenting up-to-date information, to being able to modify that data when necessary, to preserving data integrity and availability even in the face of adverse conditions. CDI hubs, on the other hand, are only beginning to offer these same levels of service and in some cases face significant hurdles to fully achieving them.
As the enterprise moves inexorably toward "zero-latency" operations, the availability of up-to-date information becomes critical. It's no longer good enough to have information about your customer that was accurate as of some time yesterday. Many of the tools for moving customer data into CDI hubs, however, have their roots in batch-oriented ETL designed to construct data warehouses once a day or perhaps once a week.
Operational systems require not only up-to-date information, but the ability to update that data as well. When a user of an application built on a CDI hub decides to modify some of the data that he's viewing, he should be able to update it immediately. In order to support updates, however, a hub must support transactions. While some CDI hubs now allow updates and even claim to be transactional, it is important to ask what this means. It isn't enough that the hub support transactions; the entire "system" of all the databases that share the customer information must be transactionally consistent, supporting the standard ACID properties database management systems are built upon. Without a federated approach that supports distributed transaction management across all of the integrated systems, this is not possible.
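The article does not name a protocol for distributed transaction management. One widely used technique is two-phase commit, and the sketch below shows its core idea in Python with in-memory stand-ins for the databases. The `Participant` class, the `update_everywhere` function, and the `fail_on_prepare` flag are hypothetical names introduced for illustration. The sketch covers only the all-or-nothing (atomicity) aspect; a real implementation would also need durable logs, timeouts, and recovery from coordinator failure.

```python
class Participant:
    """In-memory stand-in for one database that holds customer data."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.data = {}
        self._staged = None
        self.fail_on_prepare = fail_on_prepare  # simulates a database that votes "no"

    def prepare(self, changes):
        """Phase 1: stage the changes and vote on whether they can be committed."""
        if self.fail_on_prepare:
            return False
        self._staged = dict(changes)
        return True

    def commit(self):
        """Phase 2: make the staged changes visible."""
        self.data.update(self._staged)
        self._staged = None

    def rollback(self):
        """Discard staged changes without touching committed data."""
        self._staged = None


def update_everywhere(participants, changes):
    """Apply changes to every participant, or to none of them."""
    prepared = []
    for p in participants:
        if p.prepare(changes):
            prepared.append(p)
        else:
            # One "no" vote aborts the whole transaction.
            for q in prepared:
                q.rollback()
            return False
    for p in participants:
        p.commit()
    return True


crm = Participant("crm")
billing = Participant("billing")
legacy = Participant("legacy", fail_on_prepare=True)

ok = update_everywhere([crm, billing], {"email": "ada@example.com"})
failed = update_everywhere([crm, legacy], {"email": "changed@example.com"})
```

In this run the first update reaches both databases, while the second is rejected by the simulated legacy system and therefore leaves the CRM copy unchanged as well.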
Evolving customer data
If CDI is recognized as a key initiative in the enterprise today, why isn't it universally adopted? During a recent trip to a cell phone store, I found that the clerk was unable to help me because the store didn't have my customer information available. The company had merged with another cell phone company a year earlier, so the clerk recommended I drive to a location that had been part of the company from which I purchased the phone.
The reason enterprises tolerate situations like this is that it is not yet easy enough to implement and evolve CDI solutions. The sad truth is that for large corporations today, the time it takes to integrate the data from an acquisition can be longer than the time it takes to complete the next acquisition. Addressing this issue is critical, and a federated approach can make CDI easier to introduce and evolve.
Implementing a CDI hub starts with the master definition of a customer. Every application that deals with customer information has unique requirements, so coming up with a single schema that captures all of these requirements is a monumental task. It gets worse as each new application introduces new requirements. Service-oriented architectures, which reduce the granularity of business logic from an application to a service, further exacerbate this problem.
Federated CDI solves this problem by managing disparate schemas while keeping all the data consistent. Each application maintains its own schema. Schemas can be extended and evolved as applications change and new applications are added. This agile approach to data integration allows the enterprise to manage change effectively.
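As a rough illustration of keeping disparate schemas consistent, the Python sketch below translates an update made in one application's field names into every other application's field names through per-application mappings. The `Federation` class, the application names, and the field names are all hypothetical, and the sketch tracks a single customer to stay short.

```python
class Federation:
    """Keeps one customer's data consistent across per-application schemas."""

    def __init__(self):
        self.mappings = {}  # app name -> {local field: shared attribute}
        self.stores = {}    # app name -> {local field: value}

    def register(self, app, mapping):
        """Add an application with its own field names; others are unaffected."""
        self.mappings[app] = dict(mapping)
        self.stores[app] = {}

    def update(self, source_app, local_field, value):
        """Apply an update expressed in one app's schema to every app that
        maps a field to the same shared attribute."""
        shared = self.mappings[source_app][local_field]
        for app, mapping in self.mappings.items():
            for field, attribute in mapping.items():
                if attribute == shared:
                    self.stores[app][field] = value


fed = Federation()
fed.register("sales", {"mail": "email"})
fed.register("support", {"contact_email": "email", "tel": "phone"})
fed.update("sales", "mail", "ada@example.com")

# A new application arrives later with yet another schema.
fed.register("portal", {"login_email": "email"})
fed.update("support", "contact_email", "ada@newmail.example")
```

Adding the hypothetical portal application required only a new mapping; the sales and support schemas were not changed, yet all three copies of the email address stay in step after the second update.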
The Next Wave: Federated CDI
CDI hubs represent a significant advance in providing all users with constant access to the most current and complete view of customer information. To deliver on the full promise of customer data integration, however, there are still a number of challenges to overcome. CDI must evolve to deliver customer information wherever and whenever it is needed. It must provide the qualities of service that are expected in traditional database management systems. Finally, CDI must adapt and evolve. Federated CDI accomplishes this by preserving the benefits of CDI hubs while addressing their shortcomings.
About the Author
Ken Rugg is vice president of real-time data services at Progress Software Corporation. He is responsible for the strategic direction and development of the ObjectStore, object database, and real-time data services product lines. Rugg studied computer science at the Massachusetts Institute of Technology. Please visit www.progress.com