Push-Type Data Hub vs. Pull-Type Data Warehouse
This column originally appeared in the Nov./Dec. 1998 issue of the DataToKnowledge Newsletter.
I recently had a conversation with a knowledgeable IT professional at a mid-size banking company. It gave me some new insights about business rules in an age of application packages and data warehouses. What first caught my attention was his comment, "If we had taken a business rule approach ten years ago, we wouldn't be building a data warehouse today."
Wow! I might have said something like, "If we had taken a business rule approach ten years ago, we could be building a much better data warehouse today." But not " ... we would be building no data warehouse today." What could he be talking about?
As he related it, their interest lies with the whole-customer-banking-relationship issue. They want to leverage customer information for cross-selling products, and for improving customer service. Unfortunately, they have classic stove-pipe applications, and a myriad of bridges and interfaces between them. No integrated database. They can't even correlate the data from the different sources. (Sound familiar?) The wrinkle is that rather than systems built in-house, these applications are mostly application packages purchased from outside vendors. (Being a younger organization, they managed to avoid many of the problems associated with in-house development ... and instead moved straight into a wholesale systems nightmare. Oh well.)
Their data warehouse strategy involved what I would call a classic 'pull' approach. Data created in the application packages is to be extracted on a regular basis, and 'pulled' into a common data store. From there, it will be centrally available for other uses.
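To make the 'pull' approach concrete, here is a minimal Python sketch. The two extracts, their field names, and the function `pull_into_warehouse` are all hypothetical, invented purely for illustration; nothing here is taken from the bank's actual systems.

```python
# Hypothetical extracts from two purchased application packages.
# Each package defines 'customer' in its own way (assumed field names).
LOANS_EXTRACT = [{"cust_no": "L-0017", "name": "SMITH, JOHN A"}]
DEPOSITS_EXTRACT = [{"customer_id": 90455, "full_name": "John Smith"}]


def pull_into_warehouse(extracts):
    """Copy every extracted row into one common store, tagged with its source."""
    warehouse = []
    for source, rows in extracts.items():
        for row in rows:
            warehouse.append({"source": source, **row})
    return warehouse


warehouse = pull_into_warehouse(
    {"loans": LOANS_EXTRACT, "deposits": DEPOSITS_EXTRACT}
)

# The only field the two rows have in common is the tag added during the pull.
shared_fields = set(warehouse[0]) & set(warehouse[1])
```

In this sketch the rows land in one place, but they share no key and no common name format, so deciding whether both describe the same customer is left to after-the-fact matching and clean-up.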
Although many companies take this approach, I think it is fundamentally wrong. In 1991, I wrote a book (with Wanda I. Michaels) on a business modeling technique called Resource Life Cycle Analysis. An important part of that technique is to identify the enabling resources of the business. 'Customer' is clearly one of them. To make a long story short, a far better approach is to establish architectures that push business-enabling data such as 'customer' out to the other applications -- not pull it in from them.
In other words, what the company needs is a push-type data hub, not a pull-type data warehouse. This data hub would be a unified application where customer data is first created, then 'pushed' out to the existing applications. The data is exported from the hub, rather than imported to a warehouse. This way the company can dictate its standards for defining customers, rather than spending endless resources on cleaning up after the fact. (If the design for the data hub and related exports is done well, no modifications to the application packages themselves should be required. Naturally, such modifications are something to be avoided.)
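A push-type hub along these lines might be sketched in Python as follows. This is only an illustration under stated assumptions: the `CustomerHub` class, the exporter functions, the two "inbox" lists standing in for each package's import interface, and the naming standard itself are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Customer:
    customer_id: int
    family_name: str
    given_name: str


class CustomerHub:
    """Single place where customers are created, then pushed outward."""

    def __init__(self) -> None:
        self._customers: Dict[int, Customer] = {}
        self._exporters: List[Callable[[Customer], None]] = []
        self._next_id = 1

    def register_exporter(self, exporter: Callable[[Customer], None]) -> None:
        self._exporters.append(exporter)

    def create_customer(self, family_name: str, given_name: str) -> Customer:
        # The company's standard (assumed here) is enforced at creation time.
        family_name, given_name = family_name.strip(), given_name.strip()
        if not family_name or not given_name:
            raise ValueError("both family and given name are required")
        customer = Customer(self._next_id, family_name.title(), given_name.title())
        self._customers[customer.customer_id] = customer
        self._next_id += 1
        # Push the new customer out to every registered application.
        for exporter in self._exporters:
            exporter(customer)
        return customer


# Stand-ins for the import interfaces of two unmodified application packages.
loans_inbox: List[dict] = []
deposits_inbox: List[dict] = []


def export_to_loans(c: Customer) -> None:
    # Translate the hub record into the (assumed) format the loans package expects.
    loans_inbox.append({
        "cust_no": f"H-{c.customer_id:04d}",
        "name": f"{c.family_name.upper()}, {c.given_name.upper()}",
    })


def export_to_deposits(c: Customer) -> None:
    deposits_inbox.append({
        "customer_id": c.customer_id,
        "full_name": f"{c.given_name} {c.family_name}",
    })


hub = CustomerHub()
hub.register_exporter(export_to_loans)
hub.register_exporter(export_to_deposits)
smith = hub.create_customer(" smith ", "john")
```

The design choice worth noticing is that each exporter adapts the hub's record to what a package already accepts, so the packages themselves stay untouched while every copy of the customer traces back to one hub-assigned identifier.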
The final piece of the puzzle is actually the easiest: what approach should they take in developing the data design for the hub? The answer, of course -- business rules!
# # #