This column originally appeared in the March/April 1994 issue of the Data Base Newsletter.
Consolidation of data is not the same thing as integration. Allow me to illustrate.
Imagine that the U.S., Canada, Cuba, Jamaica, and Japan were completely closed societies. However, to invigorate their sports, each allows one visitor for one day each century to bring a new game idea. For the 1800s, baseball is chosen. Abner Doubleday arrives in each country and talks about bases, runs, strikes, balls, innings, and the other things of baseball. At the end of the day he leaves. Nobody knows what happens until the next visit, a hundred years later.
As it turns out, each country has embraced the game enthusiastically, and each now has a century of competitive results. So taken are the countries with the game, they express a desire to merge their results, and to begin an international league. A commission (complete with data modelers) is chartered for that purpose.
The initial results are promising. Each country uses the same nouns (e.g., bases, runs, and strikes), and many of the same verbs (e.g., "runs scored in an inning"). A static data model emerges that meets general approval.
Then the trouble begins. Not surprisingly, substantial differences have arisen over a hundred years in how each country plays the game.
- In Japan, a batter loses so much face after one strike that that's all he gets in each at-bat.
- In Canada (being a tolerant country), an unlimited number of strikes per at-bat is permitted but, to compensate, only two outs are allowed per inning.
- In Cuba (being a socialist country), strikes are charged to the pitcher, rather than to the batter.
- In Jamaica (suffering from earlier English influence), only two bases are used, the ball is bowled, 'pitch' refers to the playing field, and games may continue for two days.
Can the results be merged? Yes and no. Because they are based on the same data types, they can be consolidated (lumped together). However, because each country plays the game differently, they cannot be integrated (compiled meaningfully). Without common standards for how the game is played (a.k.a. business rules), integration is impossible.
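The distinction can be sketched in a few lines of Python. This is only an illustration, not anything from the commission's model: the names `Rules`, `InningResult`, `consolidate`, and `can_integrate` are hypothetical, the sample numbers are made up, and the U.S. rule values plus Japan's three outs per inning are assumptions (the column specifies only Japan's single strike and Canada's unlimited strikes with two outs).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Rules:
    """Hypothetical business rules for one country's version of the game."""
    strikes_per_out: Optional[int]  # None means unlimited strikes
    outs_per_inning: int


@dataclass(frozen=True)
class InningResult:
    """The shared static data model: every country records the same fields."""
    country: str
    runs: int
    strikes: int


# Assumed rule sets; only Japan's one strike and Canada's values come from the column.
RULES: Dict[str, Rules] = {
    "US": Rules(strikes_per_out=3, outs_per_inning=3),
    "Japan": Rules(strikes_per_out=1, outs_per_inning=3),
    "Canada": Rules(strikes_per_out=None, outs_per_inning=2),
}


def consolidate(*sources: List[InningResult]) -> List[InningResult]:
    """Lump rows together; this works because the record structure is shared."""
    return [row for source in sources for row in source]


def can_integrate(rows: List[InningResult], rules: Dict[str, Rules]) -> bool:
    """Rows compare meaningfully only if all of them were produced under one rule set."""
    return len({rules[row.country] for row in rows}) <= 1


# Made-up sample results, one inning per country.
us = [InningResult("US", runs=2, strikes=7)]
japan = [InningResult("Japan", runs=0, strikes=3)]
canada = [InningResult("Canada", runs=5, strikes=19)]

merged = consolidate(us, japan, canada)
print(len(merged))                   # every row fits the shared model
print(can_integrate(merged, RULES))  # the rule sets differ across countries
```

Nothing in the shared record type stops the rows from being lumped together. The check fails only once the rules are written down and compared, which is the gap between consolidation and integration.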
Implications for database professionals include the following.
- The concept of an 'open repository' (store-and-play passive dictionaries) is DOA. If development techniques and tools fail to follow common rules, the 'game results' will not prove sharable.
- Designers of data warehouses who promise more than consolidation should beware.
- The claims of information engineering notwithstanding, static data models are not enough to achieve integration of business practices. To play ball you need business rules.
# # #