It's All About the Data ~ How the Use of a Business Rules Approach is a Critical Success Factor for a Data Project (Part 1)
Alicia had just joined the team, and I was assigned to help her get started on her projects. One day we went to lunch, talked more about our projects, and got to know each other a bit better. We were technical business analysts on the Data Management team, working in the Enterprise Data Warehouse (EDW) area. Our primary role was to elicit, analyze, and document business requirements for projects that supported the strategic analytical objectives of our company. I had been working in this environment since the beginning of time. While Alicia had a very strong background as a requirements analyst, she had not had much experience in the data management area. We realized that we came from very different backgrounds and had worked on a wide variety of projects. Right away we started to share ideas and lessons learned about what makes a project successful.
In the course of this conversation, Alicia and I discovered that the projects we worked on were as different as our backgrounds. We found some common ground in a surprising place … we were both familiar with the Business Rules Approach … and we found that we were both using this approach on our very different projects.
Alicia had worked mainly on what I'll call "Application Development Projects." I had worked only on what I'll call "Data Projects." Alicia needed to know what was different about a Data Project, and how to work with the business when gathering requirements on this very specialized type of project.
Over the coming days and weeks, Alicia and I worked together to understand how using a Business Rules Approach ensures success for a Data Project, and I was able to give her some specific real-world examples. I also introduced Alicia to our team of Data Architects, QA Analysts, and Developers who all support and benefit from a Business Rules Approach.
Using the Business Rules Approach enables us to create deliverables so that the business recognizes its requirements, developers can use them, they pass quality assurance tests, and the overall system will be accepted with confidence. On a Data Project, this means that there are additional deliverables required that really help nail down the requirements. These deliverables provide a clear understanding of the requirements, and everyone is in agreement about what is being developed.
Application Development Project
For Alicia to really understand how the Business Rules Approach is applied in a Data Project, we had to agree on a definition of an Application Development Project. Alicia defined the Application Development Project as one where the end result is a visible application with a graphical user interface (GUI) that performs specific functions within a business. An Application Development Project always includes data, and sometimes includes reporting.
At the start of an Application Development Project, the focus is on breaking out the business processes and putting them into a process model. End-users and stakeholders then validate the business process model (BPM). The stakeholders all agree to the BPM because the workflow helps us drive toward requirements and rules. Additionally, the BPM:
- Ensures a good understanding of the workflow requirements and rules,
- Identifies all key "players" and includes the end-users in BPM validation,
- Identifies "stakeholders" on BPM signoff.
Another major deliverable of the Application Development Project is a set of validated use cases. Use cases clearly show the actors who perform an activity and what the activity entails. Throughout the use case modeling process the analyst defines and documents business rules and then associates those with the actors and the activities. This provides a visual representation of the task. Use case modeling is important to getting the business agreement, and it helps drive to those definitions and details surrounding the business rules. Use case models show the relationships of activities, including a hierarchy and dependencies that are also translated into business rules.
A simple use case diagram will show who performs an activity, what the activity entails, when the activity takes place (both timing and frequency), where the activity happens, and how it is done or if there are constraints.
The by-products of the use case model are:
- defined and refined data model,
- list of business rules,
- security design requirements.
The third major deliverable in this type of project is produced when the use cases are translated into the Graphical User Interface (GUI) Requirements and Designs. These enable the business process flow and implement business rules, including exception handling, within the workflow. For example, if a rule is violated during validation (say, when the user clicks OK in a dialog box), a message is displayed that tells the user which rule was violated and how to correct the problem. This is where the business rules meet the business and become visible.
As I was listening to Alicia describe her project activities and deliverables it became clear to me that in the Data Project, business rules are not going to be immediately visible or apparent to the requirements analyst working with the business as they are in an Application Development Project where use cases and business process models are created.
An application can, and frequently does, validate data by applying business rules within the user interface by responding to a user's input. In a Data Project there is not necessarily an activity available that will provide a rule validation. The EDW will source data from a file or a system with no real visibility to how the data got there, what rules have or have not been applied through the application, or what rules may have been violated allowing bad data through the process.
Alicia told me that if you've done your job as a Business Analyst on an Application Development Project, then you have documented all the activities, rules, and data required to complete a workflow and meet business objectives. You have documentation that serves as the contract with the business and gives the development team the information it needs for design and build. That made sense. She was right that both projects follow the same methodology: Requirements, Design, Build, Test, and Implement. She was now ready to learn about a Data Project and how its approach and the deliverables it needs in order to succeed are different.
Anatomy of a Data Project
The common factors between the two types of projects are data and business rules. A Data Project is defined as one where the end result is a repository of data used primarily for reporting or analytics. A Data Project does not have a GUI of its own, although the end result usually produces a view to data that can be accessed using a variety of tools that have a GUI.
A lot of Data Projects will fail to meet business objectives because they omit analysis of business rules. We don't often have all the business rules documented when we start in a Data Project. They are usually hidden in the source data or embedded in one or more source applications. Most business requirements approaches assume you are building business rules, not extracting them. That is why many analysts don't make use of a Business Rules Approach when gathering requirements on Data Projects. They just don't realize that this is what is needed in order to get all the rules documented and agreed on.
Figure 1. Anatomy of a Data Project
Figure 1 shows what a typical data warehouse looks like at a very high level and is how a Data Project gets structured. You can find diagrams like this in industry literature in various forms, all following the same general three-tiered or sectioned form.
An analyst must produce deliverables in each section for a Data Project. These deliverables will be both business and technical and are designed to communicate the right details to the right people.
Data Acquisition — Source Systems
The first component of a data warehouse is its source systems, without which there would be no data. These provide the input into the solution and will require detailed analysis early in any project. Important considerations in looking at these systems include:
- Is this the true source of the data you are looking for?
- Who owns/manages/maintains this system?
- Where is the source system in its lifecycle?
- What is the quality of the data in the system?
- What are the batch/backup/upgrade/maintenance cycles on the system?
- Can we get access to it?
The deliverables for Data Acquisition include:
- Metadata for both business and technical use — this tells what data will be acquired from the system.
- Interface Specifications — these will detail what will come in the interface, and how the interface will work (push/pull, real time/batch, format, platform, etc.).
- Data mappings and business rules — what data from the source is moved into the EDW and where, and what (if any) business rules are applied at this stage. Usually no business rules are in effect at this time; it's usually just data movement.
The processes for Data Acquisition are referred to as "Load" and include moving source data from the interface to an EDW staging area and then to one or more databases/data marts.
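To make the "just data movement" idea concrete, here is a minimal sketch of a Load step driven by a data mapping. Every field and column name in it (RES_ID, PROP_CD, reservation_id, and so on) is a made-up assumption for illustration and does not come from any real source system. The point is only that the mapping renames and moves data without applying business rules.

```python
# Hypothetical source-to-staging data mapping: source field -> staging column.
# All names are invented for illustration.
SOURCE_TO_STAGING = {
    "RES_ID": "reservation_id",
    "PROP_CD": "resort_code",
    "STAY_DT": "stay_date",
    "RES_STAT": "reservation_status",
}


def load_to_staging(source_rows):
    """Move mapped source fields into staging shape.

    Only renaming happens here: no filtering, no transformation.
    Source fields that are not in the mapping are simply not acquired.
    """
    staged = []
    for row in source_rows:
        staged.append(
            {target: row.get(source) for source, target in SOURCE_TO_STAGING.items()}
        )
    return staged


source_rows = [
    {
        "RES_ID": "1001",
        "PROP_CD": "RSTA",
        "STAY_DT": "2024-03-01",
        "RES_STAT": "BK",
        "UNUSED": "x",  # not in the mapping, so it is not moved
    },
]
staged_rows = load_to_staging(source_rows)
print(staged_rows)
```

Keeping the mapping as plain data rather than burying it in code means the same table can be reviewed by the business, the developers, and QA.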
The Data Warehouse sits at the heart of the system; it is the point where all data is integrated and the point where history is held. The Data Warehouse is the store of the lowest level "atomic" data.
Data that is loaded here will be clean, consistent, and time variant. The design of the data model in this area is critical to the long-term success of the data warehouse.
Documentation in this section takes the form of:
- conceptual, logical, and physical data models,
- business and technical metadata.
The EDW creates delivery processes based on business requirements. This is where business rules are most often applied. They take the form of filters, transformations, aggregations, calculations, audits, lookups, and other operations as required. But nothing stored in the EDW is actually changed; the rules only shape how the data is delivered.
Alicia was having trouble with this concept … so here's a way to think of it. Say that you give three different people the same granular ingredients: Flour, sugar, butter, milk, cocoa powder, baking soda, salt. You give them a basic recipe for cake, and tell them each to go away and bake a cake. When they all return, one person brings cupcakes, another brings a sheet cake, and the third brings a layer cake. It's all cake … it's all from the same ingredients, just delivered in three different ways. Each baker used different "rules" to produce their version of cake.
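In data terms, the cake analogy might look something like the sketch below. It assumes a tiny, invented set of atomic rows (one per room per night) and three hypothetical delivery rules: a filter, an aggregation, and a transformation. None of these names come from a real system. Each delivery produces a different result from the same ingredients, and the stored rows are never modified.

```python
from collections import defaultdict

# Hypothetical atomic rows as they might sit in the EDW: one row per room per night.
ATOMIC_ROOM_NIGHTS = [
    {"resort": "A", "stay_date": "2024-03-01", "status": "BOOKED"},
    {"resort": "A", "stay_date": "2024-03-02", "status": "CANCELLED"},
    {"resort": "B", "stay_date": "2024-03-01", "status": "BOOKED"},
    {"resort": "B", "stay_date": "2024-04-01", "status": "BOOKED"},
]


def deliver_booked_only(rows):
    """Filter rule: deliver only the rows whose status is BOOKED."""
    return [r for r in rows if r["status"] == "BOOKED"]


def deliver_counts_by_resort(rows):
    """Aggregation rule: count rows per resort, regardless of status."""
    counts = defaultdict(int)
    for r in rows:
        counts[r["resort"]] += 1
    return dict(counts)


def deliver_with_month(rows):
    """Transformation rule: add a derived month (YYYY-MM) to a copy of each row."""
    return [{**r, "month": r["stay_date"][:7]} for r in rows]


print(deliver_booked_only(ATOMIC_ROOM_NIGHTS))
print(deliver_counts_by_resort(ATOMIC_ROOM_NIGHTS))
print(deliver_with_month(ATOMIC_ROOM_NIGHTS))
```

Because every function returns new data instead of editing its input, the atomic rows stay exactly as they were loaded.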
Delivery processes are commonly known as "Extract" processes and are documented in Interface Specifications and Data Mappings, including business rules.
Data Delivery can take many forms, two of which are:
- File delivered to external system which delivers data through a User Interface,
- Views from which a Business user can query using a tool such as SAS or Business Objects.
Once Alicia had a chance to digest the EDW environment and learn all about the deliverables produced for each layer, we started to talk about documenting business requirements. Requirements for a Data Project have three components (acquisition, storage, delivery) and every requirement will be found in all three places, within the various deliverables.
Data Acquisition Analysis
In the Data Acquisition Analysis, we are going to produce four deliverables to the project:
- Source Data Definitions,
- Source Application Rules,
- Inbound Interface Specifications,
- Data Mapping (Logical) from Source to Target.
In a Data Project, each requirement must be decomposed into its granular parts and documented in a data model. The project will find the correct source for each requirement based on the documented definitions.
Let's take an example: Say that we have a business requirement to know the total number of booked room nights for a resort for a month.
The requirement is decomposed into its granular data parts.
- Total Number of Booked Room Nights
- Room Night (Composite Noun Phrase or maybe some kind of calculation like rooms x nights?)
- Room (Noun)
- Night (Noun)
- Booked (Status — State Transition)
- Total Number (Algorithm — Sum — but of what?)
- Resort (Noun)
- Month (Time Dimension)
The Business Rules Approach is used to create a fact model. The fact model is used to create a glossary of terms. Each term in the fact model is then mapped to an attribute in the source. The source attributes are documented along with any source application rules and an Inbound Interface Specification is created. Based on the fact model and the mappings from the fact to the source, we can also create a logical Source to Target data mapping.
Each requirement is decomposed and documented in each Data Acquisition deliverable.
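One way to picture the glossary and its mapping to the source is as a small data structure. The sketch below is only an illustration: the definitions and the "TABLE.COLUMN" source attributes are invented placeholders, not taken from any real system, and a real glossary would carry far more detail. It also shows a simple check for terms that have no source attribute yet, which is a useful prompt for further analysis.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GlossaryTerm:
    term: str
    kind: str                                # e.g. noun, status, algorithm, time dimension
    definition: str                          # placeholder wording, to be agreed with the business
    source_attribute: Optional[str] = None   # hypothetical "TABLE.COLUMN" in the source


# Terms decomposed from "total number of booked room nights for a resort for a month".
GLOSSARY = [
    GlossaryTerm("Room", "noun", "A rentable unit at a resort.", "ROOM.ROOM_ID"),
    GlossaryTerm("Night", "noun", "One calendar date of a stay.", "STAY.STAY_DT"),
    GlossaryTerm("Booked", "status", "A state of a reservation.", "RES.RES_STAT"),
    GlossaryTerm("Resort", "noun", "A property that has rooms.", "PROP.PROP_CD"),
    GlossaryTerm("Month", "time dimension", "Calendar month of the night.", "STAY.STAY_DT"),
    GlossaryTerm("Total Number", "algorithm", "A sum; of what is still to be agreed."),
]


def unmapped_terms(glossary):
    """Return the names of terms that have no source attribute yet."""
    return [t.term for t in glossary if t.source_attribute is None]


print(unmapped_terms(GLOSSARY))
```

In this sketch "Total Number" is the only term without a source attribute, which fits the open question in the decomposition: it is an algorithm to be defined rather than a field to be found.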
Data Storage Analysis
At this stage, we are going to further refine our requirements into the ultimate EDW Data Storage container. The Data Warehouse is the store of the lowest level "atomic" data. Data that is loaded here will be clean, consistent, and time variant.
We are still documenting the same business requirement: Must have the ability to know the total number of booked room nights for a resort for a month.
Using the fact model and glossary produced in the Data Acquisition Analysis, we can now produce the conceptual, logical, and (ultimately) physical data models required. The Business Fact Model leads nicely into a conceptual model by identifying all the nouns, or subjects, required. The Glossary also allows the data model to acquire definitions. Depending upon the project and the status of the EDW, sample reports and functions might be documented from the models.
We have identified where a "Room Night" is found in the source database, and we can map that concept into our EDW. We know how the source deals with status, so we know how a "Room Night" transitions between different states and can capture that information in our EDW. We know how to find the geography and the time dimensions for each Room Night in both systems. We have produced a document that gives specific interface requirements so that the source system can send the correct data and the EDW knows what to expect.
Data Delivery Analysis
In this stage, the EDW becomes the Source and the delivery of the data out to business systems or end users becomes the Target. Are we having fun yet?
We are still documenting the same business requirement: Must have the ability to know the total number of booked room nights for a resort for a month.
A new level of analysis takes place to identify and document all of the target attributes. Let's say that we are mapping our data into an outbound external interface file.
We will produce a data mapping from the EDW source to the file target according to the business rules that we must apply to outbound data, as well as creating an outbound interface specification.
The business rules in this example include:
- How to identify a "Booked Room Night"
- How to identify a "Resort"
- How to identify a "Month"
- How to sum the "Booked Room Night" for a "Resort" for a "Month"
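The four rules above can be sketched as four small functions. This is a minimal illustration under stated assumptions, not the way any particular EDW implements them: the row layout, the "BOOKED" status value, and the resort codes are all invented, and it assumes one row per room per night so that summing means counting rows.

```python
from collections import Counter
from datetime import date

# Hypothetical EDW rows, one per room per night. All values are invented.
ROOM_NIGHTS = [
    {"resort_code": "RSTA", "room_id": 101, "stay_date": date(2024, 3, 1), "status": "BOOKED"},
    {"resort_code": "RSTA", "room_id": 101, "stay_date": date(2024, 3, 2), "status": "BOOKED"},
    {"resort_code": "RSTA", "room_id": 102, "stay_date": date(2024, 3, 1), "status": "CANCELLED"},
    {"resort_code": "RSTB", "room_id": 201, "stay_date": date(2024, 4, 5), "status": "BOOKED"},
]


def is_booked_room_night(row):
    """Rule 1 (assumed): a room night is booked when its status is BOOKED."""
    return row["status"] == "BOOKED"


def resort_of(row):
    """Rule 2 (assumed): the resort is identified by its resort code."""
    return row["resort_code"]


def month_of(row):
    """Rule 3 (assumed): the month is the calendar year and month of the stay date."""
    return (row["stay_date"].year, row["stay_date"].month)


def total_booked_room_nights(rows):
    """Rule 4: count booked room nights per (resort, month)."""
    totals = Counter()
    for row in rows:
        if is_booked_room_night(row):
            totals[(resort_of(row), month_of(row))] += 1
    return dict(totals)


totals = total_booked_room_nights(ROOM_NIGHTS)
print(totals)
```

Keeping each rule in its own function mirrors the outbound data mapping: if the business changes what "Booked" means, only one rule changes and the sum does not.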
Having each requirement documented around these three things (acquisition, storage, delivery) will help ensure business acceptance and success in any Data Project.
Requirements Gathering Pitfalls
Alicia was wondering at this point where these business requirements come from in the first place. She was thinking that they are really just a big list of data attributes: you figure out where to get them and then hand them over to the reporting system. That's not an uncommon thought, and it might seem logical. But you need to gather business requirements using a methodical approach, not unlike a Business Rules Approach, or else you will encounter some very real pitfalls and problems.
Pitfall #1 — Just Give Me Everything, I'll Use What I Need
How many of you have had a business person say this to you? What they usually will do is bring you a source system and ask you to bring in everything, in a scattershot approach. The business manager says, "Just bring in all the data. We'll be able to cherry-pick and use just what we need when we need it."
That sounds pretty good, except that the business manager probably doesn't know enough about the source system. The risk is that you end up with unsupported business strategies because, in the end, you didn't really give them everything they needed. You can, and probably will, miss major areas of information because they didn't start out by telling you what their major strategic needs were and allowing a full analysis to determine which data sources must be used to support the strategy. There may be other sources that their initial request didn't take into consideration.
Pitfall #2 — Just Give Me What I Want, I'll Know What To Do With It
How about this one? I call it the mind reading approach — what happens here is that they give you a list of "data elements" and tell you that it's not important what they all mean; we (the business end users) really do know what it is.
That sounds pretty good, and seems to make IT's job pretty easy … except you risk ending up with ambiguous information and very questionable data quality. From an Enterprise Data Warehouse perspective, it makes the data unusable beyond the one department and creates an internal stovepipe. What will happen when the business changes, what will happen when different questions must be answered, how questions around the data will get resolved … none of these issues get addressed when you allow this approach. These issues are not just issues for IT; they are ultimately issues for the business because this limited approach affects their ability to perform valuable analysis.
Pitfall #3 — I'll Know It's Right When I See It
The third pitfall comes when you've gotten through the data requirements, and now you need to figure out the test criteria and process. With no clear end in sight, and no criteria or process to get to system acceptance … the results become pretty obvious. Neither business clients nor IT Management will be at all happy with this situation. Using a Business Rules Approach will help the project team avoid the testing issue and finish the project more smoothly, because the rules themselves will feed into, and in some cases become, acceptance test cases and acceptance criteria.
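As a rough illustration of a rule becoming an acceptance test, consider the sketch below. The rule itself (a room night counts as booked when its status is "BOOKED") and the case table are assumptions made up for this example. What matters is the shape: each agreed rule yields concrete cases with an expected answer, so acceptance is a check anyone can run and not a matter of opinion.

```python
def is_booked_room_night(row):
    """Hypothetical rule: a room night counts as booked when its status is BOOKED."""
    return row.get("status") == "BOOKED"


# Each acceptance case: (description, input row, expected result).
ACCEPTANCE_CASES = [
    ("booked status counts", {"status": "BOOKED"}, True),
    ("cancelled status does not count", {"status": "CANCELLED"}, False),
    ("missing status does not count", {}, False),
]


def run_acceptance_cases(rule, cases):
    """Return the descriptions of failed cases; an empty list means acceptance."""
    return [desc for desc, row, expected in cases if rule(row) != expected]


failures = run_acceptance_cases(is_booked_room_night, ACCEPTANCE_CASES)
print("accepted" if not failures else f"failed: {failures}")
```

The case descriptions are written in business language on purpose, so the same table can be reviewed and signed off by the people who own the rule.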
Alicia thought these sounded like pretty common directions from the business, and wondered if there was a way to avoid these pitfalls, get complete requirements, and get good results.
Yes … there is a way to avoid all these pitfalls, and it's what I call the Business Rules Approach.
In Part 2 you will find out how Alicia learned to enhance the project lifecycle methodology she was already using with the Business Rules Approach in order to ensure a successful Data Project implementation.