The Phased Approach to Mining Business Rules
One of the core tenets of the Business Rules methodology and technology is externalization of business rules from the rest of the system. By 'externalization' we mean:
- Separation of the business logic from system programming logic
- Separation of the business language from programming language
- Presentation as declarative statements, easily understood by our business experts
This article addresses the issue of mining out and externalizing Business knowledge from complex legacy systems as Business Rules. More importantly, it talks about how to go about it in a phased manner.
A similar rationale drove the widespread use of COBOL (Common Business Oriented Language) as a programming language – the idea was that non-technical business users could understand it. But today we know that program complexity and mixed concerns defeated this initial purpose. First, COBOL tended to mix business-processing logic with system-processing concerns. Second, it was procedural in nature, so growth in functionality caused an increase in complexity. Third, as complexity increases, comprehensibility decreases.
Hence, it is not surprising to find core business functionality in legacy systems implemented as humongous and practically incomprehensible COBOL programs. Over time, important business knowledge becomes trapped in these programs, and this creates the business case for externalizing that knowledge in a format that the business users can understand and operate upon.
Externalization and consolidation of business knowledge is becoming a necessity for organizations that must evolve and adapt. This is where business rules fit the bill very nicely, articulating this knowledge as discrete, non-procedural statements about the business in an English-like language. To do this, however, we need a domain-specific vocabulary and a language to articulate the rules of the domain as discrete declarative statements.
The key point here is that the business users not only understand the language and vocabulary but also achieve a complete sense of comprehension. However, with a large number of rules, the total comprehension level again tends to drop and the purpose is partially defeated. To mine the rules out of a legacy system, we have to rely on many knowledge sources, which often involve a combination of the following:
- Business Subject Matter Experts,
- Some documentation (if you are lucky),
- Large, convoluted badly-battered legacy program(s).
Business rules mining is a gradual, iterative process of unraveling knowledge. An agile, well-represented rules team and methodology lend themselves well to this process.
Tooling is also crucial. A functionally rich BRM (Business Rules Management) tool that supports standalone testing, searching, and reporting on rules is very important. The first pass of such an exercise can yield many rules, so it is important to empower the business experts and business rules analysts, giving them a play area in which to test the vocabulary and rules.
The complexity factor in proofing rules usually comes down to a question of numbers and a question of articulating the business logic well. An optimal vocabulary is one rich enough to communicate the rules succinctly, yet with minimal synonymous terms.
The more vocabulary and rules we have, the more testing and proofing we need to ensure quality and accuracy. In our day-to-day lives, we always look for rules of thumb, general heuristics (so to speak). A basic underpinning of the human intellect and thought process is to stereotype and categorize information. We humans love to generalize. Granted, not every phenomenon in our lives falls into the stereotypes we create, yet they make our thought processes simpler. In the cases where stereotypes do not fit, we create exceptions and move on.
We could apply a similar concept to the Business Rules we mine out. I am currently working on a complete rewrite of a legacy Claims Processing application at Delta Dental of Michigan. Our business rule experts consist of an ex-dentist, an ex-dental assistant, and Dental Professional review staff. In short, these people have a deep knowledge of dental terminology, and they have worked for Delta Dental for a long time.
Within our Claims Adjudication processing, there is an area of complex logic where the adjudication of the claim being processed is dependent on the patient's claims processing history for that procedure or related procedure(s). This category of Claims Adjudication processing is called "History Cross Checking." A subset of our History Cross Checking Claims Adjudication logic comes from a long and complex COBOL program (50,000+ lines of code), which has built up over the last two decades, and sometimes the rules are hard to articulate. This subset of rules deals with the allowances for fillings on teeth.
The complexity of this domain can be attributed to the following three aspects:
- Dental Terminology and Technology:
- Approximately 575 Dental Procedures, out of which about 12 Dental Procedures are pertinent to this subset of processing.
- We have 32 Adult teeth (Permanent Dentition).
- We have 20 Baby teeth (Primary Dentition).
- 5 surfaces to a tooth.
- Areas of Oral Cavity.
- Overlapping and Adjacent Tooth Surface considerations.
- Technology variations for filling materials used.
- Distinctions between Anterior (Front) and Posterior (Back) teeth.
- Contractual constraints:
- Allowances based on the contractual agreement with Client/Group.
- Allowances based on the contractual agreement with the Participating Provider/Dentist.
- Business Policy and Regulation:
- Compliance with CDT (Current Dental Terminology).
- Compliance with Business Policy.
- Evidence based best practices.
The combinatorial explosion of rules that arises from articulating the intent of the above concerns can be quite overwhelming.
Therefore, for our first phase, our Business Expert responsible for this subset of History Cross Checking processing diligently mined out over 600 rules, with vocabulary addressing Dental Procedures, teeth, tooth surfaces, areas of oral cavity, etc. The first phase took us through several iterations, and continual proofing and unit testing helped uncover some of the flaws. These rules had a 'reverse engineered from a program' flavor to them. Not surprisingly though ... this is a typical symptom of reinventing our legacy enterprise software.
Our first rounds of testing were successful, all our vocabularies catered to the semantics, and life was good. During parallel testing, however, we started noticing some holes. One day, the Business Expert tasked with writing these business rules came by my desk and said she needed to add 1,800 more rules to cover these unforeseen situations. This was a clear red flag that we were missing something.
Our vocabulary was apparently falling short in communicating a complex business goal. This was the start of the second phase of rules analysis. We called a Rules team meeting to figure out the issue. After some brainstorming, we started to notice a pattern evolving.
The main intent of our rules was to disallow similar restorative procedures on the same tooth and surface when performed by the same dentist within a certain period. The monkey wrench in the works was making allowances for certain common adjacent surfaces, even if that surface had been worked on within the time period. The rules articulated thus far specified explicit combinations of tooth surface patterns for respective tooth combinations, which had to be contrasted with tooth surface patterns for the same teeth. Part of the problem was also the fact that the main knowledge resource was the COBOL program, so the initial rendition of the mined rules was closer to the business logic code in the COBOL program.
In essence, we were facing a combinatorial explosion, and the current vocabulary promised to spiral out of control. We desperately needed to identify the rules of thumb, the exceptions to them, and the consequences of their handling. Many times, it is easier to think of business rules as a multi-layer weave of constraints. I call this the 'drop net pattern'. We start with the lowest common denominator -- the 'catch all' rules. We can then supersede them with more rules for the exceptional cases, and so on.
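The drop net pattern can be sketched as an ordered list of condition/outcome pairs, evaluated from the most specific exception down to the catch-all. This is a minimal, hypothetical Python sketch -- the rule names, claim fields, and predicates are illustrative, not Delta Dental's actual rules:

```python
# A minimal sketch of the 'drop net pattern': rules are checked from the
# most specific (exception) layer down to the catch-all default, and the
# first rule whose condition holds determines the outcome.

def drop_net(rules, claim):
    """Return the outcome of the first (most specific) rule that applies."""
    for condition, outcome in rules:        # ordered: exceptions first
        if condition(claim):
            return outcome
    raise ValueError("no rule matched -- the catch-all should always apply")

# Layers, from the most specific exception down to the catch-all.
# 'same_surface' and 'adjacent_allowance' are hypothetical claim fields.
rules = [
    # Exception: an adjacent-surface allowance overrides the general denial.
    (lambda c: c["same_surface"] and c["adjacent_allowance"], "allow"),
    # Rule of thumb: deny a repeat restoration on the same tooth surface.
    (lambda c: c["same_surface"], "deny"),
    # Catch-all: anything else is allowed.
    (lambda c: True, "allow"),
]

print(drop_net(rules, {"same_surface": True, "adjacent_allowance": False}))  # deny
print(drop_net(rules, {"same_surface": True, "adjacent_allowance": True}))   # allow
```

The design point is that the catch-all layer keeps the rule set total (something always fires), while each exception layer stays small because it only states how it differs from the layer beneath it.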
Our goal in this case was to invent a vocabulary that, for starters, did not need to specify the specific tooth surface patterns, but had a way of communicating which tooth surface to convert. This got us halfway there: the rules would convert the tooth surfaces correctly, but the corresponding Dental Procedure Code specifying the kind of dental procedure was incorrect per the ADA (American Dental Association) guidelines. There was a certain set of rules that we invoked in our workflow, prior to the History Cross Checking rules, intended to clean up any submission errors in claims and to make the claims adhere closely to ADA guidelines. We needed to invoke these rules again after the History Cross Checking rules but, the second time around, only when a tooth surface had been changed by a rule. Therefore, our vocabulary needed a way to articulate that the conversion cleanup rules had to be run again.
This was when the Rules team had an epiphany. The situation was very much like untangling a tangled mess of wires connecting a set of electronic devices; typically in such cases we use a color-coding or tagging scheme. We needed to separate out the different functional concerns. By separating concerns and letting the rules of each specific sub-domain do what they are designed to do, we could better compartmentalize the problem. Had we tried to correct the procedure in the same rule, we would again have run into a combinatorial explosion of sorts, but this time with the procedures.
Therefore, we decided to chip away at the problem a small piece at a time. We could use the business process workflow to orchestrate the invocation of the different subsets of rules. The original workflow sequence was as follows:
- Invoke Cleanup Conversion Rules.
- Invoke Restorative History Cross Checking Rules.
- Next phase of adjudication.
Now, with this approach of splitting concerns, the workflow looked like this:
- Invoke Cleanup Conversion Rules on the Claim.
- Invoke Restorative History Cross Checking Rules.
- Check to see if a potential cleanup situation exists (this would be articulated by the History Cross Checking Rule that fired), and if so:
- Invoke Cleanup Conversion Rules on the Claim.
- Next phase of adjudication.
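The revised workflow above can be sketched in a few lines. This is a hypothetical Python sketch of the orchestration only; the function bodies are stand-ins for the actual rule sets, and the claim fields (`surface`, `convert_surface`, `needs_cleanup`) are invented for illustration:

```python
# Sketch of the split-concern workflow: the History Cross Checking rule
# that fires articulates whether cleanup must run a second time.

def run_cleanup_conversion_rules(claim):
    """Stand-in for the Cleanup Conversion rule set (normalizes the
    claim to ADA guidelines); clears the cleanup flag when done."""
    claim["needs_cleanup"] = False
    return claim

def run_history_cross_checking_rules(claim):
    """Stand-in for the Restorative History Cross Checking rule set.
    A rule that converts a tooth surface flags the claim for cleanup."""
    if claim.get("convert_surface"):
        claim["surface"] = claim.pop("convert_surface")
        claim["needs_cleanup"] = True   # articulated by the rule that fired
    return claim

def adjudicate(claim):
    claim = run_cleanup_conversion_rules(claim)        # step 1
    claim = run_history_cross_checking_rules(claim)    # step 2
    if claim["needs_cleanup"]:                         # step 3: only when a
        claim = run_cleanup_conversion_rules(claim)    # surface was changed
    # ... next phase of adjudication ...
    return claim

claim = adjudicate({"surface": "MO", "convert_surface": "MOD"})
print(claim["surface"])   # MOD
```

The key design choice is that the workflow, not the rules themselves, decides whether to re-invoke cleanup; each rule set stays focused on its own sub-domain.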
To draw an analogy -- in woodcarving or whittling, a complex cut can be achieved through several smaller cuts that follow these rules:
Rule of thumb: Whenever possible, cut along the grain of the wood, never against it.
If we do not follow this rule, the wood chunks or splits and we could ruin the project.
Exception Rule: If cutting against the grain of the wood is unavoidable, then use a stop cut and augment it with some sanding.
In this exercise, we followed two distinct phases of the rule mining life cycle. In Phase 1 we built the vocabulary and rules and tested the functionality. Phase 2 was triggered when we found deficiencies; that was where we looked for common patterns and generalized on them. We then built vocabulary to support the generalizations, enhanced and reduced our rules, identified exceptions to the generalizations, and built rules for those. We then proofed and tested those rules, which led to further analysis and refinement. Figure 1 highlights the process we took.
Figure 1. Two distinct phases of rule mining.
In conclusion, it is always easier to generalize, but generalizations do not apply all the time, and therein lies the rub. In the cases where the generalizations do not apply, exception rules need to be mined out to fill the gap. Regression and parallel testing are imperative for this to work. The specific problem that we faced could have taken us from 600 rules to 2,400+ rules -- a large number of rules to comprehend, maintain, and troubleshoot. However, by generalizing the rules of thumb, adding exception rules to fill the gaps, using workflow to orchestrate and separate the different sub-functions of rules, and having the complex rules drive the specific orchestration, we reduced the count to about 34 rules.
The rules mining effort is best achieved in phases. Moreover, when it comes to sophistication and complexity, small is most certainly beautiful, elegant, and practical. So in hindsight, why did we not come up with this solution right from the beginning? The funny thing about knowledge and rules mining exercises is that they are pretty much like sculpting. We have to take away layers of material before the true beauty of the sculpture becomes apparent.
# # #