Context is King: A Practical Approach to Rule Mining
The advantages of a business rule approach are readily apparent. Lower costs for application management and modernization, improved regulatory control, and enhanced agility are all commonly achieved. But managing rules requires that you first identify and formulate them. One way to accelerate this rule formulation is to investigate existing applications. After all, these applications contain the highly tailored business logic that describes your operational behavior.
But extracting this foundation of logic from your applications can be a significant challenge. Manually extracting rules can be a long and frustrating experience. However, many 'automated' approaches suffer from two fatal flaws. First, it can be exceedingly difficult to locate and understand highly interdependent logic that has been interwoven into millions of lines of software code. Second, rule miners often fail to recognize that business rules are in fact creatures of the business and not simply technical entities.
An effective approach is to apply semantic structures to existing applications. By overlaying business contexts onto legacy applications, rules miners can focus effort on discovering rules from systems that are valuable to the business. Effort is redirected away from mining commoditized or irrelevant applications.
Further, best practices coupled with various tool-assisted techniques for capturing programs' semantics speed the transformation of technical rules into true business rules. Adding business semantics to the analysis process allows users to abstract the technical concepts and descriptors that are normal in an application to a business level that is consumable by a rules analyst.
This paper explores how contextualization and insight into existing applications accelerate the discovery and transformation of logic into refined, true business rules. It also includes a discussion of best practices around this semantic capture of technical rules and their abstraction into business rules.
Various methods of rule mining are described below. First, let's investigate the preparatory steps that help to avoid protracted rules mining exercises, ensure that rules are valuable to the business, and speed the process of creating 'business' rather than simply 'technical' rules. These steps are recommended to build the right organizational framework and to align rule mining with business priorities.
1. Establish Project Goals and Desired Output
It is important to define the goals and outputs of the rules mining activity. This will vary depending on the business priorities of your organization. For some, the goal may be to address inflexibility or high downtime in a highly valuable and customized application. This may lead to the decision to move to a SOA-enabled packaged application. Or perhaps the CIO may be under pressure for regulatory compliance and must ensure that she has control over how her operations behave. An application portfolio management tool can be useful for prioritizing these activities.
Depending on the goal, a rules mining activity will produce one of two primary outcomes:
In some cases, your goal may be to develop business-centric documentation of program code segments. This documentation — valuable as reference for other analysts and subject matter experts — may also be used to decorate high-level design models for the modernized, to-be applications. It would not normally result in executable business rules or inputs into development environments.
In other cases, you must capture the exact semantic behavior of edits and calculations in your applications as true business rules. The captured semantics will be used as a basis for, and perhaps even directly fed into, target development environments. The outcome of this step will drive decisions regarding technology adoption, resource planning, and the best methods to apply for either type of rule mining.
2. Select a Rule Mining Technology
A low-cost, low-value application of technology for rule mining is the documentation of rules in word processing or spreadsheet applications. This approach will not help address the most resource-intensive portions of rule mining — like analyzing application behavior and locating rules within source code segments.
Runtime simulation technologies that facilitate the capture of user behavior into processes and rules can help from a behavioral aspect. Their main drawback is that they are limited to the test scenarios actually performed within a given time frame, and so may miss critical exceptions that are rarely exercised. They are also fairly coarse-grained and unable to capture underlying semantics. For instance, they are not well suited to answering questions like “have you just received a discount due to your location of residence or your customer status?”
Text scanning tools that locate patterns within sources can accelerate the capture of rules based upon fixed patterns within source code (e.g., showing all moves to a variable). These tools typically generate a large number of false positives and tend to lead to technical rather than business rules.
Repository-Based Rules Mining Tools
Repository-based technologies that recompile the sources and build a syntactic parse tree can offer the highest value in rule mining. The most advanced tools detect rules automatically using semantic analysis: for example, every statement that lies upstream of a variable of interest and can potentially affect its value is mined as a candidate rule.
Rule mining tools may also offer rule management and auditing capabilities, facilitating project workflow and rule maintenance activities by reviewers. They are also able to retain rule traceability to originating sources even when those undergo change (as they often do in a live environment).
Considering the above, a low-cost approach to rule mining trades tool cost for higher manual effort and lower-quality results. This may be acceptable in one-off, small-scale documentation scenarios, but it is less likely to be acceptable in enterprise-level modernization efforts.
3. Allocate Time and Resources
Your project goals and choice of technology will impact resource allocation. All too often, expectations from the business side are to receive a precisely-modeled set of business rules derived from the application semantics, without realizing the complexity of such an effort. On the IT side, practitioners without experience in rules mining or familiarity with the application subject matter are ill-equipped to offer reliable estimates.
Consider adopting an iterative approach toward time and resource estimation. Mining rules for a representative subset of the application over the first few weeks will provide you with insights into the methods that work best for your application, as well as a yardstick by which you can estimate the overall effort.
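The yardstick idea can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that effort scales linearly with lines of code; the function name and all figures are hypothetical, and in practice you would refine the estimate after each iteration.

```python
def estimate_total_effort(pilot_days, pilot_loc, total_loc):
    """Linearly extrapolate pilot effort to the whole application.

    Assumption: mining effort is proportional to lines of code, which is
    only a first approximation.
    """
    if pilot_loc <= 0:
        raise ValueError("pilot_loc must be positive")
    return pilot_days * total_loc / pilot_loc


# Hypothetical pilot: 15 person-days to mine 50,000 of 2,000,000 lines.
estimate = estimate_total_effort(15, 50_000, 2_000_000)
print(f"Estimated effort: {estimate:.0f} person-days")
```

A linear model will usually be too crude for dense or heavily interdependent code, so treat the result as a starting point for planning rather than a commitment.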
4. Map and Align to Business Processes
The logic that you will mine is important because of its business context. After all, logic is a subset of an overarching business process. Mined rules will make sense only when placed into context within the associated process. For example, assigning a prospective customer to an income bracket based upon her zip code might have different significance in credit approval and marketing campaign processes (witness insurance companies advertising that they will only look at past behavior and not credit ratings to set premiums).
Further, viewing your applications in a business process context is important in order to identify priorities for rules mining activities. An enterprise commonly views itself in terms of its processes: registering a new customer, underwriting a policy, receiving payment, registering a claim, depositing funds into an account, and so on. Some of these processes are of critical importance to the organization, while others are simply commodity functionality. Some processes may be meeting service level agreements set by line-of-business executives, and others not. The focus of modernization activities will vary based on this calculus.
Clearly, your team will want to focus where value is high and alignment with business goals is low. Again, a business-centric application portfolio management approach would be vital here. In this respect, a mapping of your business processes — including the information shared between them — is commonly done. Once this is complete, you will select the specific processes to be targeted for modernization and business rule mining.
Your next step will be to identify which application elements support the chosen processes. Historically, these applications were organized in silos, making the high-level match a straightforward step. At a more granular level, however, a monolithic order management application may handle customer enrollment, credit approval, order entry, and order fulfillment. Since you may only be interested in mining rules for the customer enrollment and order entry components, a mapping exercise between processes and their supporting application portfolio elements will be beneficial.
If you are using repository-based software, register your application objects and automatically capture the syntactic and semantic relationships within them. At this point, you should be able to easily tag, diagram, and report on the high-level relationships of interest.
Note the cases of overlap, where a program or data store serves multiple business processes. In Figure 1, the Customer Master, Order Master, and Inventory data stores are within the scope for rule mining, despite the fact that they support additional processes outside of the scope. Similarly, the Customer Handling program is monolithic and includes logic for Credit Approval, outside of the scope of this effort.
Figure 1. Business Process Mapping to Application Objects
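The mapping exercise can be represented with very simple data structures. The following sketch is an assumption-laden illustration: the process and data store names follow the example above, while the exact process-to-artifact assignments and the program names "Order Handling" and "Fulfillment" are hypothetical.

```python
# Hypothetical mapping of business processes to supporting artifacts.
PROCESS_ARTIFACTS = {
    "Customer Enrollment": {"Customer Handling", "Customer Master"},
    "Credit Approval": {"Customer Handling", "Customer Master"},
    "Order Entry": {"Order Handling", "Order Master", "Inventory"},
    "Order Fulfillment": {"Fulfillment", "Order Master", "Inventory"},
}

# Processes selected for rule mining.
IN_SCOPE = {"Customer Enrollment", "Order Entry"}


def artifacts_in_scope(mapping, in_scope):
    """Union of all artifacts that support at least one in-scope process."""
    result = set()
    for process in in_scope:
        result |= mapping[process]
    return result


def overlapping_artifacts(mapping, in_scope):
    """In-scope artifacts that also serve an out-of-scope process."""
    scoped = artifacts_in_scope(mapping, in_scope)
    overlap = set()
    for process, artifacts in mapping.items():
        if process not in in_scope:
            overlap |= artifacts & scoped
    return overlap


print(sorted(artifacts_in_scope(PROCESS_ARTIFACTS, IN_SCOPE)))
print(sorted(overlapping_artifacts(PROCESS_ARTIFACTS, IN_SCOPE)))
```

The overlap list is the set of artifacts to inspect more carefully, since only part of their logic belongs to the processes being mined.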
5. Analyze your Application
In the previous steps, you have inventoried your application and understood, at a high level, where the required business processes of interest reside within your application artifacts. We can now proceed to decompose the application into its constituent logical elements. A key factor here continues to be contextualization. We have excluded elements of our application that are not relevant to our priority. This scoping via context allows us to also see the 'boundaries' of our rules and their associated impacts, as we shall see.
In this step, you will decompose your application further into its detailed components. The goal is to have a sufficiently-detailed collection of artifacts to serve as input to the rule mining step.
Here too, technology can help. Repository-based software can create a parse tree of your detailed application objects and their relationships. This information is then presented in multiple graphical and textual views, synchronized with each other to facilitate context-sensitive analysis.
For example, a context view will display all data field declarations, procedures, and procedure calls within a program in a compressed and outlined mode. A detailed source code view will display code segments corresponding to the context view, allowing for quick navigation through the program via the context view. Traversal through the context view will enable you to gain quick insights into a program's structure and complexity.
Other views available at the detailed program level include diagrammatic control flow between paragraphs, logic flowcharts within paragraphs, execution paths, and runtime simulators for chosen conditional outcomes. Use these tools to gain detailed insights into your application prior to actually starting the rule mining phase. You may be surprised at the extent to which they influence your decision on the approach to take (more on this below).
If you don't have the benefit of automated parsing tools, you will still gain value by conducting a manual inspection and walkthrough of your application artifacts.
In the process of reviewing and analyzing your application, identify and mark the elements not to be included in the scope of rule mining, for functional and technical reasons. These may include standard utilities, reports, system routines, and out-of-scope business processes. In the example from Figure 1, this would include the artifacts related to Credit Approval and Order Fulfillment, out of scope due to their nature as commodity, standardized business processes.
The identification of exclusions illustrates a benefit to be derived from the up-front business contextualization steps described above. If we had not conducted them, a "broad sweep" approach would have resulted in a higher investment at the rule mining and SME review stages, where rules from the commodity business processes would first have been mined and then later discarded as irrelevant.
6. Create a Glossary of Terms
A major challenge with mining rules from applications is that it can be difficult to navigate the various variables and naming conventions within them. These conventions often have only a tenuous link to business terminology and can make the logic difficult to understand from a business perspective.
A best practice is to refine these technical terms to create a more business-centric view. This can be achieved through a glossary of application objects and related business terms. Objects could be data fields, paragraphs, programs, data sources, and other application objects of interest. Sources of information for a glossary of terms can be business documentation, data dictionaries, database schemas, user notifications, and even source code comments.
Automated rule mining tools offer a facility to propagate values for repeating patterns (commonly called 'tokens') within your application. For instance, the token 'ACCT-' may be replaced everywhere by 'Account-'. A tool would then use the glossary business names to replace technical terminology in the automated construction of candidate rules.
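Token propagation of this kind can be approximated with a plain text substitution. In the sketch below only the 'ACCT-' to 'Account-' pair comes from the text above; the other glossary entries, the sample statement, and the function name are hypothetical, and a real tool would presumably work on parsed identifiers rather than raw text.

```python
import re

# Glossary of technical tokens and their business names.
# Only 'ACCT-' comes from the text; the other entries are hypothetical.
GLOSSARY = {"ACCT-": "Account-", "CUST-": "Customer-", "BAL": "Balance"}


def apply_glossary(text, glossary):
    """Replace every glossary token in text with its business name."""
    # Try longer tokens first so a short token never pre-empts a longer one.
    tokens = sorted(glossary, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(token) for token in tokens))
    return pattern.sub(lambda match: glossary[match.group(0)], text)


print(apply_glossary("MOVE ACCT-BAL TO CUST-ACCT-BAL", GLOSSARY))
```

Because this is a raw substring match, a short token such as 'BAL' would also be replaced inside unrelated names, which is one reason to review generated names before they reach a rules analyst.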
7. Define Rules Composition and Hierarchy
Establish your desired rules format in advance. The same rule to set an order discount may take alternate forms, such as:
- Declarative form: "Each applicant who is a senior AAA member from California receives a 5% discount."
- If-Then-Else form: If an applicant is a senior, then if she is an AAA member, then if she resides in California, assign a 5% discount.
- As an entry in a decision table:
[Decision table; recoverable column headings: Resident of CA?, Order discount level]
Figure 2. Entry in Decision Table
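To make the equivalence of these forms concrete, here is a minimal sketch that expresses the same discount rule both as nested conditions and as a decision table lookup. The type and function names are hypothetical, and the 0% result for applicants who do not qualify is an assumption, since the example rule only states the 5% case.

```python
from typing import NamedTuple


class Applicant(NamedTuple):
    is_senior: bool
    is_aaa_member: bool
    state: str


def discount_if_then(applicant):
    """If-Then-Else form of the rule."""
    if applicant.is_senior:
        if applicant.is_aaa_member:
            if applicant.state == "CA":
                return 0.05
    return 0.0  # assumption: no discount otherwise


# Decision table form: (senior?, AAA member?, resident of CA?) -> discount.
DECISION_TABLE = {(True, True, True): 0.05}


def discount_from_table(applicant):
    """Look the applicant up in the decision table."""
    key = (applicant.is_senior, applicant.is_aaa_member, applicant.state == "CA")
    return DECISION_TABLE.get(key, 0.0)  # assumption: no discount otherwise


applicant = Applicant(is_senior=True, is_aaa_member=True, state="CA")
print(discount_if_then(applicant), discount_from_table(applicant))
```

The table form keeps conditions and outcomes as data, which tends to make later additions (new states, new discount levels) a matter of adding rows rather than editing control flow.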
It will be useful to attach to a mined rule additional informational and workflow attributes, such as:
- Reviewer text annotations
- Rule type (I/O, calculation, validation, security)
- Audit status (approved, not approved)
- Workflow status (extracted, working, accepted, rejected)
- Transition (valid, requires modification, duplicate, complete)
- Reviewer Identity
- Program derived from
- Code segment location (start, end)
- Code segment text
- Input and Output data elements
A rule will also be placed within a hierarchy. All rules representing a decision or executing under a given set of conditions may be grouped into a Rule Set. Rule Sets will be grouped into higher-level activity nodes reflecting the business processes they currently participate in.
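One possible way to hold these attributes and the hierarchy in code is sketched below. The class and field names are hypothetical, as are the sample values; this is not the schema of any particular rule mining tool or BRMS.

```python
from dataclasses import dataclass, field
from enum import Enum


class WorkflowStatus(Enum):
    EXTRACTED = "extracted"
    WORKING = "working"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class MinedRule:
    name: str
    rule_type: str          # I/O, calculation, validation, security
    program: str            # program the rule was derived from
    segment_start: int      # code segment location (start line)
    segment_end: int        # code segment location (end line)
    segment_text: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    approved: bool = False  # audit status
    workflow_status: WorkflowStatus = WorkflowStatus.EXTRACTED
    reviewer: str = ""
    annotations: list = field(default_factory=list)


@dataclass
class RuleSet:
    name: str
    rules: list = field(default_factory=list)


@dataclass
class Activity:
    """Higher-level node reflecting a business process."""
    name: str
    rule_sets: list = field(default_factory=list)

    def all_rules(self):
        return [rule for rule_set in self.rule_sets for rule in rule_set.rules]


# Hypothetical example values.
rule = MinedRule(
    name="Set order discount",
    rule_type="calculation",
    program="ORDENT01",
    segment_start=120,
    segment_end=126,
    segment_text="COMPUTE ORDER-DISCOUNT = ORDER-AMOUNT * 0.05",
    inputs=["ORDER-AMOUNT"],
    outputs=["ORDER-DISCOUNT"],
)
activity = Activity("Order Entry", [RuleSet("Order discount", [rule])])
print(len(activity.all_rules()), rule.workflow_status.value)
```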
If you are planning to populate an executable Business Rules Management System (BRMS), you will want to define a schema that is easily transferable into the specific target environment chosen. If the target environment is not yet known, refer to available business rule standards.
8. Establish a Rule Mining Workflow
Enterprise rule mining is usually a multi-step process involving practitioners with disparate skill sets — including consultants, developers, architects, analysts, and subject matter experts. Often key personnel will be distracted by other projects, and it is therefore crucial that a common workflow be defined and documented.
Following the guidelines provided further below, a high-level workflow may appear as:
No. | Step | Deliverable | Owner | Participant
1 | Mine | Candidate Rules | Team Lead | Developer
2 | Verify Candidate Rules | Verified Candidate Rules | Systems Analyst | Systems Analyst
3 | Transform to Business Rules Format | Business Rules Model | Business Rule Modeler | Systems Analyst, Subject Matter Expert
4 | SME Review | Approved Business Rules | Subject Matter Expert |
5 | Report | Business Rule Reports | Subject Matter Expert |
6 | Integrate | Rules within target toolsets | Architect |
Figure 3. Sample Rule Mining Workflow
Each step should be defined in detail, following your adopted methodology, project scope and constraints, and rule mining technology usage. There may be multiple iterations of the first few steps until the rules are in an approved, final format.
Rule Mining Steps
9. Mine Candidate Rules
At this point you will start mining rules from the application artifacts mapped to the scope of the business processes identified in previous steps. Rule mining tools help you assure that excluded artifacts are not included in scope by enabling the organization of an application into sub-groupings. Rules will be mined for a sub-grouping and not for the entire application.
The specific rule mining approach taken will primarily be driven by your application patterns and desired output:
A top-down, or process-oriented, approach starts from an examination of the user interface in an online application or from the job flow in a batch application.
In an online application, a transaction may be invoked by a user selecting a menu option or entering a value to the screen. Identify the fields that define the message or event that is sent from the screen to the interfacing application. Each field in the triggering message may be considered a seed field for rule mining. Then, using a seed field as a starting point, document all of the downstream data impacts to the field, including all conditional permutations. Each data transformation (move into another field or calculation) represents a candidate business rule to be captured.
Rule mining technologies will assist you in this task by visualizing a data impact path forward for each seed field to each point where it is either populated by new values or used as input to other fields via comparisons, value propagation, and calculations. At each such point, you can use the tool to document the underlying business rules. Automated rule detection methods can also be applied to capture each screen field edit as a candidate rule.
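The forward trace from a seed field can be sketched in a few lines. This is a deliberately simplified, flow-insensitive illustration and not the method of any particular tool: the statement list and field names are hypothetical, each statement is reduced to a target field and its source fields, and conditions and control flow are ignored.

```python
# Each statement: (line number, target field, source fields). Hypothetical data.
STATEMENTS = [
    (10, "WS-QTY", ["SCR-QTY"]),
    (20, "WS-PRICE", ["SCR-PRICE"]),
    (30, "WS-AMOUNT", ["WS-QTY", "WS-PRICE"]),
    (40, "WS-DISCOUNT", ["CUST-STATE"]),
    (50, "ORD-TOTAL", ["WS-AMOUNT", "WS-DISCOUNT"]),
]


def forward_impact(seed, statements):
    """Return line numbers of all statements downstream of the seed field."""
    impacted_fields = {seed}
    hits = set()
    changed = True
    while changed:  # repeat until no new statement is reached
        changed = False
        for line, target, sources in statements:
            if line not in hits and impacted_fields & set(sources):
                hits.add(line)
                impacted_fields.add(target)
                changed = True
    return sorted(hits)


print(forward_impact("SCR-QTY", STATEMENTS))
print(forward_impact("SCR-PRICE", STATEMENTS))
```

Each returned statement is a candidate rule to document. In this data the two screen fields share part of their downstream path, and the discount statement is never reached from either of them.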
In a batch application, the concept is similar. Identify part of your job flow — e.g., JCL or group thereof that realizes a business process — and mine all rules within individual programs relevant to that process.
Figure 4. Top-Down Rule Mining
In Figure 4, note the format of the resulting "Derived Candidate Rules." Automatically detected from a COBOL program, they resemble its constructs, with variable names replaced by the Glossary definitions. These will later undergo review and transformation to a more businesslike form. While this example is for a COBOL application, advanced mining tools may apply to a broad array of languages from PL/I and Natural to Visual Basic and Java.
A bottom-up, or data-oriented, approach starts from an examination of system outputs — data sent to files (both batch and online), screens, and output messages (online only).
Following this approach, capture rules by starting from an interesting data point and identifying all logic impacting that point. For example, an Order Discount field is impacted by discounts calculated upstream from it, depending on the customer's location of residence.
Figure 5. Bottom-Up Rule Mining
Rule mining technologies are particularly well suited to this approach. Through visual inspection or a repository query, you can quickly identify data outputs of interest. Then, automated rule detection routines are able to capture a candidate rule for each statement that impacts the point of interest. Because of the pre-organization into contextualized sub-groupings mentioned above, the search results will be constrained by the subset of business processes deemed relevant for rule mining.
Inspection of relevant DBMS tables may also produce rules embedded in keys and any data rules for referential integrity and value constraints. Once you have covered all data points of interest, you have essentially mined all application logic of interest, oriented toward existing outputs.
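The bottom-up search amounts to walking backward from an output to every statement that can affect it. The sketch below is a simplified, flow-insensitive illustration under the same kind of assumptions as any toy example: the statements and field names are hypothetical, and a repository-based tool would work from a parse tree rather than a hand-built list.

```python
# Each statement: (line number, target field, source fields). Hypothetical data.
STATEMENTS = [
    (10, "WS-QTY", ["SCR-QTY"]),
    (20, "WS-PRICE", ["SCR-PRICE"]),
    (30, "WS-AMOUNT", ["WS-QTY", "WS-PRICE"]),
    (40, "WS-DISCOUNT", ["CUST-STATE"]),
    (50, "ORD-TOTAL", ["WS-AMOUNT", "WS-DISCOUNT"]),
]


def backward_impact(output, statements):
    """Return line numbers of all statements upstream of the output field."""
    wanted_fields = {output}
    hits = set()
    changed = True
    while changed:  # repeat until no new statement is found
        changed = False
        for line, target, sources in statements:
            if line not in hits and target in wanted_fields:
                hits.add(line)
                wanted_fields |= set(sources)
                changed = True
    return sorted(hits)


print(backward_impact("ORD-TOTAL", STATEMENTS))
print(backward_impact("WS-DISCOUNT", STATEMENTS))
```

Starting from the output picks up the discount statement as well as the quantity and price logic, which is the coverage advantage of the data-oriented approach.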
A hybrid approach combines the two approaches described above:
- Start with a top-down oriented capture of the relevant transactions;
- For each transaction, perform bottom-up rule mining, only for data outputs that have not already been mined for another transaction.
The benefit of this approach is to extend the coverage of rule mining while avoiding repetition.
Relating to the examples shown in Figures 4 and 5, following a strictly top-down approach resulted in repetitive efforts for the Quantity and Price fields since they both traversed identical downstream data impacts. Coverage was also partial since not all of the rules for Customer Discount were discovered.
Let's consider an extended case involving both Order Entry and Proposal Issuance processes. Adopting an exclusive bottom-up approach would have also resulted in repetition, mining rules for upstream data impacts that "hit" multiple outputs (e.g., customer discount rules). Using the hybrid approach, we would first mine rules from all outputs of the order transaction, and then only outputs of the proposal transaction particular to it.
Figure 6. Hybrid Rule Mining
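The bookkeeping behind the hybrid approach is simple: walk the transactions in order and skip any output already mined. In this sketch the transaction names follow the example above, while the output names and the function name are hypothetical.

```python
def hybrid_plan(transactions):
    """Map each transaction to the outputs still to be mined for it.

    transactions: dict of transaction name -> list of its data outputs,
    in the order in which the transactions will be mined.
    """
    already_mined = set()
    plan = {}
    for name, outputs in transactions.items():
        todo = [output for output in outputs if output not in already_mined]
        plan[name] = todo
        already_mined.update(todo)
    return plan


# Hypothetical outputs per transaction.
transactions = {
    "Order Entry": ["Order Total", "Customer Discount"],
    "Proposal Issuance": ["Proposal Total", "Customer Discount"],
}
print(hybrid_plan(transactions))
```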
10. Verify Candidate Rules
At this point, after mining candidate rules from your application, verification and correction are needed to ensure that the rules are correct and complete. Examine the candidate rules with the following questions in mind:
Does each rule correctly reflect the underlying application behavior? If you have used automated rule detection technology, a rule at the point of interest (seed field) will be preceded by rules upstream from it, possibly with triggers, control conditions, and automatic rule set groupings. Review each one of them (or a chosen subset) for accuracy and make corrections where needed, until you are satisfied with the results.
Does a rule or rule set appear twice for the same application process? This can occur when rules are mined separately for two separate outputs that share upstream functionality. Or it can be a result of simple oversight like multiple team members inadvertently mining rules from the same code base. Use a rule attribute to mark duplication.
Another form of redundancy occurs when semantically identical rules were mined separately and with different names from different processes (e.g., Order Detail and Proposal). This will be dealt with in the next step, when you transform candidate rules to business rules format.
Beyond predefined exceptions, have you covered all of the application functionality? A rule coverage report, matching mined rule sources to overall sources, can provide the answer.
Can each mined rule be considered a candidate business rule? Note that although this is not yet the SME review step, there may be certain constructs that, upon inspection, are clearly irrelevant and should not be included in the scope of rules for review. Security verification rules, housekeeping routines, and out-of-scope operations may all fall into this category. Indicate relevance on one of the rule attributes.
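Two of these checks, duplication and coverage, lend themselves to simple automation. The sketch below is a hypothetical illustration: it treats a rule as a named span of source lines in a program, measures coverage by line count, and flags rules mined from exactly the same span. Real coverage reports would likely be more selective about which lines count.

```python
def rule_coverage(total_lines, rule_spans, excluded_lines=()):
    """Return (fraction of eligible lines covered, sorted uncovered lines)."""
    eligible = set(range(1, total_lines + 1)) - set(excluded_lines)
    covered = set()
    for start, end in rule_spans:
        covered.update(range(start, end + 1))
    covered &= eligible
    return len(covered) / len(eligible), sorted(eligible - covered)


def find_duplicates(rules):
    """rules: list of (rule name, program, start line, end line).

    Returns (duplicate name, original name) pairs for rules mined from
    exactly the same code segment.
    """
    first_seen = {}
    duplicates = []
    for name, program, start, end in rules:
        key = (program, start, end)
        if key in first_seen:
            duplicates.append((name, first_seen[key]))
        else:
            first_seen[key] = name
    return duplicates


# Hypothetical program of 10 lines; lines 9-10 are predefined exclusions.
ratio, uncovered = rule_coverage(10, [(1, 3), (5, 6)], excluded_lines=[9, 10])
print(ratio, uncovered)

# Hypothetical mined rules; R3 repeats the segment already mined as R1.
mined = [("R1", "ORDENT01", 1, 3), ("R2", "ORDENT01", 5, 6), ("R3", "ORDENT01", 1, 3)]
print(find_duplicates(mined))
```

Semantically identical rules mined from different code segments will not be caught by a span comparison like this one; they are handled during normalization.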
11. Transform Candidate Rules to Business Rules Format
In the previous steps, you have mined and reviewed candidate rules, reflecting legacy application behavior. These rules closely follow the application's procedural flow and operations.
A transformational step is now required to convert candidate rules into actual business rules ready for review. This step is conducted by application experts, rule architects, or subject matter experts. After review and conversion, we will have captured the business rules that reflect the current, as-is state, to serve as a baseline for, or comparison to, the target environment.
Reformat to Business Rule Notation
If you have constructed your candidate rules manually, they may already be in the chosen business rules format. In other cases, they may have been captured in a technical format (like cut and paste from source code) and will require some modification and regrouping.
If you used an automated rule detection tool, the resulting candidate rules may somewhat resemble business rules, by using the glossary definitions to place business names within rule names, data elements, and controlling conditions. However, even after the rule verification step, most of the approved candidate rules will need to be adapted to conform to your chosen business rule notation.
Figure 7. Business Rule Transformation
Fact Modeling and Rule Normalization
Due to their procedural nature, legacy applications tend to lock business logic into process-specific silos. However, true business rules are independent of process and should be maintained as such.
In our example, rules for the Order Detail Entry and Proposal Entry events have been separately mined and placed in Rule sets. Are they all unique? Upon further examination, most of the logic in them is identical by design. Analyzing the results from a business perspective, there is commonality between portions of any customer document — whether Order or Proposal.
From a tooling perspective, it is at this point where it makes sense to switch over to a Business Rule Management System (BRMS), importing the mined rules from the rule mining tool as described in the Integration section below.
Using a BRMS or a visual modeling tool, construct a Fact Model reflecting the significant business entities and their interrelationships discovered in your existing applications. These will link to your mined business rules and serve as a baseline for the to-be rules model.
In our example, part of the Fact Model would be:
Once this is done, normalize the business rules to represent the desired business level semantics:
…whereby the Customer Document Handling rules apply to both Orders and Proposals.
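As a rough illustration of normalization, the sketch below generalizes process-specific terms to a shared term and groups rules whose text then becomes identical. Everything here is hypothetical: the rule wording, the term mapping, and the naive text replacement stand in for what would in practice be a modeling decision made against the fact model.

```python
from collections import defaultdict

# Hypothetical mapping from process-specific terms to the shared fact-model term.
GENERALIZE = {"Order": "Customer Document", "Proposal": "Customer Document"}


def normalize(rules):
    """Group rules whose text is identical after generalizing terms.

    rules: list of (originating process, rule text).
    Returns dict of generalized rule text -> list of originating processes.
    """
    groups = defaultdict(list)
    for process, text in rules:
        general = text
        for specific, generic in GENERALIZE.items():
            general = general.replace(specific, generic)
        groups[general].append(process)
    return dict(groups)


# Hypothetical mined rules.
mined_rules = [
    ("Order Detail Entry", "Order discount is 5% for California residents"),
    ("Proposal Entry", "Proposal discount is 5% for California residents"),
    ("Order Detail Entry", "Order must reference an inventory item"),
]
for text, processes in normalize(mined_rules).items():
    print(text, "<-", processes)
```

A rule that ends up with more than one originating process is a candidate for a single shared rule; a rule with only one origin may be better left process-specific.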
Grouping and Sequencing
At this point, consider the generated rule grouping and sequencing. One point of attention is the triggering relationships between rules and other rules and rule sets. Since candidate rules are often derived from a third-generation language application (like COBOL or PL/I), they are automatically sequenced in a procedural manner. Transformation to a declarative mode will eliminate procedural elements that are non-business in nature.
As shown in Figure 8, declarative relationships that reflect true business requirements will be modeled as triggers between rules and other rules or rule sets. In the majority of BRMS environments, a single rule may trigger multiple rules and rule sets, where the sequencing of each triggered rule or rule set is pre-compiled or resolved only at runtime.
Figure 8. As-is Business Rule Model (Event-driven)
Subject Matter Expert Review and Approval
Once mined rules have been transformed into business rules, they are handed over to subject matter experts (SMEs) and/or business analysts for review and approval.
Normally, SMEs will not make major changes to the rules at this point. Rule mining tools may include rule attribution capabilities to aid the SMEs and enable them to mark up the business rules as:
- Approved or rejected;
- Reclassified to another category;
- Annotated with additional information in textual description attributes.
Rule mining tools also often offer web portals with a functional focus on predefined SME activities. This can greatly accelerate the review and approval process.
12. Produce Reports
Business rule reports are created in either hardcopy or digital formats. Rule mining tools produce reports and diagrams depicting detailed or summary rule information within your chosen context: hierarchy level, grouping, search result. These reports serve both as reference in the review steps and as documentation of record.
13. Integrate with a Target Environment
Depending upon your modernization strategy, integration requirements with other environments will vary.
Business Rule Development
Redevelopment with a business rule approach will typically leverage a BRMS authoring environment. These tools typically include XML import capabilities. Use them to set up an "as-is" business rule space, allowing rule developers to selectively re-use candidate rules deemed relevant for the target environment. By doing so, you will provide valuable (and sometimes crucial) traceability from newly-deployed rules back to their legacy origins.
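An XML hand-off of mined rules might look like the following sketch. The element and attribute names are invented for illustration and do not correspond to any vendor's import format or to a published standard; consult your BRMS documentation for the schema it actually accepts. The sample rule and program name are hypothetical as well.

```python
import xml.etree.ElementTree as ET


def rules_to_xml(rule_set_name, rules):
    """Serialize mined rules to a simple, made-up XML layout.

    rules: list of dicts with 'name', 'program', and 'text' keys.
    """
    root = ET.Element("ruleSet", name=rule_set_name)
    for rule in rules:
        # Keep the originating program for traceability to legacy sources.
        element = ET.SubElement(root, "rule", name=rule["name"], program=rule["program"])
        ET.SubElement(element, "text").text = rule["text"]
    return ET.tostring(root, encoding="unicode")


# Hypothetical mined rule.
xml_text = rules_to_xml(
    "Order Discount",
    [{"name": "CA senior discount", "program": "ORDENT01",
      "text": "Each senior AAA member from California receives a 5% discount."}],
)
print(xml_text)
```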
This approach typically involves building Java and .NET applications with comprehensive developer toolkits. Often, UML models will be used to define logical application views prior to actual code generation.
In these environments, mined business rules can be attached as behaviors in UML classes that leverage them. For example, an Order_Invoice class including Order_Discount as an attribute may also include Calculate_Order_Discount as a class behavior. This behavior can be derived (and potentially imported) from the mined business rule performing the same function.
Business Process Management (BPM) tools enable process model creation and linkage to underlying rules and executable services. They also include the ability to define workflow rules (using BPEL) to govern the manual and automated transitions between activity nodes.
In this context, each activity node may be realized by business rules. Many-to-many relationships may exist between rule sets and supported activities. Populating BPM processes with their relevant mined "as-is" business rules can "close the loop" for business analysts and significantly advance IT / business alignment goals.
Vendor offerings also include requirement tools that enable the definition of high level use cases, detailed flowcharts, and activities for effective application development and management. In these environments, mined business rules can be imported and attached as either core requirements or as textual annotations to activity nodes.
Service enablement of existing applications involves code refactoring and deployment as service capsules. The rule mining step can be invaluable in locating fine-grained services within source code and serving up the required service components. For example, the results of automatic rule detection for the calculation of an order discount will include all code segments leading up to the final calculation. By creating a component slice with that code (and its dependents) only, you can redeploy the order discount calculation as a service.
14. Manage Rules Proactively
We expect most applications to remain in service and be maintained for many years into the future. Once rules have been mined from them, it is crucial that the rules continue to be updated and kept in sync with future application changes. Rule mining tools offer maintenance and management capabilities, including:
- Automatic alignment of rules with their original code segments even when they have moved as a result of overall source code changes;
- Audit trails for manual rule changes;
- "Changer" routines allowing for individual or mass changes after rule mining.
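The first capability, keeping a rule aligned with its code segment after the source has changed, can be approximated by searching the new source for the stored segment text. This is a minimal sketch under strong assumptions: it only handles segments that have moved unchanged, the function name and sample code are hypothetical, and real tools presumably rely on repository information rather than text matching.

```python
def realign(segment_text, new_source_lines):
    """Find where a stored code segment now sits in the changed source.

    Returns (start, end) as 1-based line numbers, or None when the segment
    no longer appears verbatim (ignoring leading and trailing whitespace).
    """
    segment = [line.strip() for line in segment_text.splitlines()]
    source = [line.strip() for line in new_source_lines]
    size = len(segment)
    if size == 0:
        return None
    for index in range(len(source) - size + 1):
        if source[index:index + size] == segment:
            return index + 1, index + size
    return None


# Hypothetical stored segment and a new source version with two lines added above it.
stored = "COMPUTE WS-DISC = WS-AMT * 0.05\nMOVE WS-DISC TO ORD-DISC"
new_source = [
    "    MOVE SCR-QTY TO WS-QTY",
    "    MOVE SCR-PRICE TO WS-PRICE",
    "    COMPUTE WS-DISC = WS-AMT * 0.05",
    "    MOVE WS-DISC TO ORD-DISC",
]
print(realign(stored, new_source))
```

When the function returns None, the segment itself has changed, and the rule should be flagged for review rather than silently re-pointed.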
A well-defined approach to business rule mining will allow for business contextualization early on in the process. Not only will the contextualization step help frame your rules correctly, it will also reduce the rule mining investment to focus only on critical and dynamic business processes of interest. Regardless of your application modernization strategy, the best practices and tool-assisted approaches described here will help you achieve your goals at a lower cost, with less repetition and higher quality results.
 For the Business Rules Manifesto, see http://www.businessrulesgroup.org/brmanifesto.htm.
SBVR (Semantics of Business Vocabulary and Business Rules), adopted by OMG in 2005, is a metamodel for developing semantic models of business vocabularies and business rules. See http://www.omg.org/technology/documents/domain_spec_catalog.htm#SBVR. Also see OMG's PRR (Production Rule Representation) specification at http://www.omg.org/technology/documents/br_pm_spec_catalog.htm#PRR.