Use the Right Tool for your Job

Silvie Spreeuwenberg, Founder / Director, LibRT

Twenty years ago we would have said that the job of a Business Analyst consists of solving problems and eliciting knowledge.  Of course this sounds very old-fashioned today.  Today we face challenges and gather functional requirements.

What we lost along the way is the thick book called "knowledge representation," containing at least 10 different strategies for formulating different kinds of knowledge.  Today all kinds of knowledge are canned into whichever strategy the Business Analyst happens to be familiar with, typically one of the following:

  • Functional requirements in plain natural language
  • Business rules in RuleSpeak or pseudo code
  • Decisions as decision tables or decision trees

Many Business Analysts are unaware of the different kinds of knowledge and of which representation strategy is most successful for the particular kind of knowledge they are dealing with.  The result is like using a hammer to drive in a screw: the wrong tool for the job.

In this column I will first show how painful it is to use the wrong representation method for a certain kind of knowledge.  Then I will give a simple strategy that avoids such problems in common situations.

Consider this case

Suppose you are working on a dating site that matches partners.  The goal is to create an automated match between people based on heuristics.  You are experienced in using decision trees and have access to some experienced dating consultants whom you can interview.  After a couple of interview sessions you end up with the following decision tree:

Figure 1.  Decision tree for matching dating partners  

The tree is big and therefore difficult to read and contains a lot of duplication.  Suppose new interests need to be incorporated — it will take considerable time to rework the tree.  Why?  Because this is a 'matching problem' and decision trees/tables are not the right knowledge representation method for that kind of knowledge.

In the research on knowledge representation, different kinds of problems have been identified in the fields of cognitive science and artificial intelligence.  The most common are classification, matching, and calculations.  Since a Business Analyst is likely to come across cases of each of these in a project, it is good to know which knowledge representation method is the best fit.

Classification

Let's start with classification.  In simple classification you want to assign some input to one of two categories.  A typical example is a loan request that is either eligible or not eligible.  The characteristics of the request are compared to predefined criteria, which can typically be ordered in such a way that the classification can be completed by assessing the fewest criteria.

Start by listing all characteristics of a loan request and indicate how they relate to the result.  In a loan request you would typically list age, income, and family situation.  You would then relate these to results, as in the following table:

Age             Income      Family Situation
>= 18 … < 65    > 20,000    no children
>= 18 … < 65    > 30,000    single parent
>= 18 … < 65    > 25,000    traditional
>= 65           > 40,000

The result is a lookup table.  It is not complete, so it is not yet a good decision table.  Use techniques from decision table analysis to complete, compress, and order the table.  Finally, present it as a decision tree.

Figure 2.  Decision tree for classifying a loan request  

In this representation we can easily relate criteria to characteristics of an object (the object to be classified).  Also, the criteria are ordered, which may reflect knowledge about the population so as to minimize the number of evaluations needed to reach a conclusion.  Decision tables and trees are in general a good representation for classification-type problems.  If you need to group input into more than two categories you are dealing with categorization, and you can use the same strategy.
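As a minimal sketch of how such an ordered table could be evaluated, the Python below encodes the four rows of the loan table as an ordered list of criteria.  The table above does not state an outcome per row, so the sketch assumes that every row describes an eligible request and that anything not covered is not eligible.  The names LoanRequest, RULES, and classify are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class LoanRequest:
    age: int
    income: float
    family_situation: str  # e.g. "no children", "single parent", "traditional"


# Each entry is (description, predicate); the rows mirror the lookup table.
# Assumption: a request matching any row is eligible, all others are not.
RULES: List[Tuple[str, Callable[[LoanRequest], bool]]] = [
    ("age 18-64, no children, income > 20,000",
     lambda r: 18 <= r.age < 65
     and r.family_situation == "no children" and r.income > 20_000),
    ("age 18-64, single parent, income > 30,000",
     lambda r: 18 <= r.age < 65
     and r.family_situation == "single parent" and r.income > 30_000),
    ("age 18-64, traditional, income > 25,000",
     lambda r: 18 <= r.age < 65
     and r.family_situation == "traditional" and r.income > 25_000),
    ("age 65 or older, income > 40,000",
     lambda r: r.age >= 65 and r.income > 40_000),
]


def classify(request: LoanRequest) -> str:
    """Return "eligible" if any row matches, otherwise "not eligible"."""
    # Rows are checked in order, so rows that match most often can go first.
    for _description, matches in RULES:
        if matches(request):
            return "eligible"
    return "not eligible"


print(classify(LoanRequest(age=30, income=22_000, family_situation="no children")))
print(classify(LoanRequest(age=30, income=22_000, family_situation="single parent")))
```

Because the rows are plain data, adding or reordering a criterion means editing one list entry, which is the property that makes tables and trees a comfortable fit for classification.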

One caveat with a decision table is that it does not focus on terminology — hence you might forget to define it properly.  Make sure you define your terms and (preferably) connect them in a conceptual model.

Matching

For matching, you typically have a description of at least two things and you want to indicate which pairs are a good match.  Besides the dating example discussed earlier, consider finding the best candidate for a job position (recruitment) or even finding the best solution to a problem from a list of predefined solutions.

Start by listing characteristics of a good match.  In a date match you would typically list age, gender, interests, and education.  For each characteristic indicate how important it is for a match.  Start with a three-point scale (very important, important, less important) and extend when you feel a need to distinguish on a finer scale.

For this example the result can be represented in a table:

Characteristic    Priority          Score
age               very important    10
income            important         5
gender            very important    5
education         less important    1
interest          less important    1

Describe a heuristic for each characteristic.  You can do this in plain natural language or use a pattern sentence.  In our dating match example, based on the pattern sentence "A <client characteristic> may match with a candidate that <candidate characteristic>", you might consider the following heuristics:

Age:  A middle-aged client may match with a middle-aged candidate.

Age:  A middle-aged male client may match with a female candidate up to 10 years younger.

Interest:  A client who is allergic to animals may match with a candidate that does not like animals.

Interest:  A client that likes sport may match with a candidate that likes to travel.

To rank your matches you simply add up the scores of the heuristics that match for a couple and divide by the sum of the scores of all heuristics.  This gives a number between 0 and 1, which you multiply by 100 to express it as a percentage.  We can say this in RuleSpeak in the following way:

The matching score for a couple must be calculated as (A / B) * 100 where:
   A = the sum of the scores for all matching heuristics
   B = the sum of all scores for all heuristics
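The rule above can be sketched in Python as follows.  The scores come from the priority table; everything else is an assumption made for illustration: the names SCORES, HEURISTICS, and matching_score, the dictionary keys used to describe a person, and in particular the age range taken to mean "middle-aged", which the heuristics do not define.  Only three of the four heuristics are encoded, to keep the sketch short.

```python
from typing import Any, Callable, Dict, List, Tuple

Person = Dict[str, Any]

# Score per characteristic, taken from the priority table.
SCORES = {"age": 10, "income": 5, "gender": 5, "education": 1, "interest": 1}


def is_middle_aged(person: Person) -> bool:
    # Assumption: "middle-aged" means 40 to 60; the heuristics give no range.
    return 40 <= person["age"] <= 60


# Each heuristic is (characteristic, predicate over client and candidate).
HEURISTICS: List[Tuple[str, Callable[[Person, Person], bool]]] = [
    ("age", lambda client, cand: is_middle_aged(client) and is_middle_aged(cand)),
    ("interest", lambda client, cand:
        client["allergic_to_animals"] and not cand["likes_animals"]),
    ("interest", lambda client, cand:
        "sport" in client["interests"] and "travel" in cand["interests"]),
]


def matching_score(client: Person, candidate: Person) -> float:
    """Compute (A / B) * 100 as defined in the rule."""
    # A = the sum of the scores for all matching heuristics
    a = sum(SCORES[c] for c, matches in HEURISTICS if matches(client, candidate))
    # B = the sum of all scores for all heuristics
    b = sum(SCORES[c] for c, _ in HEURISTICS)
    return a / b * 100


client = {"age": 45, "allergic_to_animals": True, "interests": {"sport"}}
candidate = {"age": 50, "likes_animals": True, "interests": {"travel"}}

# The age and sport/travel heuristics match (10 + 1) out of a total of 12.
print(round(matching_score(client, candidate), 1))  # 91.7
```

Adding a new interest now means appending one heuristic to the list rather than reworking a tree, which is why this representation suits matching problems better.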

Calculations

For calculations, you typically have a description of one or more things, some computations and constraints, and a quantifiable result.  For our computations we like to use simple mathematical expressions of the form (A + B) / C.  It is important to clearly indicate what your variables mean.  Use a combination of a RuleSpeak pattern and mathematics the way I did with the match score to get the best results, like this:

The matching score for a couple must be calculated as (A / B) * 100 where:
   A = the sum of the scores for all matching heuristics
   B = the sum of all scores for all heuristics

Very often computation rules also provide constraints on what to count or what the criteria are.  Instead of using the difficult-to-read mathematical summation symbol (sigma), you can again use a combination of natural language and simple mathematics, as in the following example:

The yearly income of a consultant must be calculated as the sum of H * T for each month in the year where:
    H = the number of hours that the consultant worked in the month
    T = the agreed tariff for the consultant in the month
if the number of hours in the month is greater than 10.
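A minimal Python sketch of this rule is shown below.  It reads the trailing condition as a filter applied per month, so that only months with more than 10 hours contribute to the sum; that reading, and the names MIN_HOURS and yearly_income, are assumptions rather than part of the rule itself.

```python
from typing import Iterable, Tuple

# Threshold from the rule: only months with more than 10 hours count.
MIN_HOURS = 10


def yearly_income(months: Iterable[Tuple[float, float]]) -> float:
    """Sum H * T over the months in which H is greater than MIN_HOURS.

    Each item is (H, T): hours worked and agreed tariff for one month.
    """
    return sum(hours * tariff for hours, tariff in months if hours > MIN_HOURS)


# Four sample months; the months with 8 and with exactly 10 hours are skipped.
months = [(120, 100.0), (8, 100.0), (10, 100.0), (40, 110.0)]
print(yearly_income(months))  # 120 * 100 + 40 * 110 = 16400.0
```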

Want more?

Be aware that in most domains you have to deal with multiple strategies.  For example, after an eligibility assessment (classification) you need to find the best product (matching) and then create a quote (calculation).  Also, be aware that the strategies described so far are not complete (imagine describing the knowledge of a chess player using decision trees).  Search for the "Handbook of Knowledge Representation" to find a broader range of strategies.

For your convenience I provide a table that summarizes the three categories this article has discussed.  There may be exceptions to the recommendations, but following them will prevent quality issues in most situations.

Problem Type      Input                            Output                                                       Recommended Representation
Classification    Data about one thing             Two or more distinct categories                              Tables and trees
Matching          Data about multiple things       A list of matching pairs, optionally sorted by best match    Rules in natural language and tables
Calculation       Data about one or more things    A number                                                     Rules in natural language

# # #

Standard citation for this article:


Silvie Spreeuwenberg , "Use the Right Tool for your Job" Business Rules Journal Vol. 12, No. 11, (Nov. 2011)
URL: http://www.brcommunity.com/a2011/b624.html

About our Contributor:



Silvie Spreeuwenberg has a background in artificial intelligence and is the co-founder and director of LibRT. With LibRT, she helps clients draft business rules in the most efficient and effective way possible. Her clients are characterized by a need for agility and excellence in executing their unique business strategy or policy. Silvie's experience has resulted in the development of tools and techniques to increase the quality of business rules. She writes, "We believe that one should focus on quality management of business rules to make full profit of the business rules approach." LibRT is located in the Netherlands; for more information visit www.silviespreeuwenberg.com & www.librt.com
