Writing Natural Language Rule Statements — a Systematic Approach Part 4 — Some Data Content Rules

Graham Witt, Consultant / Author

About this series of articles

While my first series of articles on writing natural language rule statements[1] explored a wide variety of issues in a rather organic and hence random manner, this series takes a more holistic and systematic approach and draws on insights gained while writing my recently-published book on the same topic.[2]  Rule statements recommended in these articles are intended to comply with the Object Management Group's Semantics of Business Vocabulary and Business Rules (SBVR) version 1.0.[3]

The story so far

In the previous two articles[4] we looked at a variety of data cardinality rule statements:

  1. mandatory data item rule statements, i.e., statements of mandatory data item rules (rules that require that a particular data item be present) such as RS16,

  2. mandatory option selection rule statements, i.e., statements of mandatory option selection rules (rules that require that one of a set of pre-defined options be specified) such as RS21,

  3. mandatory group rule statements, i.e., statements of mandatory group rules (rules that require that at least one of a group of data items be present) such as RS43,

  4. prohibited data rule statements, i.e., statements of prohibited data rules (rules that mandate the absence of some data item in a particular situation) such as RS49,

  5. maximum cardinality rule statements, i.e., statements of maximum cardinality rules (rules that place an upper limit (usually but not necessarily one) on how many instances of a particular data item there may be) such as RS51,

  6. multiple data rule statements, i.e., statements of multiple data rules (rules that mandate the presence of two or more instances of a particular data item in a particular situation) such as RS53, and

  7. dependent cardinality rule statements, i.e., statements of dependent cardinality rules (rules that mandate how many of a particular data item must be present based on the value of another data item) such as RS55.
RS16. Each Travel Insurance Application must specify exactly one Departure Date.[5]
RS21. Each Travel Insurance Application must specify whether it is for international travel or for domestic travel.
RS43. Each Travel Insurance Application must specify a Return Date or a Travel Duration but not both.
RS49. A Travel Insurance Application that is for domestic travel must not specify a Region.
RS51. A Travel Insurance Application must not specify more than one Frequent Flier Membership for any one Passenger.
RS53. Each Flight Booking Confirmation that is for a return journey or for a multi-stop journey must specify at least two Flights.
RS55. The number of Passenger Names specified in each Flight Booking Confirmation must be equal to the Number of Passengers specified in the Flight Booking Request that gives rise to that Flight Booking Confirmation.

In those articles I showed that each of these types of rule statement has a common formulation, in that all rule statements of a particular type can be written using a common sentence structure.

Data content rules

Data content rules — as their name suggests — govern the content of data items.  What constraints might there be on the content of data items in a typical travel insurance application?

  1. The dates of departure and return must both be valid dates in the future; the date of return must be later than the date of departure.

  2. If travel duration is specified rather than the date of return, that duration must be a positive number of days.

  3. Any regions or countries listed as destinations must be recognised region or country names rather than unconstrained "free text" (which would prevent automatic price quotation).

  4. Passengers' salutations or titles (e.g., Mr, Ms, Dr) might be free text or might be limited to a list of acceptable salutations.

  5. Passengers' family and given names and the cardholder name should be free text as there is considerable diversity in personal names.

  6. Passengers' dates of birth must all be valid dates earlier than the date of departure; some insurance companies will not insure passengers over 80 years of age so, for those companies, passengers' dates of birth must all be later than 80 years before the date of return.

  7. If passengers' ages as at the date of return are specified, these must all be positive numbers of years and may also be required to be less than 80.

  8. Names of passengers' pre-existing medical conditions might be free text or might be limited to a list of names of high-risk conditions.

  9. The street number and name in the postal address should be free text unless the company has an up-to-date gazetteer of street numbers and names in the country in which it operates.[6]

  10. The locality name, state, and postal code in the postal address should be limited to a list of valid locality name/state/postal code combinations provided by the postal authority of the country in which the insurance company operates.

  11. The contact phone number should conform to whatever rules govern phone numbers in the country in which the insurance company operates.  These rules typically govern the number of digits and may also limit certain places in the number to certain numerals.  For example, all personal phone numbers in Australia, whether mobile (cell) or landline, are 10 digits in length; the first digit is 0 and the second digit is (currently) 2, 3, 4, 7, or 8.

  12. The email address must be valid.[7]

  13. The credit card issuer might be free text or might be limited to a list of recognised issuers (e.g., Visa, Mastercard).

  14. The credit card number must contain only digits and be from 12 to 19 digits long.

  15. The credit card expiry date must be in the future.

  16. Descriptions of high-value items should be free text.

  17. Values of high-value items should be whole numbers no less than the threshold below which insurance cover is automatic (e.g., $500).

These constraints are of a number of different types:

  1. Dates, the travel duration (where used), passengers' ages (where used), the contact phone number, the contact email address, the credit card number, and values of high-value items must each be in the correct format.  These require data item format rules, which I shall discuss in a future article.

  2. Dates of departure and return and the credit card expiry date must be later than the date of application.  Also, the return date must be later than the departure date, and passengers' dates of birth must be earlier than the date of departure and (in some cases) later than 80 years before the date of return.  These require range rules, discussed later in this article.

  3. Passengers' ages (where used) must (in some cases) be less than 80 years while values of high-value items must be at least $500 (for example).  These also require range rules.

  4. Region or country names, passengers' salutations or titles, medical condition names, and the credit card issuer name must each be limited to a list of valid values.  These require value set rules, which I shall discuss in the next article.

  5. The locality name, state, and postal code in the postal address must be limited to a list of valid combinations, as must the street number and name (if the company has an up-to-date gazetteer).  These also require value set rules.
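
Two of the format constraints listed earlier (the Australian phone number pattern and the credit card number length) are concrete enough to sketch in code. The following Python fragment is only an illustration under stated assumptions: the function names are hypothetical, and input is assumed to be bare digits with no spaces, brackets, or country prefix.

```python
import re

# Assumption: numbers are entered as bare ASCII digits (no spaces or punctuation).
# Australian personal phone number: 10 digits, first digit 0,
# second digit (currently) 2, 3, 4, 7, or 8.
AU_PHONE = re.compile(r"0[23478][0-9]{8}")

# Credit card number: digits only, from 12 to 19 digits long.
CARD_NUMBER = re.compile(r"[0-9]{12,19}")


def is_valid_au_phone(text: str) -> bool:
    """Return True if text matches the Australian phone number pattern above."""
    return AU_PHONE.fullmatch(text) is not None


def is_valid_card_number(text: str) -> bool:
    """Return True if text is 12 to 19 digits and nothing else."""
    return CARD_NUMBER.fullmatch(text) is not None
```

A real implementation would probably normalise the input first (for example, stripping spaces) before applying checks of this kind.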

Range rules

A range rule requires that the content of a data item be a value within a particular range.  That data item may be a date or a simple numeric value.  Let us consider numeric values first.  We have seen that an insurance company may require that passengers' ages be less than 80 years (as stated in RS56) and high-value item values be at least $500 (as stated in RS57).

RS56. The Age specified for each Passenger in each Travel Insurance Application must be less than 80 years.
RS57. The Value specified for each High Value Item (if any) in each Travel Insurance Application must be at least $500.
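
For readers who implement such rules, the two numeric range rules might be checked along the following lines. This is a minimal Python sketch, not a prescribed implementation: the `NumericRange` class and the constant names are hypothetical, and the lower bound for item values follows the "at least $500" wording of the paragraph introducing RS57.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NumericRange:
    """A range with an optional lower and/or upper bound."""
    lower: Optional[float] = None
    upper: Optional[float] = None
    lower_inclusive: bool = True   # True: "at least"; False: "more than"
    upper_inclusive: bool = True   # True: "at most";  False: "less than"

    def contains(self, value: float) -> bool:
        if self.lower is not None:
            if value < self.lower or (value == self.lower and not self.lower_inclusive):
                return False
        if self.upper is not None:
            if value > self.upper or (value == self.upper and not self.upper_inclusive):
                return False
        return True


# RS56: the Age must be less than 80 years (exclusive upper bound).
AGE_RANGE = NumericRange(upper=80, upper_inclusive=False)

# RS57: the Value must be at least $500 (inclusive lower bound, assumed).
ITEM_VALUE_RANGE = NumericRange(lower=500)
```

Keeping the inclusive/exclusive choice explicit mirrors the distinction between "be less than" and "be no more than" in the rule statements.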

Where the governed data item is a date, the rule statement has the same form but uses the verb phrases be earlier than or be later than rather than be less than or be more than.  We have seen that:

  1. dates of departure and return and the credit card expiry date must be later than the date of application (as stated in RS58, RS59, and RS60);

  2. the return date must be later than the departure date (as stated in RS61);

  3. dates of birth must be earlier than the date of departure (as stated in RS62);

  4. (in some cases) dates of birth must be later than 80 years before the date of return (as stated in RS63).

In each example range rule statement governing a simple numeric value, the range has a fixed upper or lower bound (80 years in RS56, $500 in RS57).  However, range rule statements governing dates most often limit those dates to being earlier than or later than a variable date (defined elsewhere in the transaction) rather than a fixed date.  There can be exceptions:  the range of a numeric value may be bounded by the value of another data item, and the range of a date may have a fixed bound (this is often the case in taxation rules, for example, where particular allowances or rates apply only after or before some date).

RS58. The Departure Date specified in each Travel Insurance Application must be later than the Date on which that Travel Insurance Application is made.
RS59. The Return Date specified in each Travel Insurance Application must be later than the Date on which that Travel Insurance Application is made.
RS60. The Expiry Date specified for the Credit Card in each Travel Insurance Application must be later than the Date on which that Travel Insurance Application is made.
RS61. The Return Date specified in each Travel Insurance Application must be later than the Departure Date specified in that Travel Insurance Application.
RS62. The Birth Date specified for each Passenger in each Travel Insurance Application must be earlier than the Departure Date specified in that Travel Insurance Application.
RS63. The Birth Date specified for each Passenger in each Travel Insurance Application must be later than 80 years before the Return Date specified in that Travel Insurance Application.
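
The date range rules RS58 to RS63 could be checked as in the following Python sketch. It is illustrative only: the `Application` class, its field names, and the helper functions are hypothetical, the credit card expiry is treated as a full date, and "80 years before" a 29 February is assumed to fall on 28 February when the earlier year is not a leap year.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Application:
    application_date: date      # the date on which the application is made
    departure_date: date
    return_date: date
    card_expiry_date: date
    birth_dates: List[date]     # one per passenger


def years_before(d: date, years: int) -> date:
    """Return the date the given number of years before d."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        # d is 29 February and the target year is not a leap year (assumed handling).
        return d.replace(year=d.year - years, day=28)


def violated_rules(app: Application, max_age_years: int = 80) -> List[str]:
    """Return the identifiers of the range rules that the application violates."""
    violations = []
    if not app.departure_date > app.application_date:
        violations.append("RS58")
    if not app.return_date > app.application_date:
        violations.append("RS59")
    if not app.card_expiry_date > app.application_date:
        violations.append("RS60")
    if not app.return_date > app.departure_date:
        violations.append("RS61")
    if any(not birth < app.departure_date for birth in app.birth_dates):
        violations.append("RS62")
    earliest_birth = years_before(app.return_date, max_age_years)
    if any(not birth > earliest_birth for birth in app.birth_dates):
        violations.append("RS63")
    return violations
```

Note that RS58 to RS62 compare two dates directly, whereas RS63 first applies the offset to the governing date and then compares.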

Like other types of rule statement, range rule statements have a common formulation:

  1. the subject, identifying the governed data item, such as The Age or The Departure Date

  2. specified

  3. if the governed data item is part of a complex data item, a phrase having the following form:

    1. for the (if there can only be one of the complex data item) or for each (if there can be more than one of the complex data item)

    2. the name of the complex data item

    3. (if any) if the complex data item is optional

  4. a qualifying clause identifying the type of transaction, in this case in each Travel Insurance Application

  5. must, followed by a range predicate.
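
The formulation above is regular enough to be assembled mechanically. The following Python sketch shows one possible way of doing so; the function and parameter names are hypothetical, and the range predicate is simply passed in as ready-made text.

```python
from typing import Optional


def range_rule_statement(
    data_item: str,
    transaction: str,
    predicate: str,
    complex_item: Optional[str] = None,
    complex_item_multiple: bool = True,
    complex_item_optional: bool = False,
) -> str:
    """Assemble a range rule statement from the parts of the common formulation."""
    parts = [f"The {data_item}", "specified"]          # parts 1 and 2
    if complex_item is not None:                       # part 3 (only if applicable)
        determiner = "for each" if complex_item_multiple else "for the"
        phrase = f"{determiner} {complex_item}"
        if complex_item_optional:
            phrase += " (if any)"
        parts.append(phrase)
    parts.append(f"in each {transaction}")             # part 4
    parts.append(f"must {predicate}")                  # part 5
    return " ".join(parts) + "."
```

For example, `range_rule_statement("Age", "Travel Insurance Application", "be less than 80 years", complex_item="Passenger")` reproduces the wording of RS56.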

Range predicates

Range predicates vary with respect to a number of factors, all illustrated in the example rule statements above:

  1. The governed data item may be a simple numeric value (as in RS56 and RS57) or a date (as in RS58 to RS63 inclusive).

  2. The bound may be an upper bound (i.e., a maximum, as in RS56 and RS62) or a lower bound (i.e., a minimum, as in RS57 to RS61 inclusive and RS63).

  3. The range may have a fixed bound (as in RS56 and RS57) or a variable bound (as in RS58 to RS63 inclusive).

  4. If the bound is variable, it may be represented by a simple term (such as Date in RS58 to RS60 and Departure Date in RS61 and RS62) or by a term qualified by an offset (such as 80 years before the Return Date in RS63).

Despite this diversity, range predicates have a common formulation:

  1. an inequality operator:

    1. where the governed data item is a simple numeric value (rather than a date), be less than, be more than, be no less than (or be at least), or be no more than (or be at most);

    2. where the governed data item is a date, be earlier than, be later than, be no earlier than (or be at the earliest), or be no later than (or be at the latest);

  2. a statement of the bound:

    1. if the range has a fixed bound, a literal, such as 80 years or $500;

    2. if the range has a variable bound:
      1. an optional offset, such as 80 years before in RS63;

      2. the name of the governing data item preceded by the, such as the Date or the Return Date;

      3. a qualifying clause (always present so as to relate the governed and governing data items), such as on which that Travel Insurance Application is made or specified in that Travel Insurance Application.
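
The predicate formulation can be sketched in the same spirit. In the Python fragment below, the function name, the parameter names, and the use of comparison symbols as keys are all assumptions made for illustration; where the list above offers alternative wordings for an operator, one has been chosen arbitrarily.

```python
from typing import Optional

# One wording per operator; the formulation above also allows the alternatives
# "be no less than" / "be no more than" and "be at the earliest" / "be at the latest".
NUMERIC_OPERATORS = {"<": "be less than", ">": "be more than",
                     ">=": "be at least", "<=": "be at most"}
DATE_OPERATORS = {"<": "be earlier than", ">": "be later than",
                  ">=": "be no earlier than", "<=": "be no later than"}


def range_predicate(
    operator: str,
    *,
    is_date: bool,
    literal: Optional[str] = None,
    governing_item: Optional[str] = None,
    qualifying_clause: Optional[str] = None,
    offset: Optional[str] = None,
) -> str:
    """Build a range predicate: an inequality operator followed by the bound."""
    verb_phrase = (DATE_OPERATORS if is_date else NUMERIC_OPERATORS)[operator]
    if literal is not None:
        # Fixed bound: a literal such as "80 years" or "$500".
        return f"{verb_phrase} {literal}"
    if governing_item is None or qualifying_clause is None:
        raise ValueError("a variable bound needs a governing item and a qualifying clause")
    # Variable bound: optional offset, "the" + governing data item, qualifying clause.
    bound = f"the {governing_item} {qualifying_clause}"
    if offset is not None:
        bound = f"{offset} {bound}"
    return f"{verb_phrase} {bound}"
```

For example, `range_predicate(">", is_date=True, offset="80 years before", governing_item="Return Date", qualifying_clause="specified in that Travel Insurance Application")` yields the predicate used in RS63.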

To be continued...
The next article in this series will discuss some other types of data content rule statement.

References

[1]  The first of which is:  Graham Witt, "A Practical Method of Developing Natural Language Rule Statements (Part 1)," Business Rules Journal, Vol. 10, No. 2 (Feb. 2009), URL:  http://www.BRCommunity.com/a2009/b461.html

[2]  Graham Witt, Writing Effective Business Rules.  Morgan Kaufmann (2012).

[3]  Semantics of Business Vocabulary and Business Rules (SBVR), v1.0.  Object Management Group (Jan. 2008).  Available at http://www.omg.org/spec/SBVR/1.0/

[4]  Graham Witt, "Writing Natural Language Rule Statements — a Systematic Approach:  Part 2 — Mandatory Data Rules," Business Rules Journal, Vol. 13, No. 8 (Aug. 2012), URL:  http://www.BRCommunity.com/a2012/b665.html and
Graham Witt, "Writing Natural Language Rule Statements — a Systematic Approach:  Part 3 — Other Data Cardinality Rules," Business Rules Journal, Vol. 13, No. 9 (Sept. 2012), URL:  http://www.BRCommunity.com/a2012/b669.html

[5]  The font and colour conventions used in these rule statements reflect those in the SBVR, namely underlined teal for terms, italic blue for verb phrases, orange for keywords, and double-underlined green for names and other literals.  Note that, for clarity, these conventions are not used for rule statements that exhibit one or more non-recommended characteristics.

[6]  Typically insurers only operate within one country, thus a country name or code is not required.  For this example, I am assuming an insurer operating within Australia or the US, in which the State must be included in each address.

[7]  The syntax rules for email addresses are quite complex:  see http://en.wikipedia.org/wiki/Email_address.

# # #

Standard citation for this article:


Graham Witt, "Writing Natural Language Rule Statements — a Systematic Approach Part 4 — Some Data Content Rules," Business Rules Journal, Vol. 13, No. 10 (Oct. 2012)
URL: http://www.brcommunity.com/a2012/b674.html

About our Contributor:


Graham Witt, Consultant / Author

Graham Witt has over 30 years of experience in assisting organisations to acquire relevant and effective IT solutions. NSW clients include the Department of Lands, Sydney Water, and WorkCover while Victorian clients include the Departments of Sustainability & Environment, Education & Early Childhood Development, and Human Services. Graham previously headed the information management and business rules practice in Ajilon's Sydney (Australia) office.

Graham has developed specialist expertise in business requirements, architectures, information management, user interface design, data modelling, relational database design, data quality, business rules, and the use of metadata repositories & CASE tools. He has also provided data modelling, database design, and business rules training to various clients including NAB, Telstra, British Columbia Government, and ASIC and in the form of public courses run by Simsion Bowles and Associates (Australia) and DebTech (USA).

He is the co-author, with Graeme Simsion, of the widely-used textbook "Data Modeling Essentials" and is the author of the newly published book, "Writing Effective Business Rules" (published by Elsevier). Graham has presented at conferences in Australia, the US, the UK, and France. Contact him at gwitt@pacific.net.au.
