Comparing Apples to Apples ~ User Defined Types
The relational database has been one of the most successful technologies deployed by Information Technology. Each year, billions of dollars are spent on relational database systems and applications. Much of the world’s business data is stored in relational form. When IBM, Oracle, Informix, and other database vendors released object relational versions of their respective databases, they added to the types of business rules that can be actively enforced by the database.
Many people still limit their understanding of business rules and database technology to mean "Referential Integrity." Today’s relational and object relational databases support a multitude of ways to actively enforce an application’s business rules. In this article, we’ll look at how User Defined data Types (UDT) can be used to actively enforce a business rule.
It is the norm for most of individual data items stored in databases to be relatively small and simple. In storing these simple data items, database systems support a set of predefined data types such as integers, real numbers, and character strings. The operations defined over these data types, such as arithmetic and comparison, are also simple and predefined.
Partially because of the Internet, many organizations need to store and manipulate objects that are neither small nor simple, and to perform operations on these objects that are not predefined. For example, an Internet retailer might need to store maps, photographs, written documents with diagrams, audio and video recordings. A city planner may need to find the parcels of property that intersect a proposed highway or to find the minutes of all meetings in which construction of new schools was discussed. Multimedia applications are among the fastest-growing segments of the database industry. Because of the very large amounts of data that they require, we can expect the requirements of these applications to become increasingly important.
The traditional data types and search capabilities of SQL are usually not sufficient for multimedia database applications. It is apparent that the requirements of these applications are so diverse that they could not be satisfied by a set of predefined language extensions. This is why ANSI’s SQL standards committee and database vendors have added facilities to let users define new data types and functions of their own.
The built-in SQL data types such as Number, Decimal and Double are often used to represent data that has a specialized meaning. For example, a column defined with a Decimal data type may contain amounts in US Dollars. Another column defined with a Decimal column may contain amounts in Japanese Yen. Similarly, a column defined with a Double (floating point) data type may contain measurements in feet, and another column with the same data type may contain measurements in meters.
The database designer may have some rules in mind about the kinds of operations that make sense on various types of data. For example, it may make sense to add two US Dollar amounts, but it is probably an error to add a US Dollar amount to a Yen amount. It may make sense to multiply a dollar amount by an integer or floating point number, but it is probably an error to multiply a US Dollar amount by another US Dollar amount.
If you use only the built-in data types, a database has no way of knowing what specialized rules may apply to the data. Therefore, many databases allow you to create new data types of your own and specify the operations that apply to them. In most databases, a user-defined data type is called a "type." Each type shares a common representation with one of the built-in types, called its "base type," but it has its own set of valid operations.
The following statements create two types named DOLLARS and YEN, each based on the built-in Decimal type. The phrase WITH COMPARISONS is a DB2 UDB extension that means that two dollar values may be compared, and that two Yen values may be compared. However, the database can prevent a dollar value from successfully being compared with a Yen value or with an ordinary Decimal value.
The ANSI SQL99 (SQL3) syntax for creating a distinct type is:
When you create a distinct type, DB2 UDB generates casting functions that convert a value between the distinct type and its base type. For example, creation of theDOLLARS type causes creation of the cast functions DOLLARS(DECIMAL), which returns DOLLARS, and DECIMAL(DOLLARS), which returns DECIMAL(10,2).
When a distinct type is first created in DB2 UDB, the only operators that apply to it are comparisons between two values of the same distinct type. For example, ifSALARY and BONUS are data columns of type DOLLARS, then SALARY = BONUS and SALARY > BONUS are valid predicates, but SALARY + BONUS and SALARY * BONUS are not valid because no arithmetic operations have been defined on the DOLLARS type.
It is easy to specify which of the operators of the base type are meaningful for the distinct type. Each built-in operator such as "+" is implemented by a function with the same name as the operator. To make an operator apply to the distinct type, you can create a new function with the same name as the operator that accepts parameters and/or returns results of your distinct type. The operator function can be "sourced" on the built-in operator function provided by DB2 UDB, which means that it relies on the DB2 UDB function for its implementation. The following statements define the operator "+" for two dollar values, and the operator "*" for an integer and a dollar value:
After executing these statements, ifSALARY and BONUS are columns of type DOLLARS, then SALARY + BONUS and 2 * SALARY are valid, but SALARY * BONUS is invalid because no multiplication operator has been defined between two dollar values.
By creating sourced functions, you can specify exactly the set of operators that are meaningful for the distinct type. You can then rely on the database to catch any type-related errors in a query or application program.
Of course, you may wish the distinct type to have some additional functionality beyond the operators defined for its base type. For example, you might create a distinct type namedADDRESS, based on the built-in type VARCHAR(50). You might then create a user-defined function (UDF) TIMEZONE(ADDRESS) that computes the time zone of a given address. Depending on the database, the UDF TIMEZONE function would be written in a proprietary language or a commercial language like Java and C++. Typically, UDFs need to be registered in the database in which they will be used.
In an object-relational system, it is important to capture not only the state of stored objects but their behavior as well. The behavior of a distinct type is captured by the functions that are defined on that type. For example, unary functions such asTIMEZONE(ADDRESS) and ZIPCODE(ADDRESS) might be considered to be "methods" that define the behavior of the ADDRESS type. Therefore, it’s possible to enforce the integrity of the data and the behavior of the data as a resource that can be shared by multiple applications.
Comparing Apples to Apples
With the use of User Defined Types and Functions, we can place certain business rules directly into the database. In the above example, we have not set up any UDFs to handle the division and equals operation. The database can then enforce how apples and oranges are manipulated, and if it’s legal to compare apples.
# # #